Re: autoconf test for finding UTF-8 locale?
From: Werner LEMBERG
Subject: Re: autoconf test for finding UTF-8 locale?
Date: Mon, 21 Nov 2022 12:20:01 +0000 (UTC)
>> I'm searching for an autoconf test that checks whether a 'neutral'
>> locale with UTF-8 encoding is available. [...]
> You can do so in a way similar to the Gnulib-provided macros
> gt_LOCALE_FR_UTF8 [1]
> gt_LOCALE_TR_UTF8 [2]
> or the gettext internal macro
> gt_LOCALE_DE_UTF8 [3]
>
> E.g. replace 'French_France'/'Turkish_Turkey'/'German_Germany' with
> 'English_United States'.
Thanks! I have attached my current results – completely untested, and
I'm rather sure that it won't work...
Parts marked with 'XXX' are places where I am completely clueless what
to do.
Finally, I wonder how this could be tested on various platforms. Is
there an OS 'farm' to which a configure script could be sent, so that
the results from all of its systems could be collected?
Werner
dnl locale-c-utf8.m4 -*-shell-script-*-
dnl Copyright (C) 2003, 2005-2018, 2022 Free Software Foundation, Inc.
dnl
dnl This file is free software; the Free Software Foundation
dnl gives unlimited permission to copy and/or distribute it,
dnl with or without modifications, as long as this notice is preserved.
dnl From Bruno Haible and Werner Lemberg.
dnl Find a 'C.UTF-8' locale encoding.
dnl This file is based on `locale-de.m4` from 'gnulib'.
AC_DEFUN([LOCALE_C_UTF8],
[
AC_REQUIRE([AM_LANGINFO_CODESET])
AC_CACHE_CHECK([for a 'C.UTF-8' locale], [ac_cv_locale_c_utf8], [
AC_LANG_CONFTEST([AC_LANG_SOURCE([[
#include <locale.h>
#include <time.h>
#if HAVE_LANGINFO_CODESET
# include <langinfo.h>
#endif
#include <stdlib.h>
#include <string.h>
struct tm t;
char buf[16];
int main () {
/* On BeOS and Haiku, locales are not implemented in libc. Rather, libintl
imitates locale dependent behaviour by looking at the environment
variables, and all locales use the UTF-8 encoding. */
#if !(defined __BEOS__ || defined __HAIKU__)
/* Check whether the given locale name is recognized by the system. */
# if defined _WIN32 && !defined __CYGWIN__
/* On native Windows, setlocale(category, "") looks at the system settings,
not at the environment variables. Also, when an encoding suffix such
as ".65001" or ".54936" is specified, it succeeds but sets the LC_CTYPE
category of the locale to "C". */
if (setlocale (LC_ALL, getenv ("LC_ALL")) == NULL
|| strcmp (setlocale (LC_CTYPE, NULL), "C") == 0)
return 1;
# else
if (setlocale (LC_ALL, "") == NULL) return 1;
# endif
/* Check whether nl_langinfo(CODESET) is nonempty and not "ASCII" or "646".
On Mac OS X 10.3.5 (Darwin 7.5) in the de_DE locale, nl_langinfo(CODESET)
is empty, and the behaviour of Tcl 8.4 in this locale is not useful.
On OpenBSD 4.0, when an unsupported locale is specified, setlocale()
succeeds but then nl_langinfo(CODESET) is "646". In this situation,
some unit tests fail. */
# if 0 && HAVE_LANGINFO_CODESET
/* XXX: What should this look like for 'C.utf8' or 'en_US.UTF-8'? */
{
const char *cs = nl_langinfo (CODESET);
if (cs[0] == '\0' || strcmp (cs, "ASCII") == 0 || strcmp (cs, "646") == 0)
return 1;
}
# endif
# ifdef __CYGWIN__
/* On Cygwin, avoid locale names without encoding suffix, because the
locale_charset() function relies on the encoding suffix. Note that
LC_ALL is set on the command line. */
if (strchr (getenv ("LC_ALL"), '.') == NULL) return 1;
# endif
/* XXX How can I test that UTF-8 encoding actually works? */
#endif
return 0;
}
]])])
if AC_TRY_EVAL([ac_link]) && test -s conftest$ac_exeext; then
case "$host_os" in
# Handle native Windows specially, because there setlocale() interprets
# "ar" as "Arabic" or "Arabic_Saudi Arabia.1256",
# "fr" or "fra" as "French" or "French_France.1252",
# "ge"(!) or "deu"(!) as "German" or "German_Germany.1252",
# "ja" as "Japanese" or "Japanese_Japan.932",
# and similar.
mingw*)
if (LC_ALL=.65001 \
LC_TIME= \
LC_CTYPE= \
./conftest; exit) 2>/dev/null; then
ac_cv_locale_c_utf8=.65001
# Test for the hypothetical native Windows locale name.
# XXX Shouldn't this be rather 'English_US.65001'?
elif (LC_ALL="English_United States.65001" \
LC_TIME= \
LC_CTYPE= \
./conftest; exit) 2>/dev/null; then
ac_cv_locale_c_utf8="English_United States.65001"
else
# None found.
ac_cv_locale_c_utf8=none
fi
;;
*)
if (LC_ALL=C \
./conftest; exit) 2>/dev/null; then
ac_cv_locale_c_utf8=C
# Setting LC_ALL is not enough. Need to set LC_TIME to empty, because
# otherwise on Mac OS X 10.3.5 the LC_TIME=C from the beginning of the
# configure script would override the LC_ALL setting. Likewise for
# LC_CTYPE, which is also set at the beginning of the configure
# script.
# Test for the usual locale name.
elif (LC_ALL=en_US \
LC_TIME= \
LC_CTYPE= \
./conftest; exit) 2>/dev/null; then
ac_cv_locale_c_utf8=en_US
else
# Test for the locale name with explicit encoding suffix.
if (LC_ALL=C.UTF-8 \
LC_TIME= \
LC_CTYPE= \
./conftest; exit) 2>/dev/null; then
ac_cv_locale_c_utf8=C.UTF-8
elif (LC_ALL=en_US.UTF-8 \
LC_TIME= \
LC_CTYPE= \
./conftest; exit) 2>/dev/null; then
ac_cv_locale_c_utf8=en_US.UTF-8
else
# Test for the Solaris 7 locale name.
if (LC_ALL=en.UTF-8 \
LC_TIME= \
LC_CTYPE= \
./conftest; exit) 2>/dev/null; then
ac_cv_locale_c_utf8=en.UTF-8
else
# None found.
ac_cv_locale_c_utf8=none
fi
fi
fi
;;
esac
fi
rm -fr conftest*
])
LOCALE_C_UTF8=$ac_cv_locale_c_utf8
AC_SUBST([LOCALE_C_UTF8])
])
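
The two 'XXX' questions in the test program (what the nl_langinfo(CODESET)
check should look like, and how to test that UTF-8 encoding actually works)
could perhaps be handled along the following lines. This is only a sketch
for POSIX systems that provide <langinfo.h>; it is not taken from Gnulib,
and the function names (is_utf8_codeset_name, utf8_decoding_works,
current_locale_is_utf8, first_utf8_locale) are made up for illustration.
It assumes that a UTF-8 locale reports a codeset spelled "UTF-8" or "utf8"
(in any letter case) and decodes the two bytes 0xC3 0xA4 as one character.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Compare two strings for equality, ignoring ASCII letter case only,
   so that the result does not depend on the current locale.  */
int ascii_case_equal (const char *a, const char *b)
{
  for (; *a != '\0' && *b != '\0'; a++, b++)
    {
      unsigned char ca = (unsigned char) *a;
      unsigned char cb = (unsigned char) *b;
      if (ca >= 'A' && ca <= 'Z') ca += 'a' - 'A';
      if (cb >= 'A' && cb <= 'Z') cb += 'a' - 'A';
      if (ca != cb) return 0;
    }
  return *a == *b;  /* equal only if both strings ended together */
}

/* Return 1 if NAME is a spelling of UTF-8 ("UTF-8", "utf8", ...).
   Assumption: no other spellings need to be accepted.  */
int is_utf8_codeset_name (const char *name)
{
  if (name == NULL) return 0;
  return ascii_case_equal (name, "UTF-8") || ascii_case_equal (name, "UTF8");
}

/* Return 1 if the current LC_CTYPE decodes the UTF-8 encoding of
   U+00E4 (bytes 0xC3 0xA4) as a single two-byte character.  */
int utf8_decoding_works (void)
{
  static const char input[] = "\xc3\xa4";
  mbstate_t state;
  wchar_t wc;
  memset (&state, 0, sizeof state);
  return mbrtowc (&wc, input, 2, &state) == 2;
}

/* Return 1 if the locale that is currently in effect looks like UTF-8.  */
int current_locale_is_utf8 (void)
{
  return is_utf8_codeset_name (nl_langinfo (CODESET))
         && utf8_decoding_works ();
}

/* Try each of the N candidate names in turn and return the first one
   that setlocale() accepts and that passes the UTF-8 checks, or NULL.
   Note: this leaves the process locale set to the last name tried.  */
const char *first_utf8_locale (const char *const *candidates, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
    if (setlocale (LC_ALL, candidates[i]) != NULL
        && current_locale_is_utf8 ())
      return candidates[i];
  return NULL;
}
```

In the conftest program, main() could then end with
'return current_locale_is_utf8 () ? 0 : 1;' after the setlocale (LC_ALL, "")
call, with the nl_langinfo part kept under '#if HAVE_LANGINFO_CODESET'.
Native Windows does not provide <langinfo.h>, so that branch would still
need a different check.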