gnustep-dev

Re: setlocale() [Was: Re: NSNumberFormater test fails]


From: Fred Kiefer
Subject: Re: setlocale() [Was: Re: NSNumberFormater test fails]
Date: Tue, 28 Feb 2012 19:08:04 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.2) Gecko/20120215 Thunderbird/10.0.2

Thank you very much! For me this patch gets all the tests in base to run correctly, apart from 19 dashed hopes, which is less than anything I had before.

Fred

On 28.02.2012 07:49, Eric Wasylishen wrote:
Hi,
I committed a slightly modified version of locale3.diff.

This fixes the NSNumberFormatter test failures when a non-English locale is 
used (I tested French).

Let me know if you have any problems with this.

-Eric

On 2012-02-12, at 6:25 PM, Robert Slover wrote:

Honestly this seems right to me, or at least safest, particularly for library 
code.  There are too many fundamental APIs that encapsulate the current locale 
as part of their internal state without being able to preserve it. A good 
example would be a compiled regular expression -- bad things happen if the 
locale changes after the regular expression is parsed and compiled but before 
it is used.  This is one of the current weaknesses of POSIX, IMHO.

--Robert

On Feb 12, 2012, at 15:31, Fred Kiefer <address@hidden> wrote:

That means you were right: Cocoa doesn't call setlocale(LC_ALL, "") 
automatically. I am actually surprised by that, but if they don't, neither should we.

Fred

On 11.02.2012 00:40, Eric Wasylishen wrote:
Btw, I attached a test program test.m which shows how the current AppKit locale 
(and libc locale) affects various ways of printing decimal points:

Mac OS X 10.7.2, locale set to French Canadian in System Preferences:
new-host-2:~ ericw$ echo $LANG
fr_CA.UTF-8
new-host-2:~ ericw$ gcc test.m -framework Foundation -framework AppKit
new-host-2:~ ericw$ ./a.out
2012-01-29 17:52:58.642 a.out[1215:707] Launched. current locale: fr_CA
2012-01-29 17:52:58.643 a.out[1215:707] NSLog Decimal test: 1.23
printf decimal test: 1.23
2012-01-29 17:52:58.644 a.out[1215:707] Calling setlocale(LC_ALL, "")...
2012-01-29 17:52:58.645 a.out[1215:707] NSLog bonjour! 1.23
printf bonjour! 1,23
2012-01-29 17:52:58.645 a.out[1215:707] -[NSString stringWithFormat:]: 1.23
^C

GNUstep trunk:
address@hidden:~$ export LC_ALL=fr_CA.UTF-8
address@hidden:~$ gcc `gnustep-config --objc-flags` test.m `gnustep-config --gui-libs` -o test
address@hidden:~$ ./test
2012-02-10 16:16:32.203 test[14990] Launched. current locale: fr_CA
2012-02-10 16:16:32.210 test[14990] NSLog Decimal test: 1.23
printf decimal test: 1,23
2012-02-10 16:16:32.211 test[14990] Calling setlocale(LC_ALL, "")...
2012-02-10 16:16:32.211 test[14990] NSLog bonjour! 1.23
printf bonjour! 1,23
2012-02-10 16:16:32.211 test[14990] -[NSString stringWithFormat:]: 1.23

The only difference is the first "printf decimal test:" on GNUstep uses the comma decimal 
separator, because of the setlocale called by +[NSObject initialize]. On Mac OS, the libc locale is 
still "C".

One interesting thing is neither NSLog nor -[NSString stringWithFormat:], on 
Cocoa or GNUstep, use the locale's decimal point (regardless of the setting of 
the AppKit locale, or the libc locale.)


On 2012-02-08, at 12:29 PM, Fred Kiefer wrote:

I think you are right as far as ICU is concerned, when we use ICU we should use 
the function uloc_setDefault() to select the locale we want. But currently the 
code we have is not ICU only, it works with a mixture of ICU and glibc and it 
should work without ICU. We could try to make sure that all our calls that need 
locale information of any sort, go through a wrapper that uses the 
corresponding ICU function when that is available. If that is achieved, we could 
use only uloc_setDefault() when ICU is in use, and everything should work (and 
fall back to the old setlocale() call otherwise).
But is this achievable? Who is willing to check which of our used glibc 
functions use any locale information? And to rewrite all of these? Just think 
of the work in NSLog and the removal of all printf calls and the like.
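
For illustration only, the wrapper idea described above could be sketched in C roughly as follows. The function name example_set_default_locale and the build flag EXAMPLE_HAVE_ICU are hypothetical and are not taken from gnustep-base; uloc_setDefault() and setlocale() are the real ICU and libc calls.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

#ifdef EXAMPLE_HAVE_ICU          /* hypothetical build flag, not a GNUstep macro */
#include <unicode/uloc.h>
#endif

/*
 * Hypothetical wrapper: select the default locale through ICU when it is
 * compiled in, otherwise fall back to the libc setlocale() call.
 * Returns 0 on success and -1 on failure.
 */
static int example_set_default_locale(const char *name)
{
  /* A NULL name would turn setlocale() into a query, so reject it. */
  assert(name != NULL);

#ifdef EXAMPLE_HAVE_ICU
  UErrorCode status = U_ZERO_ERROR;

  uloc_setDefault(name, &status);   /* leaves the libc locale alone */
  return U_SUCCESS(status) ? 0 : -1;
#else
  return setlocale(LC_ALL, name) != NULL ? 0 : -1;
#endif
}
```

A caller would use example_set_default_locale("de_DE") in place of a direct setlocale() call. Whether every locale-dependent call in base can be routed through such a wrapper is exactly the open question raised above.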

Actually, I don't think there would be much work to do. From what I've seen, 
gnustep-base doesn't use the libc locale system much (if at all). For example, 
GSFormat.m uses NSLocale to get the decimal separator character (it does have 
fallback code to print a decimal using printf, but from what I can see that 
will never get used.)

The only place I've found the libc locale used is in GSLocale.m, in the 
implementation of GSDomainFromDefaultLocale().

I attached a "first try" at a patch which does the following:

- deprecates GSSetLocale and GSSetLocaleC - they now do nothing.
- removes the call to GSSetLocaleC in +[NSObject initialize].
- adds a function to GSLocale.m, GSDefaultLanguageLocale(), which returns the 
locale for LC_MESSAGES, and refactors two parts of NSUserDefaults that were calling 
GSSetLocale to get LC_MESSAGES so that they use GSDefaultLanguageLocale() instead.
- rewrites the parts of GSLocale.m which need to use the libc locale. They now call 
setlocale(LC_ALL, ""), and afterwards restore the libc locale to what it was 
previously.
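
The save-and-restore pattern in the last item can be sketched in plain C as below. This is a minimal sketch and not the code from the patch; the helper name example_user_decimal_point is hypothetical, and it only reads the decimal separator as an example of locale-dependent data.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not the function in GSLocale.m): copy the user's
 * decimal separator into buf while leaving the process-wide libc locale
 * as it was on entry. Returns 0 on success and -1 on failure.
 */
static int example_user_decimal_point(char *buf, size_t len)
{
  const char *current;
  char *saved;
  int ok = 0;

  assert(buf != NULL && len > 0);

  /* Query the current locale; the returned string may be overwritten by
     the next setlocale() call, so keep a private copy. */
  current = setlocale(LC_ALL, NULL);
  if (current == NULL)
    return -1;
  saved = malloc(strlen(current) + 1);
  if (saved == NULL)
    return -1;
  strcpy(saved, current);

  /* Temporarily adopt the locale named by the environment (LANG, LC_*). */
  if (setlocale(LC_ALL, "") != NULL)
    {
      snprintf(buf, len, "%s", localeconv()->decimal_point);
      ok = 1;
    }

  /* Restore whatever was in effect before we were called. */
  setlocale(LC_ALL, saved);
  free(saved);
  return ok ? 0 : -1;
}
```

Note that setlocale() changes process-wide state, so a sketch like this is not safe if another thread formats or parses text between the two calls.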

Of course if we were to apply this I would want to do a more careful scan of 
base for uses of the libc locale.

I am not that sure whether the selection of the locale is really up to the 
application code.
When an application gets started it has a right to expect that its supporting 
libraries are set up to a sensible default. This may not be true for tools, but 
should be the case for applications that display a user interface.

I agree with your basic point… but I would expect the GNUstep locale system to 
be set up, not the libc locale.

Eric

On 08.02.2012 19:25, Eric Wasylishen wrote:
Hi,
I just had a look into this problem. While it sounds like there is certainly a 
bug in libicu - it should not break if the libc locale is changed - I am very 
skeptical that setting the libc locale as we do in +[NSObject initialize] (or 
anywhere else... IIRC it's also done in NSUserDefaults) is a good idea.

Just to recap, +[NSObject initialize] does setlocale(LC_ALL, ""); which reads the current locale 
from the LANG environment variable (and others)[1] and sets all of the libc locale settings to that locale - 
so after +[NSObject initialize], printf("%g", 1.23) will output "1,23" if your system 
locale is French, for example.
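
The effect described in this recap can be reproduced with libc alone. The following is a small sketch, assuming it is called while the process is still in the startup "C" locale; the function name example_compare_printf is made up for this illustration.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical demonstration: render 1.23 with "%g" before and after the
 * process adopts the environment's locale. Like the call being discussed,
 * this deliberately leaves the libc locale changed on return.
 * Returns 0 on success, -1 if the environment names an unavailable locale.
 */
static int example_compare_printf(char *before, char *after, size_t len)
{
  assert(before != NULL && after != NULL && len > 0);

  /* In the startup "C" locale the decimal separator is always '.'. */
  snprintf(before, len, "%g", 1.23);

  /* Same call that +[NSObject initialize] was making. */
  if (setlocale(LC_ALL, "") == NULL)
    {
      after[0] = '\0';
      return -1;
    }

  /* Now the separator follows LC_NUMERIC, e.g. ',' for a French locale. */
  snprintf(after, len, "%g", 1.23);
  return 0;
}
```

With LANG=fr_CA.UTF-8 (and that locale installed) the second buffer would be expected to use a comma; with LANG unset or set to C both buffers should hold the same text.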

My main problem with this is, I don't think any shared library really has the 
right to change this setting… if an application/tool wanted to switch from the 
default C locale to the current system locale, that should be the application's 
decision, since it has global consequences for everything running in that 
process (changing the semantics of printf!). But there would hardly be a point 
to doing that because GNUstep provides more powerful formatting anyway 
(NSNumberFormatter, etc.)

The "official" way of setting the locale in ICU is using uloc_setDefault().

According to the ICU docs, the notion of locale in ICU is totally independent 
of libc's. For number formatting, the ICU default locale only has an effect if 
you pass NULL for the locale when calling unum_open.

So setting the libc locale should have no effect on ICU's default locale (not 
true because of the bug mentioned below), and vice-versa - setting the ICU 
locale has no effect on the system locale.

Eric

[1] actually more complicated, at least for glibc: 
http://www.gnu.org/software/libc/manual/html_mono/libc.html#Locale-Categories

On 2012-01-23, at 6:43 AM, Stefan Bidi wrote:

On Mon, Jan 23, 2012 at 3:01 AM, Fred Kiefer <address@hidden> wrote:
That bug description is not accurate. When running the NSNumberFormatter test program we 
only call setlocale() twice, once with "" and once with NULL as the locale. 
That would be supported behaviour according to the bug description, but clearly it is not.

I'll attach that modified version to the bug report.

I changed your test program to call setlocale() and now it also reports NaN 
(See attachment). This really makes me wonder whether it is such a great idea 
to use an internationalisation library that only supports English :-(

In all fairness, it has been classified as a bug in the library.  The "official" 
way of setting the locale in ICU is using uloc_setDefault().  To get the locale it's 
uloc_getDefault().  But I see your point and am a little surprised that this is even an 
issue.  I'm even more surprised that it keeps getting pushed off to a later release.  It 
seems to have originally been scheduled for 4.6, then 4.8 and now 5.0.

BTW: Is there a reason why the macro STRING_FROM_NUMBER calls the conversion 
twice, even when it was successful on the first attempt? I don't like the use 
of macros that much; they really make it hard to tell what is going on, and code 
in macros never gets as much review as normal code.

No, it's wrong.  I saw that when mucking around in there, too, but didn't commit 
a fix (I'll do so when I get home, today).  Don't ask me how I managed to 
screw that up... I don't know either.

On 21.01.2012 23:00, Stefan Bidi wrote:
After running a few more tests and still not understanding what is going on,
I went to Google and found this bug report:
http://bugs.icu-project.org/trac/ticket/8214

Seems that ICU does not like it when we use setlocale().

On Sat, Jan 21, 2012 at 1:53 PM, Stefan Bidi <address@hidden> wrote:

I am completely baffled by this bug.  I've been trying to debug this for
the last 3 hrs and have gotten absolutely nowhere.  I added a unum_open
and unum_formatDouble call in -init and I still get NaN when
LANG=de_DE.UTF-8.  The test program continues to work without a hitch,
though.  Something about how we handle the NSNumberFormatterInternal
structure is screwing up UNumberFormat (I also added a unum_open and
unum_formatDouble call in basic10_4.m and it worked fine).


On Sat, Jan 21, 2012 at 11:00 AM, Stefan Bidi <address@hidden> wrote:

On Sat, Jan 21, 2012 at 10:41 AM, Fred Kiefer <address@hidden> wrote:

Your test code works fine here and results in 1.234 as expected. My
$LANG is de_DE.UTF-8. And with $LANG set to C the test runs fine.
Strangely enough, currentLocale ends up being en_US_POSIX.


-currentLocale looks for a Locale default, if it exists that's what it
uses.  Do you have that set?

This still wouldn't explain why UNumberFormat is returning NaN.  Both you
and Philippe have a valid locale.  On the plus side, if I set
LANG=de_DE.UTF-8 I can reproduce this.  I'll go try to figure out what's
going on.



