Re: Incompatible compiler option fexec-charset

From: David Chisnall
Subject: Re: Incompatible compiler option fexec-charset
Date: Wed, 7 Dec 2011 12:35:27 +0000

On 20 Nov 2011, at 19:45, Richard Frith-Macdonald wrote:

> On 20 Nov 2011, at 11:38, David Chisnall wrote:
>> This flag also isn't recognised by clang.  What does GCC 4.x need it for?
> The -fexec-charset=UTF-8 tells the compiler to encode string literals as 
> UTF-8 in the binary.  This allows developers to put any character they like 
> in a string literal and have GNUstep get things right at runtime because base 
> knows the compiler will have encoded all literals as UTF-8.
> It's not actually clear what the compiler did prior to that option being 
> introduced ... from what I've read it seems likely that it simply used 
> whatever string encoding was set in the locale that was in use at the time 
> when the code was compiled, with no mechanism to know what that encoding was 
> at the point when the executable would run.
> So the only drawback to removing the option for older compilers is that 
> non-ASCII string literals would malfunction (but such literals have simply 
> been illegal up to now anyway) ... so it would be reasonable to have an 
> autoconf check to see if the option works, and disable it and print a 
> warning.  I hate writing autoconf stuff though, so I'd rather someone who's 
> interested in supporting old compilers did it.

I misunderstood why we were using this option.  I was under the impression that 
it was related to the encoding of NSConstantString objects, which should be 
UTF-8 by default.

The check in the configure script (which now breaks the build with clang; 
apparently it was not tested with clang before being committed) is testing for 
something very different: it checks whether we can put a Latin-1 character in 
a source file and have the compiler magically know that the source is Latin-1 
and translate it to UTF-8.
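
To make the translation concrete, here is a minimal sketch of what converting 
Latin-1 bytes to UTF-8 involves. The function name latin1_to_utf8 is 
hypothetical and is not anything from GNUstep or the configure script. The 
conversion itself is mechanical; the fragile part is knowing that the input 
bytes really are Latin-1 in the first place.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 * Converts n Latin-1 bytes from `in` to UTF-8 in `out`.
 * `out` must have room for 2 * n + 1 bytes.
 * Returns the number of bytes written, not counting the trailing NUL. */
size_t latin1_to_utf8(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t j = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned char c = in[i];
        if (c < 0x80) {
            /* ASCII is identical in Latin-1 and UTF-8. */
            out[j++] = c;
        } else {
            /* Code points 0x80..0xFF become a two-byte UTF-8 sequence. */
            out[j++] = (unsigned char)(0xC0 | (c >> 6));
            out[j++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    out[j] = '\0';
    return j;
}
```

With this, the single Latin-1 byte 0xE9 (e-acute) becomes the two bytes 0xC3 
0xA9. If the source file were actually in some other 8-bit encoding, the same 
input byte would be silently turned into the wrong character.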

This is amazingly fragile, because it requires the compiler to guess that the 
source file is Latin-1.  If you want UTF-8 characters in C string literals then 
you should save the file in UTF-8 format or (better) you should use the correct 
escape sequences.
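
A minimal sketch of the escape-sequence approach, assuming "escape sequences" 
here means hex escapes for the individual UTF-8 bytes. The function name 
cafe_utf8 is hypothetical. The source file stays pure ASCII, so the result 
does not depend on what the compiler believes the source encoding to be.

```c
#include <stddef.h>

/* Hypothetical example, for illustration only.
 * Returns "cafe" with an e-acute, spelled as explicit UTF-8 bytes.
 * U+00E9 encodes in UTF-8 as the two bytes 0xC3 0xA9. */
const char *cafe_utf8(void)
{
    /* A hex escape consumes every hex digit that follows it, so if the
     * next character were itself a hex digit the literal would have to be
     * split, e.g. "\xC3\xA9" "abc". Here the string ends after the escape. */
    return "caf\xC3\xA9";
}
```

The literal is five bytes long (plus the terminating NUL) regardless of the 
locale or editor used when the file was saved.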

If we are depending on the compiler doing this translation anywhere in GNUstep 
then we should fix that.  Are we? 


-- Sent from my brain