gnustep-dev

Re: Incompatible compiler option fexec-charset


From: David Chisnall
Subject: Re: Incompatible compiler option fexec-charset
Date: Wed, 7 Dec 2011 12:35:27 +0000

On 20 Nov 2011, at 19:45, Richard Frith-Macdonald wrote:

> 
> On 20 Nov 2011, at 11:38, David Chisnall wrote:
> 
>> This flag also isn't recognised by clang.  What does GCC 4.x need it for?
> 
> The -fexec-charset=UTF-8 tells the compiler to encode string literals as 
> UTF-8 in the binary.  This allows developers to put any character they like 
> in a string literal and have GNUstep get things right at runtime because base 
> knows the compiler will have encoded all literals as UTF-8.
> 
> It's not actually clear what the compiler did prior to that option being 
> introduced ... from what I've read it seems likely that it simply used 
> whatever string encoding was set in the locale that was in use at the time 
> when the code was compiled, with no mechanism to know what that encoding was 
> at the point when the executable would run.
> 
> So the only drawback to removing the option for older compilers is that 
> non-ASCII string literals would malfunction (but such literals have simply 
> been illegal up to now anyway) ... so it would be reasonable to have an 
> autoconf check to see if the option works, and disable it and print a 
> warning.  I hate writing autoconf stuff though, so I'd rather someone who's 
> interested in supporting old compilers did it.


I misunderstood why we were using this option.  I was under the impression that 
it was related to the encoding of NSConstantString objects, which should be 
UTF-8 by default.

The check in the configure script (which breaks the build with clang now - 
apparently it was not tested with clang before being committed) is testing for 
something very different: whether we can put a Latin-1 character in a source 
file and have the compiler magically know that the source is Latin-1 and 
translate it to UTF-8.
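
For reference, the translation that check expects of the compiler can be 
sketched by hand.  This is only an illustration, not anything taken from the 
configure script or from base; latin1_to_utf8 is a hypothetical name.  A 
Latin-1 byte below 0x80 is copied unchanged, and any other byte becomes a 
two-byte UTF-8 sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of GNUstep): convert a NUL-terminated
 * Latin-1 string to UTF-8.  dst must have room for 2 * strlen(src) + 1
 * bytes.  Returns the number of bytes written, excluding the NUL. */
static size_t latin1_to_utf8(const char *src, char *dst)
{
    size_t out = 0;

    for (; *src != '\0'; ++src) {
        unsigned char c = (unsigned char)*src;

        if (c < 0x80) {
            /* ASCII is identical in Latin-1 and UTF-8. */
            dst[out++] = (char)c;
        } else {
            /* U+0080..U+00FF encode as 110000xx 10xxxxxx. */
            dst[out++] = (char)(0xC0 | (c >> 6));
            dst[out++] = (char)(0x80 | (c & 0x3F));
        }
    }
    dst[out] = '\0';
    return out;
}
```

With this, the single Latin-1 byte 0xE9 (e-acute) comes out as the two bytes 
0xC3 0xA9.  The compiler can only do the same if it already knows the source 
bytes are Latin-1, which is exactly the guess the check depends on.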

This is amazingly fragile, because it requires that the compiler guess that the 
source file is Latin-1.  If you want UTF-8 characters in C string literals then 
you should save the file in UTF-8 format or (better) you should use the correct 
escape sequences.
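
A minimal sketch of the escape-sequence approach, assuming plain C and nothing 
GNUstep-specific (cafe_utf8 and utf8_code_points are made-up names for 
illustration).  The source file stays pure ASCII, so the bytes in the binary 
do not depend on what encoding the compiler assumes for the source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* "cafe" with an e-acute, written as its two UTF-8 bytes 0xC3 0xA9.
 * The literal is split in two so that a hex escape can never swallow
 * following characters that happen to be hex digits. */
static const char cafe_utf8[] = "caf" "\xc3\xa9";

/* Count code points by skipping UTF-8 continuation bytes (10xxxxxx).
 * Assumes the input is valid UTF-8. */
static size_t utf8_code_points(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

Here strlen(cafe_utf8) is 5 bytes while utf8_code_points(cafe_utf8) is 4 
characters, whatever locale or source charset was in effect at compile time.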

If we are depending on the compiler doing this translation anywhere in GNUstep 
then we should fix that.  Are we? 

David

-- Sent from my brain

