gnustep-dev


Re: Incompatible compiler option fexec-charset


From: Richard Frith-Macdonald
Subject: Re: Incompatible compiler option fexec-charset
Date: Wed, 7 Dec 2011 13:19:32 +0000

On 7 Dec 2011, at 12:35, David Chisnall wrote:

> On 20 Nov 2011, at 19:45, Richard Frith-Macdonald wrote:
> 
>> 
>> On 20 Nov 2011, at 11:38, David Chisnall wrote:
>> 
>>> This flag also isn't recognised by clang.  What does GCC 4.x need it for?
>> 
>> The -fexec-charset=UTF-8 tells the compiler to encode string literals as 
>> UTF-8 in the binary.  This allows developers to put any character they like 
>> in a string literal and have GNUstep get things right at runtime because 
>> base knows the compiler will have encoded all literals as UTF-8.
>> 
>> It's not actually clear what the compiler did prior to that option being 
>> introduced ... from what I've read it seems likely that it simply used 
>> whatever string encoding was set in the locale that was in use at the time 
>> when the code was compiled, with no mechanism to know what that encoding was 
>> at the point when the executable would run.
>> 
>> So the only drawback to removing the option for older compilers is that 
>> non-ASCII string literals would malfunction (but such literals have simply 
>> been illegal up to now anyway) ... so it would be reasonable to have an 
>> autoconf check to see if the option works, and disable it and print a 
>> warning.  I hate writing autoconf stuff though, so I'd rather someone who's 
>> interested in supporting old compilers did it.
> 
> 
> I misunderstood why we were using this option.  I was under the impression 
> that it was related to the encoding of NSConstantString objects, which should 
> be UTF-8 by default.

That's right.

> The check in the configure script (which breaks the build with clang now - 
> apparently it was not tested before being committed)

The script provides instructions on how to skip the check for compilers which 
don't support the 'standard' gcc behaviors.  That worked on my system when I 
tested it.  I put that option in for old versions of gcc, and because the latest 
information I managed to find for clang was that it didn't support character set 
specifier flags and didn't check which character set it was using when writing 
string literals.

> is testing for something very different - it is checking whether we can put a 
> Latin-1 character in a source file and have the compiler magically know that 
> the source is Latin-1 and translate it to UTF-8.

It's testing to see whether we need to use the command line options (to force 
the use of UTF-8) or not ... by seeing whether the compiler stores the string 
correctly (i.e. as UTF-8) in the executable without them.
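
For illustration, the kind of check described above can be sketched in C as 
below.  This is not the actual configure test (its source is not shown in this 
thread); the helper names is_utf8_e_acute and literal_is_utf8 are hypothetical.  
The idea is to compile a literal containing a raw non-ASCII character and then 
inspect the bytes the compiler stored for it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of GNUstep): returns 1 if the bytes of `s`
 * are exactly the UTF-8 encoding of U+00E9 (0xC3 0xA9), and 0 otherwise. */
int is_utf8_e_acute(const char *s)
{
    static const unsigned char expected[] = { 0xC3, 0xA9 };

    assert(s != NULL);
    return strlen(s) == sizeof expected
        && memcmp(s, expected, sizeof expected) == 0;
}

/* The literal under test contains a raw non-ASCII character.  Which bytes
 * the compiler stores for it depends on the source and execution character
 * sets in effect at compile time. */
int literal_is_utf8(void)
{
    return is_utf8_e_acute("é");
}
```

A configure-style test program could return a non-zero exit status from main 
when literal_is_utf8() is 0, and the build could then decide whether the 
character set options are needed.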

> This is amazingly fragile, because it requires that the compiler guess that 
> the source file is Latin-1.
> If you want UTF-8 characters in C string literals then you should save the 
> file in UTF-8 format or (better) you should use the correct escape sequences. 

Great idea ... but not what the gcc documentation says ... how would we enforce 
it on our users?
The gcc documentation says the source character set is (by default) whatever the 
current locale says it is (or UTF-8 if the compiler can't determine it from the 
locale) ... unless overridden by the -finput-charset= command line option.  
The check sees whether the compiler behaves according to those rules (in which 
case no command line options are needed), or whether the compiler supports the 
options to specify the character sets (in which case we use those options).  If 
you don't want the check (either you don't have any non-ASCII literals, or you 
are sure your compiler will generate UTF-8 output) you can disable it.
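
As a sketch of the escape-sequence approach mentioned in the quoted text: a 
literal written with hex escapes contains only ASCII in the source file, so its 
bytes should not depend on the locale or on -finput-charset / -fexec-charset 
(hex escapes give the byte values directly).  The names cafe_utf8 and 
utf8_code_points below are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* "cafe" with an acute accent on the final letter, written as the two
 * UTF-8 bytes of U+00E9.  The source text itself is pure ASCII. */
const char cafe_utf8[] = "caf\xC3\xA9";

/* Hypothetical helper: counts code points in a UTF-8 string by skipping
 * continuation bytes (those of the form 10xxxxxx).  Assumes valid UTF-8. */
size_t utf8_code_points(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            n++;
        }
    }
    return n;
}
```

Here the literal occupies five bytes plus the terminator but holds four 
characters.  One pitfall: a hex escape absorbs every hex digit that follows it, 
so when the next character is a hex digit the literal has to be split, as in 
"\xC3\xA9" "bc".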

> If we are depending on the compiler doing this translation anywhere in 
> GNUstep then we should fix that.  Are we? 


We shouldn't be using non-ASCII string literals anywhere in our source.



