gnustep-dev
Re: New ABI NSConstantString


From: Ivan Vučica
Subject: Re: New ABI NSConstantString
Date: Sat, 07 Apr 2018 10:03:02 +0000


On Sat, Apr 7, 2018, 10:49 Richard Frith-Macdonald <address@hidden> wrote:


> On 7 Apr 2018, at 10:21, Ivan Vučica <address@hidden> wrote:
>
> On Sat, Apr 7, 2018, 09:50 David Chisnall <address@hidden> wrote:
>
>
> My current plan is to make the format support ASCII, UTF-8, UTF-16, and UTF-32, but only generate ASCII and UTF-16 in the compiler and then decide later if we want to support generating UTF-8 and UTF-32.  I also won’t initialise the hash in the compiler initially, until we’ve decided a bit more what the hash should be.
>
> Emojis don't fit UTF-16. Even if one dismisses CJK, ancient scripts, etc., constant strings are not at all unlikely to contain emojis.
>
> Not supporting UTF-8 for internal storage may be reasonable, but not supporting UTF-32 for strings that require it seems like a bug.

Everything fits in UTF-16 (or UTF-8 for that matter).  However, it's true that many/most emojis don't fit in a *single* 16-bit value and require two 16-bit UTF-16 values (or multiple 8-bit UTF-8 values) to encode them.
Since the NSString APIs assume a 16-bit character width, an emoji will generally be treated as two characters as far as those APIs are concerned, but that's not really a problem, and current gnustep-base can/does work for emojis (for instance, sending UTF-16 to mobile phones).
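
To illustrate the point about two 16-bit values, here is a minimal sketch in C of the standard UTF-16 surrogate-pair encoding. The function name utf16_encode is hypothetical and is not part of gnustep-base or the compiler; it only shows why a code point above U+FFFF ends up as two NSString "characters".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-16.
 * Writes 1 or 2 code units to out and returns how many were written,
 * or 0 if cp is not a valid Unicode scalar value. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    /* Reject values beyond U+10FFFF and the reserved surrogate range. */
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        return 0;
    }
    /* Code points up to U+FFFF fit in a single 16-bit value. */
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }
    /* Everything else is split into a high and a low surrogate,
     * each carrying 10 of the remaining 20 bits. */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 + (cp >> 10));
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));
    return 2;
}
```

For example, U+1F600 (a common emoji) encodes as the pair 0xD83D 0xDE00, whereas U+0041 ("A") stays a single value.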

Acknowledged. I guess I never looked up the representation of characters with codepoints >64k in UTF-16.

Thanks to both for the clarification!
