help-flex

Unicode/UTF-8 Flex


From: Hans Aberg
Subject: Unicode/UTF-8 Flex
Date: Sat, 27 Nov 2004 18:29:53 +0100
User-agent: Microsoft-Outlook-Express-Macintosh-Edition/5.0.6

Is somebody actively working on extending Flex to Unicode/UTF-8?
  Hans Aberg


UTF-8 has several interesting properties that might make it possible to
extend Flex to it directly. One is that the 7-bit ASCII characters do not
occur as part of the encoding of any other (multibyte) character. So any
original 7-bit Flex pattern should remain valid and match ASCII only. This
might mean that one can, as needed, implement entirely new Flex patterns for
the other, higher Unicode characters when using UTF-8.
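
The ASCII property can be checked by brute force. The following is only a
minimal C sketch, assuming the encoding rules of RFC 3629; the function
names utf8_encode and multibyte_sequences_avoid_ascii are made up for this
illustration and are not part of Flex.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode scalar value as UTF-8 (rules of RFC 3629).
   Returns the number of bytes written (1..4), or 0 if cp is a
   surrogate or lies above U+10FFFF. Hypothetical helper, not Flex API. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {                      /* plain 7-bit ASCII */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* two-byte sequence */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                   /* three-byte sequence */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                     /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* four-byte sequence */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Returns 1 if no byte below 0x80 appears in the encoding of any
   non-ASCII code point, and 0 otherwise. */
int multibyte_sequences_avoid_ascii(void)
{
    unsigned char buf[4];
    for (uint32_t cp = 0x80; cp <= 0x10FFFF; ++cp) {
        size_t n = utf8_encode(cp, buf);  /* n == 0 for surrogates */
        for (size_t i = 0; i < n; ++i)
            if (buf[i] < 0x80)
                return 0;
    }
    return 1;
}
```

Every lead and continuation byte has its high bit set, so a byte-oriented
pattern written for 7-bit input cannot accidentally match in the middle of
a multibyte character.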

It also mentions:
Starting with GNU glibc 2.2, the type wchar_t is officially intended to be
used only for 32-bit ISO 10646 values, independent of the currently used
locale. This is signalled to applications by the definition of the
__STDC_ISO_10646__ macro as required by ISO C99.

This means that the "Unicode" wchar_t patch for Flex will in practice be
unusable, as it will try to create tables of size 2^32.
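
To put a number on this, here is a rough C sketch of the arithmetic. It
assumes an uncompressed transition table with one entry per input symbol per
DFA state, and a 2-byte entry; both are assumptions made for illustration,
and row_bytes is a hypothetical helper, not anything taken from Flex or the
patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for ONE state's row of an uncompressed transition table,
   given the width of an input symbol in bits and the size of one table
   entry in bytes. Hypothetical helper for illustration only. */
uint64_t row_bytes(unsigned symbol_bits, unsigned entry_bytes)
{
    assert(symbol_bits < 64);             /* keep the shift well defined */
    return ((uint64_t)1 << symbol_bits) * entry_bytes;
}
```

Under these assumptions, row_bytes(8, 2) is 512 bytes per state for a
byte-oriented scanner, while row_bytes(32, 2) is 8589934592 bytes (8 GiB)
per state for a 32-bit wchar_t.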




