RE: Flex and 32-bits characters
From: Hans-Bernhard Broeker
Subject: RE: Flex and 32-bits characters
Date: Mon, 26 Aug 2002 11:04:13 +0200 (MET DST)
On Mon, 26 Aug 2002, Mark Weaver wrote:
> So now maybe I'm matching names with an expression like:
>
> [A-Za-z]+[A-Za-z0-9_]*
>
> How do I say that for a unicode flex?
Seems to me the obvious answer would be:
[[:alpha:]]([[:alnum:]]|_)*
Flex does have character classes, and Unicode is definitely one case where
they should take precedence over old-fashioned [a-z] & friends.
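For readers who want to see what such a pattern accepts: flex character class expressions are written inside a bracket expression ([[:alpha:]], not bare [:alpha:]), and the flex manual describes them as equivalent to the corresponding C isXXX functions. The following is only a hand-written sketch of that equivalence for single-byte input; match_name is a hypothetical helper, not anything flex generates, and it does nothing for Unicode by itself.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not part of flex): returns the length of the longest
 * prefix of s that matches [[:alpha:]]([[:alnum:]]|_)*, or 0 if s does not
 * start with an alphabetic character. Classification follows the current
 * C locale, one byte at a time. */
size_t match_name(const char *s)
{
    /* isalpha() and friends require values representable as unsigned char. */
    const unsigned char *p = (const unsigned char *)s;
    size_t n;

    if (!isalpha(p[0]))
        return 0;

    n = 1;
    while (isalnum(p[n]) || p[n] == '_')
        n++;

    return n;
}
```

In the default "C" locale, match_name("foo_1 bar") stops at the space, and a string starting with a digit or underscore is rejected.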
> Oh nearly forgot. Yes, keeping track of UTF-16 strings is a little bit of a
> pain, but it's not too vast. There are macros provided that will help you
> iterate through a string.
Iteration isn't enough. Not by a wide margin, I think. The real problem
is free-flow navigation. Flex expects strings to be freely accessible
arrays. It expects to be able to index state arrays by input character, so
I don't see how it will ever be able to work with a variable-length
representation.
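To make the indexing point concrete, here is a deliberately simplified table-driven scanner. It is a sketch under stated assumptions, not flex's actual generated code (real flex tables are compressed and use equivalence classes); the names next_state, build_table and scan_name are invented for illustration. What matters is the lookup next_state[state][buf[i]]: every input unit is used directly as a column index, which only works when one array element is one character.

```c
#include <stddef.h>

enum { S_START, S_NAME, S_REJECT, NUM_STATES };
enum { NUM_SYMBOLS = 256 };  /* one column per possible byte value */

/* Hypothetical transition table: next_state[state][input byte]. */
static unsigned char next_state[NUM_STATES][NUM_SYMBOLS];

static int is_ascii_alpha(int c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

static int is_ascii_digit(int c)
{
    return c >= '0' && c <= '9';
}

/* Fill the table for the ASCII pattern [A-Za-z][A-Za-z0-9_]*. */
void build_table(void)
{
    for (int s = 0; s < NUM_STATES; s++)
        for (int c = 0; c < NUM_SYMBOLS; c++)
            next_state[s][c] = S_REJECT;

    for (int c = 0; c < NUM_SYMBOLS; c++) {
        if (is_ascii_alpha(c)) {
            next_state[S_START][c] = S_NAME;
            next_state[S_NAME][c] = S_NAME;
        } else if (is_ascii_digit(c) || c == '_') {
            next_state[S_NAME][c] = S_NAME;
        }
    }
}

/* Returns the length of the name at the start of buf, or 0 if none. */
size_t scan_name(const unsigned char *buf, size_t len)
{
    int state = S_START;
    size_t i = 0;

    while (i < len) {
        /* The input byte itself is the column index. */
        int next = next_state[state][buf[i]];
        if (next == S_REJECT)
            break;
        state = next;
        i++;
    }
    return state == S_NAME ? i : 0;
}
```

With 8-bit input the table has 256 columns. Indexing by full Unicode code point in the same naive way would need 0x110000 columns per state, and indexing by UTF-16 code unit breaks the one-element-per-character assumption as soon as a surrogate pair appears.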
> And I would simply pass yyleng as the correct character count, and
> yytext as the UTF-16 string. The user can take it from there.
Not efficiently. Let's say the user needs a copy of the current yytext
for later reference. yyleng alone doesn't tell him how much memory to
allocate. So he'll have to run (a UTF-16 version of) strlen() over the
result, just as if yyleng hadn't been available in the first place.
Either that, or waste memory. AFAICS, both length (=number of characters)
and size (=number of bytes) of the current yytext would have to be
exported by flex. And having the two of them would probably confuse
beginning users endlessly.
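A small sketch of why the two numbers differ for UTF-16. The function names are hypothetical and the input is assumed to be well-formed UTF-16 (every surrogate properly paired); this is not an interface flex provides. A character outside the Basic Multilingual Plane occupies two 16-bit code units, so the character count alone underestimates the storage needed for a copy.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Number of characters (code points) in `units` UTF-16 code units.
 * Assumes well-formed input: each low surrogate (0xDC00..0xDFFF) is the
 * second half of a pair, so it is not counted as a separate character. */
size_t utf16_char_count(const uint16_t *s, size_t units)
{
    size_t chars = 0;

    for (size_t i = 0; i < units; i++) {
        if (s[i] < 0xDC00 || s[i] > 0xDFFF)
            chars++;
    }
    return chars;
}

/* Storage in bytes needed for `units` code units, without a terminator. */
size_t utf16_byte_size(size_t units)
{
    return units * sizeof(uint16_t);
}

/* Copies a token; the caller must pass the code-unit count, because the
 * character count does not determine how much memory to allocate.
 * Returns NULL on allocation failure; the caller frees the result. */
uint16_t *utf16_copy(const uint16_t *s, size_t units)
{
    uint16_t *copy = malloc(utf16_byte_size(units + 1));

    if (copy == NULL)
        return NULL;
    memcpy(copy, s, utf16_byte_size(units));
    copy[units] = 0;  /* terminating code unit */
    return copy;
}
```

For a token such as 'a', one surrogate pair, 'b', the length is 3 characters while the buffer holds 4 code units, that is 8 bytes; a scanner reporting only the 3 would leave the user to rescan the text before copying it.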
--
Hans-Bernhard Broeker (address@hidden)
Even if all the snow were burnt, ashes would remain.