Flex and 32-bits characters
From: Antoine Fink
Subject: Flex and 32-bits characters
Date: Mon, 26 Aug 2002 15:37:33 -0400
(I'm sending this message a second time because I was not registered on the
help-flex mailing list the first time I sent it, so I might have missed a few
answers.)
I am currently working on a regular expression parser, using Flex and Yacc, and
I want to be able to parse either ASCII, UTF-8 or UCS-4 strings.
The general idea was to convert any input to UCS-4 (using 32-bit characters),
parse the regex, then convert back (whenever possible) to the specified matching
encoding. (That part was already done some time ago, when we used our own C
parsing program instead of Flex & Yacc, so it is not really the issue.)
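
For illustration only, here is a minimal sketch of the UTF-8 to UCS-4 step
described above. It is not our actual conversion code: the function name
utf8_to_ucs4 and its signature are hypothetical, and I am assuming C99 and that
code points are limited to the Unicode range (up to 0x10FFFF).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode a UTF-8 byte string into UCS-4 code points.
 * Returns the number of code points written, or (size_t)-1 if the input is
 * malformed or the output buffer is too small. */
size_t utf8_to_ucs4(const unsigned char *in, size_t in_len,
                    uint32_t *out, size_t out_cap)
{
    /* Smallest code point that legitimately needs 1, 2, 3 or 4 bytes. */
    static const uint32_t min_cp[4] = { 0x0, 0x80, 0x800, 0x10000 };
    size_t i = 0, n = 0;

    assert(in != NULL && out != NULL);

    while (i < in_len) {
        unsigned char b = in[i];
        uint32_t cp;
        size_t extra;   /* number of continuation bytes that follow */

        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else return (size_t)-1;      /* stray continuation or invalid lead */

        if (extra > in_len - i - 1)
            return (size_t)-1;       /* sequence truncated at end of input */

        for (size_t k = 1; k <= extra; k++) {
            unsigned char c = in[i + k];
            if ((c & 0xC0) != 0x80)
                return (size_t)-1;   /* not a continuation byte */
            cp = (cp << 6) | (c & 0x3F);
        }

        /* Reject overlong encodings, surrogates and out-of-range values
         * (assumption: Unicode range only, not the full 31-bit UCS-4). */
        if (cp < min_cp[extra] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF))
            return (size_t)-1;

        if (n == out_cap)
            return (size_t)-1;       /* output buffer full */
        out[n++] = cp;
        i += extra + 1;
    }
    return n;
}
```

The resulting uint32_t array is what I would like the generated lexer to scan
directly, one 32-bit element per character, instead of a char buffer.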
The problem is that I am unable to make Flex read 32-bit characters in an easy
fashion. (I tried, say, typedef'ing chars to 32-bit integers, or re-#define'ing
chars as 32-bit integers, but that won't work at all, for numerous reasons.)
After reading lots of posts on the web, I have found some ways to accomplish
this 32-bit character lexing, and the one that makes the most sense to me is to
(locally) modify Flex's own source code and make it generate lexers that use
32-bit integers instead of chars.
I thought of using Mark Weaver's Unicode patch description as a basis for my
own modifications. I've included this patch description as an attachment
(flex-unicode-patch-mweaver-01-24-2002.txt). It can also be found on geocrawler
in the 'Unicode Support In Flex' thread of the GNU - help-flex - Help Flex
mailing list: http://www.geocrawler.com/archives/3/353/2002/1/0/7639113/
If you have a minute to look at it, I'd be interested in your opinion of the
approach he used, so that I can do as well or even better, and thus produce
something that will actually be useful for you and all future Flex users.
Thank you very much.
------------
Antoine Fink,
Co-op Software Designer
Solidum Systems corp http://www.solidum.com
(613)724-6004 x268 address@hidden