Re: still more 2.5.11 comments


From: Bruce Lilly
Subject: Re: still more 2.5.11 comments
Date: Tue, 13 Aug 2002 02:55:03 -0400
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.0) Gecko/20020529

John Millaway wrote:
>> 1. the header produced is a bit large
>> 2. Quite a large fraction of the file is effectively chopped
>>    out by the preprocessor, but that still requires processing
>>    time for every source file that includes the header,
>
> Probably 99% of the file is chopped out by the preprocessor based on a single
> symbol.

The preprocessor pulls in a lot of stuff from the included
system headers, so preprocessing increases the line count
by a factor of two, though character count drops by about
85%.  The "single symbol" is itself defined at the top of
the generated header, so there's no practical way to change
what the preprocessor chops out, short of editing the
header.
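
As a rough illustration of why the includer has no say in what gets discarded, here is a minimal sketch of the pattern in a single file. It is not flex's actual output: the names EXAMPLE_IN_HEADER, EXAMPLE_BODY_COMPILED, example_lex and example_body_compiled are all hypothetical, and the real symbol used by the 2.5.11 header is not named in this thread.

```c
/* --- includer: tries to influence the header, with no effect --- */
#undef EXAMPLE_IN_HEADER

/* --- start of simulated generated header (hypothetical names) --- */
#define EXAMPLE_IN_HEADER 1        /* the header defines its own guard */

int example_lex(void);             /* declaration: survives preprocessing */

#ifndef EXAMPLE_IN_HEADER
/* Scanner body: read by the preprocessor on every inclusion, then dropped. */
#define EXAMPLE_BODY_COMPILED 1
int example_lex(void) { return 1; }
#endif
/* --- end of simulated generated header --- */

/* Reports whether the guarded body reached the compiler (1) or not (0). */
int example_body_compiled(void)
{
#ifdef EXAMPLE_BODY_COMPILED
    return 1;
#else
    return 0;
#endif
}
```

Because the guard is defined inside the header, the #undef in the includer changes nothing, and example_body_compiled() reports that the body was dropped. The preprocessor still has to read the discarded region each time the header is included.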

> It would not be out of the question for flex to process out the bulk of
> the code while generating the header. My concern here is that the improvement
> in compilation time may not be noticeable. My guess is that the flex header is
> small beans for the preprocessor since most of it is discarded.

That depends on the system; older systems with slow disk
I/O will suffer considerably, especially if the header
is included by multiple .c files.

>>    as well as increasing the size of any distribution that includes
>>    the header.  Example: I have a 1000 line, 32 kB .l file; the
>>    generated header is more than 11000 lines and more than half
>>    a megabyte.
>
> Yes, it is basically a copy of the .c file. By my estimate, your bzipped
> distribution will grow by about 26k due to this header file. gzip doesn't do as
> well, though. Either way, it's nothing compared to all the autoconf junk.

I've not used bzip for the distribution and I don't use autoconf;
500 kB+ is significant.  I could slice and dice the header with sed,
awk, m4 and cpp, but it sure would be nice if flex omitted
the executable code in the first place (while still exposing
start conditions and other non-executable macros).  Probably
several of the system headers that are pulled in could also
be eliminated along with the executable code.

>> 3. The start condition definitions are in there too (and they
>>    would be useful) but they're in the part that the preprocessor
>>    discards...
>
> The start conditions do not have prefixes, and will cause conflicts. So they
> will have to be prefixed if they go into the header. Thanks for pointing this
> out!

Start condition names basically have the same namespace constraints
as any preprocessor macros.  They probably shouldn't be changed
from what is specified in the .l file -- it's up to the author to
avoid conflicts.  The names and values should be exactly as they
are in the .l and .c files, as the values may be referenced by
name in other .c files, with the value passed back to the lexical
analyzer. Currently I do use the start conditions; I extract them
from the .c file using sed.  The lexical analyzer calls a gperf-
generated function for keywords and that function returns a start
condition which is then pushed onto the analyzer's stack (it also
returns the value which is returned by the call to yylex).  Please
*don't* add prefixes; if some user wants names such as BLETCH_foo,
then he should specify "%s BLETCH_foo".  The present method (which
is largely compatible with lex) where %s Foo becomes #define Foo N
works fine; the start condition "Foo" is referenced as "Foo" rather
than something completely different such as "BLURFL_GLURGLE_Foo".
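
To make the arrangement described above concrete, here is a minimal sketch of a keyword lookup that hands back both a token and a start condition by name. It is a hand-written stand-in, not gperf output and not my actual code: the struct layout, the token codes, the keywords and the numeric values of Foo and Bar are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Start conditions as plain macros, in the style "%s Foo" produces.
 * The numeric values here are illustrative only. */
#define INITIAL 0
#define Foo     1
#define Bar     2

/* Hypothetical token codes, as would be returned by yylex(). */
enum { TOK_IDENT = 258, TOK_BEGIN, TOK_END };

struct keyword {
    const char *name;
    int token;   /* value the scanner returns from yylex() */
    int start;   /* start condition the scanner should push */
};

/* Hand-written stand-in for a gperf-generated lookup function. */
const struct keyword *lookup_keyword(const char *text)
{
    static const struct keyword table[] = {
        { "begin", TOK_BEGIN, Foo },
        { "end",   TOK_END,   Bar },
    };
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(text, table[i].name) == 0)
            return &table[i];
    return NULL;   /* not a keyword */
}
```

In a scanner built with "%option stack", an action could then call yy_push_state(kw->start) and return kw->token, where kw is the result of lookup_keyword(yytext). This only works if the file containing the table sees the same unprefixed names and values as the scanner, which is why the macros need to reach the header unchanged.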

>> 4. A cross-reference from section 18 of the documentation would
>>    be helpful.
>
> Consider it done.

Thanks.




