help-flex

Parsing input from a stream... [Was: Re: Parsing input from a string...]


From: Dave Trombley
Subject: Parsing input from a stream... [Was: Re: Parsing input from a string...]
Date: Thu, 31 Jan 2002 14:47:30 -0500
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.2) Gecko/20010628

John W. Millaway wrote:

> In the next release, you will be able to change the initial buffer size.
> Currently, people do this with sed/perl by redefining YY_BUF_SIZE, which is by
> default 16k.

I've downloaded the developer's pre-release from your website, and I've been playing with that.

> What do you want to know about the buffering? By default, flex requests as many
> bytes "up front" as it can get, or as much as it needs to match something.
> Obviously, you can change this behavior by returning a different # of bytes
> from YY_INPUT than flex requested.

I suppose what I'm interested in is understanding how the yy_*buffer* functions work. Correct me if I'm mistaken, but they seem to assume, to a large degree, that files will be the underlying data source for the buffers (although there are functions specifically for copying strings into a new buffer) and, more broadly, that all of the data will be available by the time the lexer entry point is reached.

What I'd really like is a parser/lexer pair which is fully reentrant, with the lexer draining a stream until either the parse terminates or the stream is empty. In the latter case, I'd like the parser/lexer to block on stream input until more is available. It seems I could implement this in 2.5.6, especially given that you can pass extra data along in a reentrant lexer, but I'm having trouble because I don't know the exact contract for the buffer functions. (For example, should I assume that YY_INPUT will only ever be called from a single place? How can I access the extra data from that place? Is that data placed into a flex buffer? Is there a lower-level way of getting input to the lexer, since my MT buffers will be around anyway?)
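
To make the blocking-stream idea concrete, here is a rough sketch of the kind of input routine I have in mind. It is only an illustration: struct stream_source and stream_fill() are hypothetical names, not part of flex, and the commented-out YY_INPUT hook assumes that the extra data (yyextra) is visible wherever the reentrant scanner expands YY_INPUT, which I have not confirmed for the pre-release. The routine blocks until at least one byte arrives, may return fewer bytes than were requested, and returns 0 only when the stream has ended.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical per-scanner state; a reentrant scanner could carry a
 * pointer to this as its "extra data". */
struct stream_source {
    int fd;   /* readable end of a pipe or socket, in blocking mode */
};

/* Copy up to max_size bytes from the stream into buf.
 * Blocks until at least one byte is available.  Returns the number of
 * bytes stored, or 0 at end of stream or on a read error, which is the
 * value YY_INPUT uses to signal end-of-file. */
int stream_fill(struct stream_source *src, char *buf, size_t max_size)
{
    ssize_t n;

    assert(src != NULL && buf != NULL);
    do {
        n = read(src->fd, buf, max_size);
    } while (n < 0 && errno == EINTR);   /* retry if interrupted by a signal */

    return n > 0 ? (int)n : 0;
}

/* Possible hook in the definitions section of a reentrant scanner
 * (assumption: yyextra is in scope where YY_INPUT is expanded):
 *
 *   #define YY_EXTRA_TYPE struct stream_source *
 *   #define YY_INPUT(buf, result, max_size) \
 *       ((result) = stream_fill(yyextra, (buf), (max_size)))
 */
```

With something like this, the lexer would simply sleep inside read() whenever the stream is temporarily empty, and would see end-of-file only once the writing side has closed the stream.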

Are there any plans/thoughts about making the buffer system extensible and abstract? Do you think it would be possible/desirable for me to attempt this, and could it be done without sacrificing performance?

   Cheers,

   -dj





