help-flex

Re: [Flex] Getting the absolute input file position


From: f_plouff
Subject: Re: [Flex] Getting the absolute input file position
Date: Sat, 16 Jul 2005 13:56:13 -0400
User-agent: Internet Messaging Program (IMP) 3.2.5

Hi Hans Aberg,

Thanks again for your reply =)

> > But how come there is no yyabspos variable or feature?
> >
> > Shouldn't flex support that kind of 'feature' out of the box,
> > why would one have to figure out that much?
>
> It has been discussed on this list, with cons and pros. One problem
> is how to sync it with locations in the Bison generated parsers.
> Perhaps that put it off.

You should really provide it, even if it is only enabled by an explicit
option and comes with a documented caveat for the Bison corner case.
Something like:

%option inputfilepos
%option outputfilepos

> >>> Such that I can fopen, fseek, fread a "context" later on,
> >>> after the entire parsing and processing is performed.
> >>>
> >>
> >> Because of buffering, you can't use such functions directly.
> >>
> >
> > I want to use such function AFTER the parsing... not during the
> > parsing.
>
> I don't think that will help you, because I think the Flex lexer will
> just read a chunk into a buffer,

That buffer is increased up to 256 KB.
However, I have no idea whether things go "wrong"
if the input source file to be parsed is much larger than 256 KB.

> and that is where the file position
> will be after the parsing, not at actual lexing position,
> which then will be in the buffer somewhere.

Like I said, the file is closed after parsing and then reopened:
fopen, fseek, fread, fclose.

I think you mean that only the "code" inside the "case"
has the "precise" lexing location, and that the current file position
is not the absolute one. Fortunately, in my case, the only two rules
I'm talking about are the ones for "{" and "}".

So as long as every other rule advances the file offset properly,
I'm in business.

The corner case that currently does not "work", and the reason I need
this workaround, is that if the input source code lacks sufficient '\n'
characters, the returned context is wrong.

For instance, if the entire source code is on one line,
the context is the entire source code, not the "function body".
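
To make the fopen/fseek/fread idea concrete, here is a minimal sketch of
reading a "context" back after parsing has finished. It assumes the scanner
has already stored the absolute byte offsets of the "{" and of the byte just
past the matching "}". The name read_span and its parameters are made up for
illustration; they are not part of flex.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: return a malloc'ed, NUL-terminated copy of the bytes
 * in [start, end) of the file at `path`, or NULL on any error.
 * `start` and `end` are absolute byte offsets recorded during lexing. */
static char *read_span(const char *path, long start, long end)
{
    assert(path != NULL);
    if (start < 0 || end < start)
        return NULL;

    /* Binary mode, so that offsets are plain byte counts on every platform. */
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    size_t len = (size_t)(end - start);
    char *buf = malloc(len + 1);
    if (buf == NULL || fseek(fp, start, SEEK_SET) != 0) {
        free(buf);
        fclose(fp);
        return NULL;
    }

    size_t got = fread(buf, 1, len, fp);
    fclose(fp);
    if (got != len) {   /* the span runs past the end of the file */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

Since the file is opened fresh, the lexer's own buffering does not matter
here; the only thing that has to be right is the pair of stored offsets.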


> > I did a search for %option:
>
> > %option yylineno
>
> In a lexer .c file with this option on, there is a segment right
> before the lexer switch statement that looks something like:


This seems more "acceptable".

How does it handle unput() and similar ways of feeding text back into the input?


>          YY_DO_BEFORE_ACTION;
>
>          if ( yy_act != YY_END_OF_BUFFER && yy_rule_can_match_eol[yy_act] )
>              {
>              int yyl;
>              for ( yyl = 0; yyl < yyleng; ++yyl )
>                  if ( yytext[yyl] == '\n' )
>                      yylineno++;
>              }

I could try that:

if ( yy_act != YY_END_OF_BUFFER && yy_rule_can_match_eol[yy_act] )
{
  int yyl;
  for ( yyl = 0; yyl < yyleng; ++yyl )
    if ( yytext[yyl] ) yyfilepos++;
}

which looks similar to:

if ( yy_act != YY_END_OF_BUFFER && yy_rule_can_match_eol[yy_act] )
{
  yyfilepos += yyleng;
}

I didn't do that check before...
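
One caveat with the two snippets above: yy_rule_can_match_eol[yy_act] is only
set for rules whose pattern can match a newline, so a byte counter updated
inside that guard would skip most tokens. The sketch below is not flex code;
it models each match as a (length, can-match-eol) pair to show the difference.
The struct and function names are hypothetical. In a real scanner, the
unconditional update could probably live in YY_USER_ACTION, which the flex
manual describes as a macro run before every matched rule's action.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for one matched token: its length (yyleng) and whether the
 * matching rule could contain a newline (yy_rule_can_match_eol[yy_act]). */
struct match {
    int leng;
    int can_match_eol;
};

/* Byte count when the counter is only advanced inside the yylineno guard. */
static long count_guarded(const struct match *m, size_t n)
{
    long pos = 0;
    for (size_t i = 0; i < n; ++i) {
        assert(m[i].leng >= 0);
        if (m[i].can_match_eol)
            pos += m[i].leng;
    }
    return pos;
}

/* Byte count when the counter is advanced on every match. */
static long count_every_match(const struct match *m, size_t n)
{
    long pos = 0;
    for (size_t i = 0; i < n; ++i) {
        assert(m[i].leng >= 0);
        pos += m[i].leng;
    }
    return pos;
}
```

For an input such as "a{\nb}" scanned as five one-byte tokens, only the
newline rule passes the guard, so the guarded counter ends at 1 while the
unguarded one ends at 5, the real length.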

Unfortunately, this is what I tried previously:

/*
#define g_filepos ( ((int)(yy_c_buf_p - yy_current_buffer->yy_ch_buf))>0 ?
g_ftell + (int)(yy_c_buf_p - yy_current_buffer->yy_ch_buf) : -1 )
*/
#define g_filepos_store g_realfilepos //( g_curly_pos )

static int g_realfilepos  = 0;
static int g_filepos      = 0;
static int g_filepos_next = 0;

static int inputStringFPos = 0;

#define g_filepos_store_start   g_filepos; /*fprintf(stdout,"START_FPOS[%d]
cbufp[%x]-chbuf[%x] bufpos[%d] bsize[%d] bnchars[%d] nchars[%d] ftell[%d]
cbufp:str[%s][%s] action[%d] prog[%s] ppos[%d]\n", g_filepos_store,
(int)yy_c_buf_p, (int)yy_current_buffer->yy_ch_buf,
(int)yy_current_buffer->yy_buf_pos, (int)yy_current_buffer->yy_buf_size,
(int)yy_current_buffer->yy_n_chars, (int)yy_n_chars, (int)g_ftell, yy_c_buf_p,
yy_current_buffer->yy_ch_buf, yy_act, current->program, current->programPos
);*/

#define g_filepos_store_end     g_filepos; /*fprintf(stdout,"  END_FPOS[%d]
cbufp[%x]-chbuf[%x] bufpos[%d] bsize[%d] bnchars[%d] nchars[%d] ftell[%d]
cbufp:str[%s][%s] action[%d] prog[%s] ppos[%d]\n", g_filepos_store,
(int)yy_c_buf_p, (int)yy_current_buffer->yy_ch_buf,
(int)yy_current_buffer->yy_buf_pos, (int)yy_current_buffer->yy_buf_size,
(int)yy_current_buffer->yy_n_chars, (int)yy_n_chars, (int)g_ftell, yy_c_buf_p,
yy_current_buffer->yy_ch_buf, yy_act, current->program, current->programPos
);*/


static int g_ftell = 0;
static int g_curly_pos = 0;

// ....

do_action:      /* This label is used only to access EOF actions. */

int yyl2 = yyleng;

if ( !yytext[0] ) --yyl2;

g_filepos = g_filepos_next;
g_filepos_next += yyl2;

and in some 'case' blocks:

  current->bodyPos = g_filepos_store_start;

  current->bodyPos = g_filepos_store_end;


and, obviously, that stopped working correctly after a few rules.



> Here, you should introduce a new variable besides yylineno that
> counts bytes instead, and figure out how to update it. Then put that
> alteration into your skeleton file, and make sure it is used when
> compiling the .l file with Flex.
>
> >> So there seems to be essentially two methods,
> >> first, add a byte count in each rule,
> >>
> >
> > As you can see, it's far from an *easy* grammar,
> > so modifying each rule internal without breaking anything
> > would be quite challenging.
> >
> > It has thousands of inter-related rules,
> > and a minimally coupled 50,000 LOC,
> > that's if the other flex modules are ignored from the equation
> > and the calls to the "core context manipulation" is ignored also.
> >
> > Unless you have a way to know how much gets consumed, unput, or
> > otherwise manipulated
> > without adding a counter everywhere in 50 KLOC.
>
> I think that some of the proponents of not adding the feature to
> Flex, argued this was fairly straightforward; perhaps you can get
> some help from those, if the other method above does not work for
> you. :-)
>
> >> and second, check out how %option yylineno is implemented, and
> >> tweak the skeleton file for your purposes.
> >>
> >
> > I don't have the skeleton file  :-|
>
> If you have Flex, you have the skeleton file, as it is used to
> produce the lexer output in every Flex compile. In later flex
> versions, it is called flex.skl.
>
> > However, every cPP file says:
> >
> > /root/flex/flex/skel.c
> >
> > Looks like a "default skeleton".
>
> The problem is that you have an old Flex version, most people on this
> list have forgotten about. The latest version can be gotten from CVS
> flex.sourceforge.net, or lex.sourceforge.net (two different
> addresses, with different contents).

The project does not seem to be indexed by Google: a search for flex or lex
does not turn up the SourceForge project. Please register it with the Open
Directory; it should then show up on Yahoo and Google about a week later...

http://dmoz.org/Computers/Open_Source/Software/


The code has the following scary warning:

#if !defined(YY_FLEX_SUBMINOR_VERSION)
extern "C" { // some bogus code to keep the compiler happy
  void codeYYdummy() { yy_flex_realloc(0,0); }
}
#else
#error "You seem to be using a version of flex newer than 2.5.4. These are currently incompatible with 2.5.4, and WILL NOT WORK! Please use version 2.5.4 or expect things to be parsed wrongly! A bug report has been submitted."
#endif



Sincerely yours,
Frederic Plouffe, B.Eng.





