bug-gnu-utils

Re: RFC: enum instead of #define for tokens


From: Hans Aberg
Subject: Re: RFC: enum instead of #define for tokens
Date: Thu, 4 Apr 2002 19:44:37 +0200

At 17:31 +0200 2002/04/04, Akim Demaille wrote:
>Hans> Right now, I do not know how scanner errors are handed over to
>Hans> the parser -- perhaps there should be a macro for that in the
>Hans> .tab.h header.
>
>Everything out of range is mapped to $undefined.

If the scanner detects an error, one would want to be able to return an
error code to the Bison-generated parser. How do you do that?

I have put a
  #define YYERRCODE     256
in my .l file, so I can return YYERRCODE, but this use is not supported.
Shouldn't a macro like this be provided in the .tab.h header?

One alternative might be
  #define YYTOKENERROR 257
The Bison-generated parser could then process the error without first
calling yyerror (assuming the lexer has already produced the error
message).
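
As a rough sketch of what returning such a code from the scanner could
look like: toy_yylex and TOK_NUMBER are made up for illustration, and
YYERRCODE is the hand-written define from above, not something the .tab.h
header is known to provide. How the parser reacts to the returned code is
not shown here.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Assumption: defined by hand, as in the message above; this use is
   not a supported part of the Bison interface. */
#define YYERRCODE 256
/* Hypothetical token code, for this sketch only. */
#define TOK_NUMBER 258

/* Toy scanner: reads one token starting at *cursor and advances it.
   Returns 0 at end of input, TOK_NUMBER for a run of digits, and
   YYERRCODE for any character it does not recognize. */
static int toy_yylex(const char **cursor)
{
    const char *p;

    assert(cursor != NULL && *cursor != NULL);
    p = *cursor;

    /* Skip blanks between tokens. */
    while (*p == ' ' || *p == '\t')
        p++;

    if (*p == '\0') {
        *cursor = p;
        return 0;
    }
    if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))
            p++;
        *cursor = p;
        return TOK_NUMBER;
    }

    /* Unrecognized input: consume one character and report the error
       to the caller through the return value. */
    *cursor = p + 1;
    return YYERRCODE;
}
```

With the input "12 @ 7", successive calls would yield TOK_NUMBER,
YYERRCODE, TOK_NUMBER and then 0.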

>>> All this discussion is anyway not taking into account the impact
>>> that Unicode can have on the size of the Bison tables.
...
>Hans> I am not sure what you are speaking about here:
...
>Where do you put them?  What's the impact on the size of the tables?

Sorry; you are right: yytranslate[] stores the characters in a flat array
indexed by character code (but that seems to be the only problematic
table). For Unicode, one will need some kind of table compression scheme.

Perhaps the macro YYTRANSLATE should be changed so that, for characters in
the Unicode range, it calls a function that does a binary search in a
sorted table.
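
A minimal sketch of that idea, under assumptions: none of the names below
(translate, flat_table, wide_table, SYM_UNDEFINED) come from Bison, and
the symbol numbers are invented. Codes below 256 use a flat array, as
yytranslate[] does; larger code points go through bsearch() on a table
sorted by code point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical symbol number for "undefined", for this sketch only. */
#define SYM_UNDEFINED 2

/* Flat table for codes 0..255; entries left at 0 count as undefined
   in this sketch. */
static const unsigned char flat_table[256] = {
    ['+'] = 3,
    ['*'] = 4,
};

struct wide_entry {
    unsigned long code_point;  /* Unicode code point, >= 256 */
    int symbol;                /* internal symbol number */
};

/* Hypothetical sparse table; must stay sorted by code_point. */
static const struct wide_entry wide_table[] = {
    { 0x0153UL, 10 },
    { 0x03BBUL, 11 },
    { 0x2192UL, 12 },
    { 0x1F600UL, 13 },
};
#define WIDE_COUNT (sizeof wide_table / sizeof wide_table[0])

/* Returns 1 if wide_table is strictly increasing by code point. */
static int wide_table_is_sorted(void)
{
    size_t i;
    for (i = 1; i < WIDE_COUNT; i++)
        if (wide_table[i - 1].code_point >= wide_table[i].code_point)
            return 0;
    return 1;
}

/* bsearch() comparison: key is an unsigned long, elem a table entry. */
static int compare_entry(const void *key, const void *elem)
{
    unsigned long k = *(const unsigned long *)key;
    const struct wide_entry *e = elem;
    return (k > e->code_point) - (k < e->code_point);
}

/* Maps a character code to a symbol number. */
static int translate(unsigned long c)
{
    const struct wide_entry *hit;

    if (c < 256)
        return flat_table[c] ? flat_table[c] : SYM_UNDEFINED;

    assert(wide_table_is_sorted());  /* binary search precondition */
    hit = bsearch(&c, wide_table, WIDE_COUNT, sizeof wide_table[0],
                  compare_entry);
    return hit ? hit->symbol : SYM_UNDEFINED;
}
```

The flat part keeps the common case a single array access, while the
sparse part grows with the number of code points the grammar actually
uses rather than with the size of the Unicode range.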

>Plus all the needed stuff to deal with the different encodings of
>Unicode.

I think that one will most likely only deal with UTF-n, where n >= 21. :-)
This avoids variable-width characters. For other encodings, rely on
external code converters.

  Hans Aberg




