help-bison

Re: how to get left hand side symbol in action


From: Akim Demaille
Subject: Re: how to get left hand side symbol in action
Date: Fri, 10 May 2019 07:24:51 +0200

Hi Christian,

> On 9 May 2019, at 13:22, Christian Schoenebeck <address@hidden> wrote:
> 
> On Thursday, 9 May 2019 08:50:27 CEST Akim Demaille wrote:
>>> Perhaps a variation of $ and @ that gives access to the name,
>> 
>> I am very uncomfortable with this.  Symbol names are technical details;
>> most of the time they are irrelevant to the end user, just like the
>> user of a piece of software does not care about the names of the
>> functions: that's an implementation detail.
> 
> That's actually a very commonly required feature in practice. The main use
> cases are:
> 
> 1. Auto completion features
> 
> 2. Human friendly error messages

I think you are referring to the names of the tokens, not of all the symbols.
For the error messages, it makes sense.  Although I am now more convinced
that most of the time, error messages are more readable when you quote
the source exactly than when you print token names.
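
To make "quote exactly the source" concrete, here is a minimal sketch in C.
The helper quote_source is hypothetical and not part of Bison; it only
mimics the line-plus-caret layout that Bison uses for its own diagnostics,
and it assumes the line contains no tabs or multi-column characters.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not a Bison API: write the offending source
   line followed by a caret under the 1-based column COL into OUT.
   Return the number of characters written, or -1 if OUT is too small.  */
static int
quote_source (char *out, size_t size, const char *line, int col)
{
  assert (col >= 1);
  /* "%*s" right-justifies "^" in a field of width COL, which yields
     COL - 1 spaces followed by the caret.  */
  int n = snprintf (out, size, "%s\n%*s\n", line, col, "^");
  return (n < 0 || (size_t) n >= size) ? -1 : n;
}
```

Called with the line "%token: FOO" and column 7, it puts the caret under
the colon, so the reader sees the actual text rather than a token name.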

> I do need these features for almost all parsers, so for years (since they
> are not directly available in Bison) I have maintained a large amount of
> code on top of the internal skeleton code to achieve them.

Is it available for reading somewhere?  Was the feature always fitting
perfectly?  Never ever did it result in something somewhat incorrect?

> With the obvious problems:
> 
> - since it is based on skeleton internals, that extra code might break with
>  a new Bison version (which it already did several times)
> 
> - since the generated parser tables are a compressed representation of the
>  grammar, designed just to resolve grammar rules efficiently at runtime,
>  information is missing for the tasks above and must be extrapolated
>  with additional custom code to achieve those features ATM

Aren't you referring to LAC (lookahead correction), as implemented in Bison?

https://www.gnu.org/software/bison/manual/html_node/LAC.html


>> In addition, tokens have several names: the identifier, and the
>> string name, like
>> 
>> %token <string> ID "identifier"
>> 
>> Not to mention that I also want to provide support for
>> internationalization.  So what name should that be? ID?
>> identifier? or identifiant in French?
> 
> In practice you just need the symbol name as is. Nobody needs the translation,

I beg to disagree.  Nobody should translate the keyword "break",
but

> # bison /tmp/foo.y
> /tmp/foo.y:1.7: erreur: erreur de syntaxe, : inattendu, attendait char ou 
> identifier ou <tag>
>     1 | %token: FOO
>       |       ^

looks stupid; "char", "identifier" and "<tag>" should be translated.
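
A minimal sketch of that distinction in C, assuming a tiny hand-rolled
catalog (catalog_fr and translate_fr are hypothetical names; a real
program would more likely go through gettext, and the French entry for
"char" is only illustrative).  Descriptive names are looked up, while
anything absent from the catalog, such as the keyword "break", comes back
unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical catalog entry: a user-facing name and its French form.  */
struct entry
{
  const char *msgid;
  const char *fr;
};

/* Only descriptive names are listed; keywords are deliberately absent.  */
static const struct entry catalog_fr[] = {
  { "identifier", "identifiant" },
  { "char", "caractère" },      /* illustrative translation */
};

/* Return the French form of NAME, or NAME itself if it has none.  */
static const char *
translate_fr (const char *name)
{
  assert (name != NULL);
  for (size_t i = 0; i < sizeof catalog_fr / sizeof catalog_fr[0]; ++i)
    if (strcmp (catalog_fr[i].msgid, name) == 0)
      return catalog_fr[i].fr;
  return name;
}
```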

> or if somebody really needs it, then anybody can achieve this very, very
> easily on their own. That's trivial.



> The main reason why people are asking for support on Bison side for the
> features discussed here is that it is currently not trivial to achieve them
> with one's own custom code. You do need to have a profound knowledge of how
> the internal skeleton algorithm works to be able to extrapolate the missing
> information by custom code on top of it.

I understand that, and I am willing to help.  However, I want to make
sure:

1. there is a real and valid need for the feature, which I still need
   to be convinced of, especially because symbol names are technical
   details!  Often one needs to massage grammars to have them be LALR,
   which introduces auxiliary symbols that have no intrinsic meaning
   for the language itself.

   GLR grammars are quite a different beast, precisely because they
   don't have to be massaged, they are more "natural".  But they can
   still be "uglified" by "useless" intermediate nonterminals to deal
   with precedence.

2. that the real feature is really these names, not something else,
   such as improved error messages.  In which case, what we need to do
   (and it's been on my to-do list for quite a while already) is to
   improve... the error message generation.

3. this is certainly something useful for the generation of ASTs.  But
   here too the feature is not symbol names, it's AST generation.
   Which needs AST node constructor names, not symbol names.

4. that the real feature really fits the constraints of being strictly
   an identifier.  It seems obvious to me that the quality of an internal
   identifier and what is shown to the user are distinct.  Fusing them
   is wrong.  That's why tokens in Bison have both an identifier and
   a "string name" meant for the user.
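
To illustrate the last point, a minimal sketch in C of keeping the two
names apart.  The struct and the table are hypothetical and are not
Bison's generated tables; falling back to the identifier when no string
name was declared is this sketch's choice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record: the identifier used in the grammar and the
   optional "string name" meant for the user.  */
struct token_info
{
  const char *id;       /* e.g. ID in: %token <string> ID "identifier" */
  const char *display;  /* e.g. "identifier"; NULL if none was declared */
};

static const struct token_info tokens[] = {
  { "ID", "identifier" },
  { "BREAK", "break" },
  { "NUM", NULL },
};

/* Name to show in a message: the string name when there is one,
   otherwise the identifier.  */
static const char *
user_name (const struct token_info *t)
{
  assert (t != NULL);
  return t->display ? t->display : t->id;
}
```

The grammar and the actions keep using ID, while messages only ever show
what user_name returns.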



