Re: OT: LALR(1)

help-bison

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: OT: LALR(1)

From:	Russell Lewis
Subject:	Re: OT: LALR(1)
Date:	Wed, 14 Aug 2002 08:14:15 -0700
User-agent:	Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.9) Gecko/20020408

Magnus Ekhall wrote:

Hi!

Aslightly off topic short question:
What is the abbreviation LALR(1) really short for?

/Magnus


_______________________________________________
address@hidden http://mail.gnu.org/mailman/listinfo/help-bison

Look Ahead Left Right parser, with (1) token of look ahead.

Now, to explain what that means...somebody sent me a link a few weeksback to an excellent online book (in PDF, IIRC) that explained it allwell, but I don't have the link accessible right at the moment.

Basically, my general understanding is that LR parser parses the tokensfrom left to right. It has to parse a set of tokens, and thenimmediately replace the sequence of tokens with some other, higher-leveltoken. Then you start over from the left side and re-parse. Youcontinue this process until you have reduced the entire sequence down toa single end-state token. So you get rules like this:


   a => b c
   b => d e
   c => f g

   If you start with this token sequence:
       d e f g
   it gets parsed like this:
       b f g    ( b => d e )
       b c       ( c => f g )
       a          ( a => b c )
   and so you've parsed the sequence.

The problem you get with LR parsers is that you can't make choices basedon any tokens that follow AFTER your current token. So a grammar likethis can't be parsed by an LR parser:

   a => b c
   a => d e
   b => f g
   c => h i
   d => f g
   e => j k

   Take the example string:
       f g j k

The parser reads the 'f' and 'g' tokens. At that time, it musteither reduce them down to a 'b' or down to a 'd', but it can't decidewhich is right. Thus, an LR parser is stuck.

However, an LALR(1) parser is allowed to look ahead at 1 additionaltoken and make a decision based on that. Thus, it will be able to parsethe string above. It looks at the 'j' token and decides that it has tointerpret the 'f g' string must be interpreted as 'd', not as 'b'. Soit makes the conversion:

   f g j k
   d j k ( d => f g )
   d e    ( e => j k )
   a       ( a => d e )

Now, the trouble with LALR(1) parsers is that they have ONLY one tokenof lookahead. So while they are a lot more flexible than LR, they stillhave troubles.

P.S. Bison doesn't actually "start over from the beginning" afterreducing a string of tokens into a single token. If you look at themechanics of how parsing happens (a DFA, beyond the scope of thisemail), you will find that reparsing will follow the exact pattern you'dfollowed before, up to a certain point. Thus, bison keeps a stack ofstates that's it's been in and simply backs up a bit after it does atoken conversion.

[Prev in Thread]

Current Thread

[Next in Thread]

OT: LALR(1), Magnus Ekhall, 2002/08/14
- Re: OT: LALR(1), Russell Lewis <=
- Re: OT: LALR(1), Hans Aberg, 2002/08/14

Prev by Date: OT: LALR(1)
Next by Date: Re: OT: LALR(1)
Previous by thread: OT: LALR(1)
Next by thread: Re: OT: LALR(1)
Index(es):
- Date
- Thread