help-flex

RE: how to recognize a multiple line comment ?


From: Vincent Zweije
Subject: RE: how to recognize a multiple line comment ?
Date: Thu, 7 Apr 2005 14:20:52 +0200

Bruce Lilly wrote:

> On Thu April 7 2005 02:29, James Yu wrote:
> > Dear all,
> > 
> > I am trying to define a rule for recognizing a multiple line comment
> > in C, but I have not yet succeeded.
> > Thus, I am posting my lex segment to this email and hope you can
> > point out some mistakes for me.
> > 
> > 
> > == flex code segment ==
> > slash "/"
> > asterisk "*"
> > comment ({slash}{asterisk}+([^*]|[\n])*{asterisk}+{slash})
> > == flex code segment ==
> 
> Don't try to do too much in lexical analysis; some things are better
> handled in parsing under control of a grammar, where context is
> available.
> 
> For example, your pattern won't handle the legal C comment
> /* foo ***** bar */

That's exactly why he came here.  It didn't work.

It also sounds like a classical homework assignment.  ;-)
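
For reference, a sketch of what a working rule has to do.  The
pattern usually quoted for this is "/*"([^*]|\*+[^*/])*\*+"/":
the body may contain asterisks, as long as a run of them is not
followed by a slash.  The C function below is a hand-coded
equivalent of that pattern.  The name comment_length is made up
for illustration, and this is only one way to do it; in flex
itself an exclusive start condition is a common alternative.

```c
#include <stddef.h>

// Length of the C comment that starts at s, or 0 if s does not begin
// with a complete comment.  Hypothetical helper, for illustration only.
size_t comment_length(const char *s)
{
    if (s[0] != '/' || s[1] != '*')
        return 0;

    size_t i = 2;
    int after_star = 0;  // was the previous body character a '*'?

    while (s[i] != '\0') {
        if (after_star && s[i] == '/')
            return i + 1;            // "*/" closes the comment
        after_star = (s[i] == '*');  // runs of '*' keep this set
        i++;
    }
    return 0;  // end of input before "*/": unterminated
}
```

Called on "/* foo ***** bar */" this returns 19, the length of
the whole comment.  The pattern in the original post cannot match
that text, because its body excludes every asterisk.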

> and it will inappropriately match the text in the quoted string in
> char foo[] = "/*bar*/";

That depends on the rest of the token definitions (the string
token, to be precise).
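
A minimal sketch of why, assuming the scanner has a string rule
at all.  Input is consumed left to right, so once the opening
quote has started a string token, the /* inside it is never
offered to the comment rule.  The function name count_comments is
hypothetical, and the code ignores character constants and //
comments to stay short.

```c
#include <stddef.h>

// Counts comment tokens in src, scanning left to right.
// A string literal is consumed as a single token, so a "/*" inside it
// never reaches the comment branch.  An unterminated comment is still
// counted.  Simplified sketch: no character constants, no // comments.
int count_comments(const char *src)
{
    int count = 0;
    size_t i = 0;

    while (src[i] != '\0') {
        if (src[i] == '"') {
            i++;  // opening quote
            while (src[i] != '\0' && src[i] != '"') {
                if (src[i] == '\\' && src[i + 1] != '\0')
                    i++;  // skip the escaped character
                i++;
            }
            if (src[i] == '"')
                i++;  // closing quote
        } else if (src[i] == '/' && src[i + 1] == '*') {
            i += 2;
            while (src[i] != '\0' && !(src[i] == '*' && src[i + 1] == '/'))
                i++;
            if (src[i] != '\0')
                i += 2;  // closing "*/"
            count++;
        } else {
            i++;
        }
    }
    return count;
}
```

For the line char foo[] = "/*bar*/"; this finds no comment at
all.  Remove the string branch and the same input yields one.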

> A properly-designed grammar will also allow resolving conflicts by
> using precedence and associativity to handle complex cases

Opinions probably differ about that statement.

> int x, y, z, *px;
> char foo[3];
> 
> x = 2;
> px = &x;
> y = 8;
> z = y/*px;
> ...
> strncpy(foo, "*/", sizeof(foo));
> 
> i.e. you have the opportunity to decide whether * binds more tightly
> to / for comments than as a pointer indication.

You must be concluding that the C syntax is not properly
designed, because a C compiler will take the /*px ... "*/ as a
comment and reject the program as invalid.  You might have a
point there though.  :-)
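
To make that concrete, a small sketch that finds where the
comment in such a fragment begins and ends.  The helper
first_comment is invented for this illustration.  It does not
skip string literals that come before the /*, which is fine
here because the /* comes first.

```c
#include <stddef.h>
#include <string.h>

// Locates the first comment in src: "/*" opens it and the next "*/"
// closes it, no matter what lies in between.
// Returns 1 and fills *start and *len if a complete comment is found,
// otherwise returns 0.  Illustration only: string literals that
// precede the "/*" are not skipped.
int first_comment(const char *src, size_t *start, size_t *len)
{
    const char *open = strstr(src, "/*");
    if (open == NULL)
        return 0;

    const char *close = strstr(open + 2, "*/");
    if (close == NULL)
        return 0;

    *start = (size_t)(open - src);
    *len = (size_t)(close + 2 - open);
    return 1;
}
```

Applied to the two lines z = y/*px; and
strncpy(foo, "*/", sizeof(foo)); the comment starts right after
the y and ends inside the string literal, leaving
", sizeof(foo)); behind as an unterminated string.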

I would *definitely* not want to have to resolve this problem
using precedence.

However, you are mixing tokenisation (flex task) and parsing
(yacc/bison task).  They are based on different kinds of
language (regular versus context-free), and both have their
advantages and disadvantages.

Ciao.                                                   Vincent.


