Re: Egrep Version 2.3
From: Hans-Bernhard Broeker
Subject: Re: Egrep Version 2.3
Date: 27 Jun 2001 12:14:04 GMT
address@hidden <address@hidden> wrote:
> egrep -i -e -f test.txt test1.htm matches, but for test2.htm does not (line
> break before closing '>')
That's expected behaviour. "grep" and its brethren never consider a
match that spans more than a single line of input text.
The breakage in your particular example happens here:
> ([^>]*> a close bracket (as long as it's not nested)
In grep, [^>] never gets to match a line break, because each input line
is matched on its own, without its trailing newline.
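
To make the difference concrete, here is a rough sketch in Python (not
grep itself). The sample text is a made-up stand-in for test2.htm, and
the names grep_like and whole_buffer are purely illustrative. It applies
the same bracket expression once line by line, the way grep works, and
once to the whole input:

```python
import re

# Hypothetical stand-in for test2.htm: a line break before the closing '>'.
text = '<a href="x"\n>link</a>\n'

# Same idea as the pattern under discussion: '<', anything but '>', then '>'.
tag = re.compile(r"<[^>]*>")

def grep_like(pattern, data):
    # Match each line separately; the pattern never sees a newline.
    found = []
    for line in data.splitlines():
        found.extend(m.group(0) for m in pattern.finditer(line))
    return found

def whole_buffer(pattern, data):
    # Match against the entire input, so [^>] is free to consume a newline.
    return [m.group(0) for m in pattern.finditer(data)]

print(grep_like(tag, text))      # the tag split across lines is missed
print(whole_buffer(tag, text))   # the split tag is found
```

Only the second call reports the opening tag, because only there does
the bracket expression get a chance to run across the line break.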
> I was trying to see if Regular expressions could do syntax parsing without
> becoming a mile long.
They can, but grep can't, because it operates on one line at a time.
> For hairy parsing like this do you recommend YACC? Or does sed have
> extensions to handle it??
In sed, you'd have to fiddle with the hold buffer to eliminate the line
breaks.
The tool you really should be looking at is neither yacc/bison, nor
sed, nor {e|f|}grep: it's (f)lex. In flex scanners, a [^>] does,
indeed, match a \n.
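
As a loose illustration of what such a scanner does (in Python rather
than flex, so treat it as an analogy only): the rule names TAG, TEXT and
CHAR and the function scan are made up for this sketch. Unlike flex,
which picks the longest match among all rules, this version simply takes
the first rule that matches; for these two rules that makes no
difference, since they can never match at the same position.

```python
import re

# Two "rules" tried at the current position, loosely like flex rules.
RULES = [
    ("TAG", re.compile(r"<[^>]*>")),   # [^>] consumes newlines as well
    ("TEXT", re.compile(r"[^<]+")),    # everything up to the next '<'
]

def scan(data):
    # Walk the whole input as one buffer, not line by line.
    pos, tokens = 0, []
    while pos < len(data):
        for name, rx in RULES:
            m = rx.match(data, pos)
            if m:
                tokens.append((name, m.group(0)))
                pos = m.end()
                break
        else:
            # No rule matched (an unterminated '<'): emit one character.
            tokens.append(("CHAR", data[pos]))
            pos += 1
    return tokens

for token in scan('<b\n>hi</b>'):
    print(token)
```

The first token comes out as a single TAG even though it contains a
line break, which is the behaviour grep cannot give you.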
--
Hans-Bernhard Broeker (address@hidden)
Even if all the snow were burnt, ashes would remain.