bug-gnu-utils

Re: [grep 2.5.1] odd output for some patterns combining backreference, a


From: Alex Lamey
Subject: Re: [grep 2.5.1] odd output for some patterns combining backreference, alternation, and repetition
Date: Tue, 2 Oct 2007 15:01:25 -0400

Hi Ralf,
Yes, that's right. I neglected to cc the list when replying to Bauke, so see below. The problem does appear to be platform-specific.
alex

I'm running Mac OS X/Darwin. I've confirmed that I get the same output on two different Macs (G5 and Intel), but other Unix-based platforms don't seem to exhibit the error. I've also reported the bug at http://savannah.gnu.org and http://developer.apple.com/bugreporter.
alex

On 2-Oct-07, at 1:22 PM, Bauke Jan Douma wrote:


Alex Lamey wrote on 02-10-07 09:41:

Not a bug -- assuming the ! is a visualization of your surprise.

Yes, ! is an error marker, as I wrote.

The [0-9] means just one digit in that range.  So each of these
two constructs matches at most just one digit. Hence \1 can only be
a reference to at most one digit.  Same for \2.  And hence the same
for \1|\2.

That's what I expected. (\1|\2) should match exactly 1 digit. In my examples, just the digit '8'.
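
As a rough cross-check of that expectation, here is a minimal sketch using Python's `re` module rather than egrep. It is a different regex engine, so it only illustrates the intended semantics and says nothing about what grep 2.5.1 does on Darwin. The helper name `matches` and the sample strings other than 88-88 are made up for illustration.

```python
import re

# The pattern from the thread with the repetition count removed:
# two captured digits, a dash, then one digit equal to either capture.
PATTERN = re.compile(r'([0-9])([0-9])-(\1|\2)')

def matches(text):
    """Return True if the whole string matches the pattern."""
    return PATTERN.fullmatch(text) is not None

for sample in ['12-1', '12-2', '12-3', '88-8', '88-88']:
    print(sample, matches(sample))
```

Under these semantics the group (\1|\2) consumes exactly one digit, and that digit has to equal one of the two captured digits.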

Therefore {1} gives you just one digit on the right side of the dash, {2} gives you 2, and anything else just doesn't match, because you're asking for too many: {n} means /exactly/ n matches.

So why does (\1|\2){4} match just one '8'? Why does (\1|\2){6} match two, etc.?


It doesn't output that here.
I am running grep 2.5.1 also.

Are you sure that's your output??

bjd





On 2-Oct-07, at 2:42 PM, Ralf Wildenhues wrote:

Hello Bauke, Alex,

* Bauke Jan Douma wrote on Tue, Oct 02, 2007 at 12:10:20AM CEST:
Alex Lamey wrote on 30-09-07 22:03:

$ echo 88-88 | egrep -o '([0-9])([0-9])-(\1|\2){3}'
88-88
    !
[...]
Not a bug -- assuming the ! is a visualization of your surprise.

The [0-9] means just one digit in that range.  So each of these
two constructs matches at most just one digit.  Hence \1 can only be
a reference to at most one digit.  Same for \2.  And hence the same
for \1|\2.
Therefore {1} gives you just one digit on the right side of the dash,
{2} gives you 2, and anything else just doesn't match, because you're
asking for too many: {n} means /exactly/ n matches.

Maybe I'm being dense here, but the above example (just like the others
that Alex annotated) asks for three digits after the hyphen, but egrep
actually matches with two.  Looks like a bug to me, no?

FWIW, I can't reproduce the above with version 2.5.1 on GNU/Linux.
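
For comparison, a small sketch that runs the same pattern through Python's `re` module for repetition counts 1 to 6. This is only an independent reference for what {n} should do with the input 88-88; it is not egrep, and the function name `first_match` is hypothetical.

```python
import re

def first_match(n, text='88-88'):
    """Return the first substring matched with (\\1|\\2) repeated n times, or None."""
    pattern = r'([0-9])([0-9])-(\1|\2){%d}' % n
    m = re.search(pattern, text)
    return m.group(0) if m else None

for n in range(1, 7):
    print(n, first_match(n))
```

With this engine only {1} and {2} find anything in 88-88, since there are just two digits after the hyphen; {3} and higher return no match, which is the behaviour reported on GNU/Linux in this thread.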

Cheers,
Ralf




