bug#18577: Regexp I-search: [(error Stack overflow in regexp matcher)]
From: Stefan Kangas
Subject: bug#18577: Regexp I-search: [(error Stack overflow in regexp matcher)]
Date: Fri, 22 Oct 2021 19:47:53 -0700
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)
Stefan Monnier <monnier@IRO.UMontreal.CA> writes:
>>> Is this a defect in my regexp or in the regexp engine?
>> It is fundamental to the way regexp matching works.
>
> To clarify: it is fundamental to the way *our* regexp engine works.
>
> As long as the regexp doesn't use backrefs, it can be matched
> efficiently, without backtracking. Of course using \(..\) (as opposed
> to using \(?:..\)) can also make the problem harder since the various
> different (but largely equivalent) ways to match might need to be
> distinguishable via match-data.
>
> But even tho your regexp doesn't use backrefs, and even if you replace
> all \(..\) with \(?:..\), your regexp will still cause problems because
> our regexp engine does not try to optimize these kinds of cases.
>
> So you have to do it by hand.
>
>>> If the former, how could I rewrite the regexp so that it would not hit
>>> these problems?
>
> Maybe something like:
>
> /\*\(<insidecomment>\)*\*+/
>
> where <insidecomment> is something like
>
> [^'*]\|\*+\([^/'*]\|'<afterquote>\)\|'<afterquote>
>
> where <afterquote> is something like
>
> \([^'*]\|\*+[^/'*]\)*'
>
> Tho this will still push a backtrack point for every character.
> Maybe better would be something like
>
> /\*[^'*]*\(<insidecomment>\)*\*+/
>
> where <insidecomment> is something like
>
> \(\*+[^/'*]\|\**'<afterquote>\)[^'*]*
>
> where <afterquote> is still something like
>
> \([^'*]\|\*+[^/'*]\)*'
>
> so that we should only push a backtrack point when we see a * or a ' in
> the comment.
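
For reference, the second suggestion quoted above can be assembled into one
pattern from its pieces. The following is only a rough sketch, written in
Python rather than Emacs Lisp so it can be run standalone: the Emacs
\(?:..\) groups become (?:..), and the names AFTER_QUOTE, INSIDE_COMMENT and
COMMENT_RE are mine, not from the thread. Python's re module is also a
backtracking matcher with different limits, so this shows the structure of
the pattern and says nothing about stack use in the Emacs engine.

```python
import re

# <afterquote>: the rest of a quoted run, up to and including the closing '.
# A run of stars is accepted only when followed by something other than
# /, ' or *.
AFTER_QUOTE = r"(?:[^'*]|\*+[^/'*])*'"

# <insidecomment>: one "interesting" event (stars that do not end the
# comment, or a quoted run), followed by a stretch of ordinary characters.
INSIDE_COMMENT = r"(?:\*+[^/'*]|\**'" + AFTER_QUOTE + r")[^'*]*"

# Whole comment: opener, ordinary characters, any number of events,
# then the closing stars and slash.
COMMENT_RE = re.compile(r"/\*[^'*]*(?:" + INSIDE_COMMENT + r")*\*+/")

if __name__ == "__main__":
    m = COMMENT_RE.search("int x; /* a 'b' c */ int y;")
    print(m.group(0))  # prints: /* a 'b' c */
```

Ordinary characters are consumed by the [^'*]* loops, so the alternation is
only entered at a * or a ', which is the point of the second variant.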
Should we do anything about this, like document it in etc/PROBLEMS, or
should this bug just be closed?