help-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Stack overflow in regexp matcher


From: Dan Davison
Subject: Re: Stack overflow in regexp matcher
Date: Tue, 08 Feb 2011 22:58:15 +0000
User-agent: Gnus/5.110011 (No Gnus v0.11) Emacs/24.0.50 (darwin)

Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> The following fails with "Stack overflow in regexp matcher" in emacs 23
>> and 24:
>
>> (string-match
>>  "^\\[.+\\]$"
>>  (concat
>>   "["
>>   (mapconcat (lambda (i) "x") (number-sequence 1 33500) "")
>>   "]"))
>
>> This surprised me; I assumed that the ^ and $ anchors, and the simple
>> ".+" requirement in the middle would result in a simple, efficient
>> regexp.
>
> The problem is that Emacs's regexp engine is not very clever.
>
> Basically, each time . succeeds we first recurse, trying to match the
> rest against ".*\\]$", and if that fails then we try to match
> "\\]$" instead.  The fact that at each step we have this "fork" of two
> different possible ways to match, means that at each step we have to
> push something on the stack, so the longer the string against which we
> match, the deeper the stack we need.
>
> You can fix it by using a regexp that does not need this backtracking.
> E.g. "^\\[[^]\n]+\\]$".  If you do that, then Emacs's regexp engine will
> notice that when [^]\n] matches, then "\\]$" can't match, so there's no
> point pushing something on the stack to try the second path when the
> first fails.  I.e. it will do the whole match with a constant amount of
> stack space.
>
> This relates also to the recent discussion with Ilya in the thread
> titled "will we ever have zero width assertions in regexps?".

Hi Stefan,

Thanks for the explanation.

Dan

>
>
>         Stefan



reply via email to

[Prev in Thread] Current Thread [Next in Thread]