|
From: | Paul Eggert |
Subject: | Re: Char-folding: how can we implement matching multiple characters as a single "thing"? |
Date: | Mon, 30 Nov 2015 08:12:50 -0800 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.1.0 |
On 11/30/2015 07:54 AM, Artur Malabarba wrote:
Does anyone have alternative ideas?
Sure, scan the pattern greedily for possible sequences, left-to-right. In your example "fix" should expand to the regexp "\\([f][i]\\|fi\\)x" (where the "fi" is the ligature character), because once the "fi" is found, the scanner won't look for "ix" as a single character. This should cause the regexp to grow only polynomially rather than exponentially. The polynomial version won't match as many strings as the exponential version, but in practice it should be good enough.
[Prev in Thread] | Current Thread | [Next in Thread] |