bug-gnu-emacs

bug#37659: rx additions: anychar, unmatchable, unordered-or


From: Mattias Engdegård
Subject: bug#37659: rx additions: anychar, unmatchable, unordered-or
Date: Tue, 22 Oct 2019 17:14:08 +0200

'regexp-opt' always generates a regexp that prefers long matches. This is
undocumented, but useful enough that I would be surprised if the property
weren't already exploited (perhaps unknowingly) by callers. It's quite
natural: given a set of strings, surely the caller wants them all to be
candidates for a match, even if there is no following anchoring pattern.
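
To make the difference concrete, here is a minimal sketch comparing a hand-written alternation with the output of 'regexp-opt' on the same two strings. The helper 'my-first-match' is a hypothetical name used only for this illustration.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-first-match (regexp string)
  "Return the text matched by REGEXP in STRING, or nil if no match."
  (when (string-match regexp string)
    (match-string 0 string)))

;; Hand-written alternation: alternatives are tried left to right,
;; so "a" wins even though "ab" would also match.
(my-first-match "a\\|ab" "ab")                   ; => "a"

;; regexp-opt on the same strings: the longer candidate is matched.
(my-first-match (regexp-opt '("a" "ab")) "ab")   ; => "ab"
```

Neither pattern is followed by an anchor, so the difference comes purely from how the alternatives are arranged.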

Thus, instead of 'unordered-or', define the operator in terms of long matches:
'or-max' (working name) would work like 'or' but guarantee a longest match,
and only permit strings and 'or-max' forms as arguments. That way the rx user
gets all the benefits of 'regexp-opt' in a composable way, without needing to
sort the strings or otherwise prepare them.
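
For comparison, this is roughly the kind of preparation a caller has to do by hand today to get a longest match from a plain alternation. It is only a sketch; 'my-longest-first-regexp' is a hypothetical name, not an existing function, and it is not how 'or-max' would be implemented.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-longest-first-regexp (strings)
  "Return a regexp matching any of STRINGS, trying longer ones first."
  (mapconcat #'regexp-quote
             ;; Copy first: `sort' may modify its argument.
             (sort (copy-sequence strings)
                   (lambda (a b) (> (length a) (length b))))
             "\\|"))

(my-longest-first-regexp '("a" "ab"))   ; => "ab\\|a"
```

Since the alternatives are tried in order, putting the longer strings first means the first alternative that matches at a given position is also the longest one.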

(The old 'or' behaviour always used 'regexp-opt' when possible, which was very 
fragile: (or "a" "ab") would match "ab", but (or "a" "ab" digit) would just 
match "a". 'or-max' is robust, without surprises.)

Of course, we should also guarantee the maximum-matching property of
regexp-opt. This is just a matter of documentation (and tests); it does not
restrict optimisations as far as I can tell.

Again, I'm open to suggestions about a better name than 'or-max'.

The other patches (anychar, unmatchable, and [^z-a]) have been pushed to master.





