From: Paul Eggert
Subject: bug#37659: rx additions: anychar, unmatchable, unordered-or
Date: Wed, 23 Oct 2019 16:14:45 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.1.1
On 10/23/19 2:15 AM, Mattias Engdegård wrote:
> how do we make it easy to match one of multiple strings --- keywords, say --- in rx?
If that's the real problem, perhaps the name should be "or-tokens" or something like that, to help remind the reader of the limitations of the proposed operator: it's meant only for greedy tokenization and it isn't suited for regular expressions in general. A problem with the name "or-max" is that it implies a more-general functionality than the implementation really has.
What happens if you apply or-tokens to arguments that aren't strings or other or-tokens? Does rx diagnose this? I hope it does.
> We could say that 'or' and \| either match greedily or in left-to-right order. However, I'm not sure this solves any problem right now.
I was thinking of something more-compatible: we could say that \| is left-to-right (for users who need compatibility with regexp "|"), and that 'or' is not necessarily left-to-right (to make room for future extensions that make 'or' greedy, or more efficient, or both).
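For readers who want to see the distinction concretely, here is a minimal sketch of left-to-right versus longest-match alternation over literal strings. It uses Python's `re` module rather than Emacs regexps or rx, only because Python's `|` is also tried left to right; the helper names `or_left_to_right` and `or_longest` are hypothetical and are not part of rx or of any proposal in this thread. Sorting by length is one simple way to get a longest match for plain strings, not necessarily what an rx implementation would do.

```python
import re


def or_left_to_right(*alternatives):
    """Join literal strings with '|' in the order given (first match wins)."""
    return "|".join(re.escape(s) for s in alternatives)


def or_longest(*alternatives):
    """Join literal strings with the longest first, so the longest match wins.

    This only works for literal strings: if two alternatives both match
    at the same position, the longer one is always tried first.
    """
    ordered = sorted(alternatives, key=len, reverse=True)
    return "|".join(re.escape(s) for s in ordered)


keywords = ("if", "ifdef", "in", "int")
text = "ifdef"

# Left-to-right: "if" is listed before "ifdef", so it wins.
first = re.match(or_left_to_right(*keywords), text).group(0)

# Longest-first: "ifdef" is tried before "if".
longest = re.match(or_longest(*keywords), text).group(0)

print(first)    # if
print(longest)  # ifdef
```

The reordering trick depends on every argument being a literal string, which is the limitation that a name like "or-tokens" would be meant to signal.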