emacs-devel

Make regexp handling more regular


From: Lars Ingebrigtsen
Subject: Make regexp handling more regular
Date: Wed, 02 Dec 2020 10:05:25 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

Today's idle shower thought:

A constant source of confusion and subtle bugs is the way Emacs does
regexp match handling: `string-match' (and the rest) set global state,
and you sort of have to catch the match data "early", which is often a
challenge for new users.

Experienced Emacs Lisp programmers know to be safe and will say:

(when (string-match "[a-z]" string)
  (let ((match (match-string 0 string)))
    (foo)
    (bar match)))

while people new to Emacs Lisp will expect this to work:

(when (string-match "[a-z]" string)
  (foo)
  (bar (match-string 0 string)))

And sometimes it does, and sometimes it doesn't, depending on whether
`foo' also messes with the match data.
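
To make the failure concrete, here is a minimal sketch.  The names
`my-foo', `my-unsafe' and `my-safe' are made up for illustration;
`save-match-data' is the existing macro that restores the match data
after its body has run.

```elisp
;; Hypothetical stand-in for `foo': it runs its own regexp match and
;; so overwrites the global match data (match is at 4..5 in "abc 1").
(defun my-foo ()
  (string-match "[0-9]" "abc 1"))

;; What a newcomer might write: the match data read by `match-string'
;; is the one left behind by `my-foo', not by our own `string-match'.
(defun my-unsafe (string)
  (when (string-match "[a-z]" string)
    (my-foo)
    (match-string 0 string)))

;; Same code, but the call that may clobber the match data is wrapped
;; in `save-match-data'.
(defun my-safe (string)
  (when (string-match "[a-z]" string)
    (save-match-data (my-foo))
    (match-string 0 string)))

(my-unsafe "hello") ; => "o", positions 4..5 applied to "hello"
(my-safe "hello")   ; => "h"
```

Wrapping every call in `save-match-data' works, but it relies on the
caller knowing which functions touch the match data.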

So my idle shower thought for the day is: Is there any reasonable path
forward that the Emacs Lisp language could take here?

Well, we obviously can't alter functions like `string-match' and
`re-search-forward' -- they have well-defined semantics, and we can't
make them return a match object.  But we could make a new set of
functions that are more, er, functional.

Naming is, of course, the most difficult problem here.  I wondered
whether the namespace would allow us to just add -p to the functions,
but names like `string-match-p' are already taken for variations on the
non-p functions.

In any case, if we happen upon a naming convention that's good, the new
interface for these functions would then be to return a "match object",
that can then be used for looking at details of the match.  I.e.,

(when (setq match (rx-string-match "[a-z]" string))
  (foo)
  (bar (match match 0)))
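
A rough sketch of how such a function could be layered on top of what
exists today.  Everything prefixed `my-' is hypothetical, including the
struct layout; this only illustrates the idea and is not a proposed
implementation.

```elisp
(require 'cl-lib)

;; Hypothetical match object: the match data plus the string it
;; refers to.
(cl-defstruct (my-match (:constructor my-match--create))
  data    ; list as returned by `match-data'
  source) ; the string that was searched

(defun my-rx-string-match (regexp string &optional start)
  "Like `string-match', but return a match object or nil.
The global match data is left as it was before the call."
  (save-match-data
    (when (string-match regexp string start)
      (my-match--create :data (match-data) :source string))))

(defun my-match-string (match n)
  "Return the text of group N in MATCH, or nil if N did not match."
  (let ((beg (nth (* 2 n) (my-match-data match)))
        (end (nth (1+ (* 2 n)) (my-match-data match))))
    (and beg end (substring (my-match-source match) beg end))))

;; A later regexp match no longer affects the result.
(let ((match (my-rx-string-match "[a-z]+" "123 abc")))
  (string-match "x" "x")
  (my-match-string match 0)) ; => "abc"
```

Whether the new functions should also leave the global match data
untouched, as this sketch does, is a design choice.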

The match object would know what it had matched, too.  The following
code is an error, because `match-string' looks at the current buffer,
which by then is the temporary one:

(when (re-search-forward "p[a-z]+" nil t)
  (with-temp-buffer
    (insert (match-string 0))
    (buffer-string)))

But the following would work:

(when (setq match (rx-search-forward "p[a-z]+" nil t))
  (with-temp-buffer
    (insert (match match 0))
    (buffer-string)))
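
The buffer case could be sketched along the same lines, with the match
object remembering which buffer was searched.  Again, all the `my-'
names are hypothetical, and storing plain integer positions is an
assumption of this sketch; they go stale if the buffer is edited
after the search.

```elisp
(require 'cl-lib)

;; Hypothetical match object for buffer searches.
(cl-defstruct (my-bufmatch (:constructor my-bufmatch--create))
  data    ; list as returned by (match-data t)
  buffer) ; the buffer that was searched

(defun my-rx-search-forward (regexp &optional bound noerror count)
  "Like `re-search-forward', but return a match object or nil.
Point moves as usual; the global match data is restored."
  (save-match-data
    (when (re-search-forward regexp bound noerror count)
      (my-bufmatch--create :data (match-data t)
                           :buffer (current-buffer)))))

(defun my-bufmatch-string (match n)
  "Return the text of group N in MATCH, or nil if N did not match."
  (let ((beg (nth (* 2 n) (my-bufmatch-data match)))
        (end (nth (1+ (* 2 n)) (my-bufmatch-data match))))
    (when (and beg end)
      ;; Read from the buffer that was searched, not the current one.
      (with-current-buffer (my-bufmatch-buffer match)
        (buffer-substring-no-properties beg end)))))

(with-temp-buffer
  (insert "apple pie")
  (goto-char (point-min))
  (let ((match (my-rx-search-forward "p[a-z]+" nil t)))
    (when match
      (with-temp-buffer
        (insert (my-bufmatch-string match 0))
        (buffer-string))))) ; => "pple"
```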

And the same for functions working on strings, of course.  And
equivalent forms for match-beginning/-end.  And we could finally get rid
of the confusingly-named `match-string' function.

There's nothing but upsides, people!

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no



