help-gnu-emacs
Re: I need help with a regular expression


From: Xah Lee
Subject: Re: I need help with a regular expression
Date: Thu, 6 May 2010 03:17:16 -0700 (PDT)
User-agent: G2/1.0

On May 5, 8:51 am, da...@adboyd.com (J. David Boyd) wrote:
> Cecil Westerhof <Ce...@decebal.nl> writes:
> > I have written some code to count the number of functions in a buffer.
> > At the moment I use the following regular expression for this:
> >     "^(defun "
>
> > This works fine, but then the defun's have to be on the start of the
> > line. This is the most logical, but it is better to be save as sorry.
> > This is why I wanted to write a more robust regular expression. I was
> > thinking about something like:
> >     "^[^;]+(defun "
>
> > But that does not work. It marks the following completely, instead of
> > the three at its own:
> >     (defun a () (message "a"))
> >     (defun b () (message "b"))
> >     (defun c () (message "c"))
>
> > Why is this? And how can I make a regular expression that does what I
> > want?
>
> There's a book that explains this, sorry but I can't remember the name
> of it, something to do, of course, with "Regular Expressions".
>
> The problem is that the expression you gave it, is, as the author
> explains, "hungry".
>
> It tries to match as much as possible, not as least as possible.
>
> In your case, it sees the '^' (start of line, then looks as far as
> possible for the 'defun'.  
>
> You did have a blank line after your last defun, right?  Otherwise, it
> would have kept on going.
>
> Go to O'Reilly, and hunt for books on regular expressions.  It was only
> a few 100 pages, good price, and explained a great deal.
>
> Good luck!

I read the first edition (1997) in 1999.
(See the Perl book reviews here:
http://xahlee.org/UnixResource_dir/perlr.html )

Last I looked, the 3rd edition (2006) had dropped its coverage of Emacs
regex.

In general, I don't recommend the book, unless you do regex research.
Regex is useful for matching simple words or phrases. When your need
for matching text patterns is even slightly more complicated than
phrases, regex quickly stops being useful.

I've also come across a page that heavily criticizes the book, citing
many errors, and showing another regex engine that's much faster.
(I haven't verified it or read it in depth.)

http://swtch.com/~rsc/regexp/regexp1.html

quote:
«Finally, any discussion of regular expressions would be incomplete
without mentioning Jeffrey Friedl's book Mastering Regular
Expressions, perhaps the most popular reference among today's
programmers. Friedl's book teaches programmers how best to use today's
regular expression implementations, but not how best to implement
them. What little text it devotes to implementation issues perpetuates
the widespread belief that recursive backtracking is the only way to
simulate an NFA. Friedl makes it clear that he neither understands nor
respects the underlying theory.
»

Also, today there are lots of tools for text pattern matching. One I
recommend is Parsing Expression Grammar (PEG). There are 2 Emacs
implementations (on emacswiki.org), but both are hard to use and lack
documentation. (The “regular expression” we know today, since Unix
grep of the 1990s or earlier, is derived by happenstance from
4-decade-old theory on parsing, based on the then so-called theories
of so-called automata.)

For your need, I just recommend reading the Emacs info pages on its
regex syntax in detail.

• Text Pattern Matching in Emacs (emacs regex tutorial)
  http://xahlee.org/emacs/emacs_regex.html

• Regular Expressions - GNU Emacs Lisp Reference Manual
  http://xahlee.org/elisp/Regular-Expressions.html
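
For the concrete question in this thread, here is a minimal sketch of what goes wrong and one possible fix. In Emacs regex a negated character class such as [^;] also matches a newline, so "^[^;]+" greedily runs across all three lines before the last "(defun ". Excluding the newline from the class (and using * so that a defun at the very start of a line still counts) keeps each match on one line. The names my-count-matches and my-sample are made up for this illustration; they are not from the original post.

```emacs-lisp
;; Hypothetical helper (not from the thread): count the matches of
;; REGEXP in STRING by searching through a temporary buffer.
;; Assumes REGEXP never matches the empty string.
(defun my-count-matches (regexp string)
  "Return the number of non-overlapping matches of REGEXP in STRING."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (let ((count 0))
      (while (re-search-forward regexp nil t)
        (setq count (1+ count)))
      count)))

;; The three one-line defuns from the question, as one string.
(defvar my-sample
  (concat "(defun a () (message \"a\"))\n"
          "(defun b () (message \"b\"))\n"
          "(defun c () (message \"c\"))\n"))

;; Pattern from the question: [^;] also matches newlines, so the
;; greedy + swallows the first two lines and only one match is found.
(my-count-matches "^[^;]+(defun " my-sample)    ; => 1

;; Newline excluded from the class: each match stays on its own line.
(my-count-matches "^[^;\n]*(defun " my-sample)  ; => 3
```

This is still only a rough count: a "(defun " inside a string or docstring is counted too, and a semicolon inside a string earlier on the same line hides a real defun.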

For some more opinions on regex, pattern matching, and parsing, see:

• Pattern Matching vs Lexical Grammar Specification
  http://xahlee.org/cmaci/notation/pattern_matching_vs_pattern_spec.html

  Xah
∑ http://xahlee.org/
