bug-gnu-utils

sed: extend documentation of extended regular expressions by '|'


From: Gerald Pfeifer
Subject: sed: extend documentation of extended regular expressions by '|'
Date: Sun, 31 Oct 2010 22:52:39 +0100 (CET)
User-agent: Alpine 2.00 (LNX 1167 2008-08-23)

Hi Paolo,

the info page of GNU sed 4.1.5 states

   The only difference between basic and extended regular expressions
   is in the behavior of a few characters: `?', `+', parentheses, and
   braces (`{}').  While basic regular expressions require these to be
   escaped if you want them to behave as special characters, when using
   extended regular expressions you must escape them if you want them _to
   match a literal character_.

I believe this is incomplete.  Specifically, looking at

  % echo 'abac' | sed   -e 's#a|b#x#g'
  abac
  % echo 'abac' | sed  -re 's#a|b#x#g'
  xxxc

or

  % echo 'aba|b' | sed   -e 's#a|b#x#g'
  abx
  % echo 'aba|b' | sed  -re 's#a|b#x#g'
  xxx|x

it appears that basic regular expressions treat '|' as a literal
character, whereas extended regular expressions treat '|' as the
alternation operator, so '|' should be included in that documentation
as well.  It is a bit trickier than that, though, since "\|" in a basic
regular expression is itself a GNU extension. ;-)
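
To illustrate the escaped form as well, here is a small sketch.  It
assumes GNU sed; other sed implementations may not treat "\|" in a basic
regular expression as alternation, and the variable names are only there
for illustration.

```shell
# Assumption: GNU sed. "\|" in a basic regular expression is a GNU extension.

# Basic regexp, escaped bar: GNU sed treats "\|" as alternation.
bre_escaped=$(echo 'abac' | sed -e 's#a\|b#x#g')
echo "$bre_escaped"    # xxxc with GNU sed

# Extended regexp, escaped bar: "\|" matches a literal '|'.
ere_escaped=$(echo 'aba|b' | sed -re 's#a\|b#x#g')
echo "$ere_escaped"    # abx with GNU sed
```

So the escaped and unescaped forms swap roles between the two modes,
just as they do for `?', `+', parentheses, and braces.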

How about the patch below?

Gerald

2010-10-31  Gerald Pfeifer  <address@hidden>

        * doc/sed-in.texi (Extended regexps): Add '|' to the list of
        differences.  Note that "\|" is a GNU extension to begin with.

 
diff --git a/doc/sed-in.texi b/doc/sed-in.texi
index 00874ba..494e832 100644
--- a/doc/sed-in.texi
+++ b/doc/sed-in.texi
@@ -2824,10 +2824,12 @@ the @env{LC_COLLATE} and @env{LC_CTYPE} environment variables to @samp{C}.
 
 The only difference between basic and extended regular expressions is in
 the behavior of a few characters: @samp{?}, @samp{+}, parentheses,
-and braces (@samp{@{@}}).  While basic regular expressions require
-these to be escaped if you want them to behave as special characters,
-when using extended regular expressions you must escape them if
-you want them @emph{to match a literal character}.
+braces (@samp{@{@}}), and @samp{|}.  While basic regular expressions
+require these to be escaped if you want them to behave as special
+characters, when using extended regular expressions you must escape
+them if you want them @emph{to match a literal character}.  @samp{|}
+is special here because @samp{\|} is a GNU extension -- standard
+basic regular expressions do not provide its functionality.
 
 @noindent
 Examples:


