
Changes to grep/manual/html_node/Basic-vs-Extended.html,v


From: Jim Meyering
Subject: Changes to grep/manual/html_node/Basic-vs-Extended.html,v
Date: Sat, 3 Sep 2022 15:33:15 -0400 (EDT)

CVSROOT:        /webcvs/grep
Module name:    grep
Changes by:     Jim Meyering <meyering> 22/09/03 15:33:15

Index: html_node/Basic-vs-Extended.html
===================================================================
RCS file: /webcvs/grep/grep/manual/html_node/Basic-vs-Extended.html,v
retrieving revision 1.32
retrieving revision 1.33
diff -u -b -r1.32 -r1.33
--- html_node/Basic-vs-Extended.html    14 Aug 2021 20:46:40 -0000      1.32
+++ html_node/Basic-vs-Extended.html    3 Sep 2022 19:33:14 -0000       1.33
@@ -5,7 +5,7 @@
 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
 <!-- This manual is for grep, a pattern matching engine.
 
-Copyright (C) 1999-2002, 2005, 2008-2021 Free Software Foundation,
+Copyright (C) 1999-2002, 2005, 2008-2022 Free Software Foundation,
 Inc.
 
 Permission is granted to copy, distribute and/or modify this document
@@ -14,10 +14,10 @@
 Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
 Texts.  A copy of the license is included in the section entitled
 "GNU Free Documentation License". -->
-<title>Basic vs Extended (GNU Grep 3.7)</title>
+<title>Basic vs Extended (GNU Grep 3.8)</title>
 
-<meta name="description" content="Basic vs Extended (GNU Grep 3.7)">
-<meta name="keywords" content="Basic vs Extended (GNU Grep 3.7)">
+<meta name="description" content="Basic vs Extended (GNU Grep 3.8)">
+<meta name="keywords" content="Basic vs Extended (GNU Grep 3.8)">
 <meta name="resource-type" content="document">
 <meta name="distribution" content="global">
 <meta name="Generator" content="makeinfo">
@@ -27,7 +27,7 @@
 <link href="Index.html" rel="index" title="Index">
 <link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
 <link href="Regular-Expressions.html" rel="up" title="Regular Expressions">
-<link href="Character-Encoding.html" rel="next" title="Character Encoding">
+<link href="Problematic-Expressions.html" rel="next" title="Problematic 
Expressions">
 <link href="Back_002dreferences-and-Subexpressions.html" rel="prev" 
title="Back-references and Subexpressions">
 <style type="text/css">
 <!--
@@ -57,51 +57,37 @@
 <div class="section" id="Basic-vs-Extended">
 <div class="header">
 <p>
-Next: <a href="Character-Encoding.html" accesskey="n" rel="next">Character 
Encoding</a>, Previous: <a href="Back_002dreferences-and-Subexpressions.html" 
accesskey="p" rel="prev">Back-references and Subexpressions</a>, Up: <a 
href="Regular-Expressions.html" accesskey="u" rel="up">Regular Expressions</a> 
&nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>][<a href="Index.html" title="Index" 
rel="index">Index</a>]</p>
+Next: <a href="Problematic-Expressions.html" accesskey="n" 
rel="next">Problematic Regular Expressions</a>, Previous: <a 
href="Back_002dreferences-and-Subexpressions.html" accesskey="p" 
rel="prev">Back-references and Subexpressions</a>, Up: <a 
href="Regular-Expressions.html" accesskey="u" rel="up">Regular Expressions</a> 
&nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" 
rel="contents">Contents</a>][<a href="Index.html" title="Index" 
rel="index">Index</a>]</p>
 </div>
 <hr>
 <span id="Basic-vs-Extended-Regular-Expressions"></span><h3 
class="section">3.6 Basic vs Extended Regular Expressions</h3>
 <span id="index-basic-regular-expressions"></span>
 
-<p>In basic regular expressions the characters &lsquo;<samp>?</samp>&rsquo;, 
&lsquo;<samp>+</samp>&rsquo;,
+<p>Basic regular expressions differ from extended regular expressions
+in the following ways:
+</p>
+<ul>
+<li> The characters &lsquo;<samp>?</samp>&rsquo;, &lsquo;<samp>+</samp>&rsquo;,
 &lsquo;<samp>{</samp>&rsquo;, &lsquo;<samp>|</samp>&rsquo;, 
&lsquo;<samp>(</samp>&rsquo;, and &lsquo;<samp>)</samp>&rsquo; lose their 
special meaning;
 instead use the backslashed versions &lsquo;<samp>\?</samp>&rsquo;, 
&lsquo;<samp>\+</samp>&rsquo;, &lsquo;<samp>\{</samp>&rsquo;,
 &lsquo;<samp>\|</samp>&rsquo;, &lsquo;<samp>\(</samp>&rsquo;, and 
&lsquo;<samp>\)</samp>&rsquo;.  Also, a backslash is needed
-before an interval expression&rsquo;s closing &lsquo;<samp>}</samp>&rsquo;, 
and an unmatched
-<code>\)</code> is invalid.
-</p>
-<p>Portable scripts should avoid the following constructs, as
-POSIX says they produce undefined results:
-</p>
-<ul>
-<li> Extended regular expressions that use back-references.
-</li><li> Basic regular expressions that use &lsquo;<samp>\?</samp>&rsquo;, 
&lsquo;<samp>\+</samp>&rsquo;, or &lsquo;<samp>\|</samp>&rsquo;.
-</li><li> Empty parenthesized regular expressions like 
&lsquo;<samp>()</samp>&rsquo;.
-</li><li> Empty alternatives (as in, e.g, &lsquo;<samp>a|</samp>&rsquo;).
-</li><li> Repetition operators that immediately follow empty expressions,
-unescaped &lsquo;<samp>$</samp>&rsquo;, or other repetition operators.
-</li><li> A backslash escaping an ordinary character (e.g., 
&lsquo;<samp>\S</samp>&rsquo;),
-unless it is a back-reference.
-</li><li> An unescaped &lsquo;<samp>[</samp>&rsquo; that is not part of a 
bracket expression.
-</li><li> In extended regular expressions, an unescaped 
&lsquo;<samp>{</samp>&rsquo; that is not
-part of an interval expression.
+before an interval expression&rsquo;s closing &lsquo;<samp>}</samp>&rsquo;.
+
+</li><li> An unmatched &lsquo;<samp>\)</samp>&rsquo; is invalid.
+
+</li><li> If an unescaped &lsquo;<samp>^</samp>&rsquo; appears neither first, 
nor directly after
+&lsquo;<samp>\(</samp>&rsquo; or &lsquo;<samp>\|</samp>&rsquo;, it is treated 
like an ordinary character and
+is not an anchor.
+
+</li><li> If an unescaped &lsquo;<samp>$</samp>&rsquo; appears neither last, 
nor directly before
+&lsquo;<samp>\|</samp>&rsquo; or &lsquo;<samp>\)</samp>&rsquo;, it is treated 
like an ordinary character and
+is not an anchor.
+
+</li><li> If an unescaped &lsquo;<samp>*</samp>&rsquo; appears first, or 
appears directly after
+&lsquo;<samp>\(</samp>&rsquo; or &lsquo;<samp>\|</samp>&rsquo; or anchoring 
&lsquo;<samp>^</samp>&rsquo;, it is treated like an
+ordinary character and is not a repetition operator.
 </li></ul>
 
-<span id="index-interval-expressions-1"></span>
-<p>Traditional <code>egrep</code> did not support interval expressions and
-some <code>egrep</code> implementations use &lsquo;<samp>\{</samp>&rsquo; and 
&lsquo;<samp>\}</samp>&rsquo; instead, so
-portable scripts should avoid interval expressions in 
&lsquo;<samp>grep&nbsp;-E</samp>&rsquo; patterns
-and should use &lsquo;<samp>[{]</samp>&rsquo; to match a literal 
&lsquo;<samp>{</samp>&rsquo;.
-</p>
-<p>GNU <code>grep&nbsp;-E</code> attempts to support traditional usage by
-assuming that &lsquo;<samp>{</samp>&rsquo; is not special if it would be the 
start of an
-invalid interval expression.
-For example, the command
-&lsquo;<samp>grep&nbsp;-E&nbsp;'{1'</samp>&rsquo; searches for the 
two-character string &lsquo;<samp>{1</samp>&rsquo;
-instead of reporting a syntax error in the regular expression.
-POSIX allows this behavior as an extension, but portable scripts
-should avoid it.
-</p>
 </div>
 
 



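The basic-vs-extended differences listed in the revised text can be tried from a shell. The following is a minimal sketch, not part of the commit. It assumes GNU grep is on the PATH, and the helpers `bre` and `ere` are hypothetical names introduced only for this illustration.

```shell
#!/bin/sh
# bre PATTERN TEXT: hypothetical helper; matches TEXT against PATTERN as a
# basic regular expression (plain grep) and prints "match" or "no match".
bre() {
  if printf '%s\n' "$2" | grep -q -e "$1"; then
    echo "match"
  else
    echo "no match"
  fi
}

# ere PATTERN TEXT: same as above, but uses extended syntax (grep -E).
ere() {
  if printf '%s\n' "$2" | grep -E -q -e "$1"; then
    echo "match"
  else
    echo "no match"
  fi
}

bre 'a+b'  'a+b'   # match: in a BRE, an unescaped + is an ordinary character
bre 'a+b'  'aab'   # no match: + is not a repetition operator here
bre 'a\+b' 'aab'   # match: the backslashed form is the operator (GNU grep)
ere 'a+b'  'aab'   # match: in an ERE, + is a repetition operator

bre 'a^b'  'a^b'   # match: ^ is not first, so it is ordinary in a BRE
ere 'a^b'  'a^b'   # no match: in an ERE, ^ is still an anchor
bre 'a$b'  'a$b'   # match: $ is not last, so it is ordinary in a BRE
bre '*x'   '*x'    # match: a leading * is ordinary in a BRE
```

Each helper reports only whether grep found a match, so a pattern that grep rejects as invalid also prints "no match".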