grep-commit
grep branch, master, updated. v3.7-17-gf0d97db


From: Paul Eggert
Subject: grep branch, master, updated. v3.7-17-gf0d97db
Date: Fri, 27 Aug 2021 21:21:27 -0400 (EDT)

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "grep".

The branch, master has been updated
       via  f0d97db2a2104c5fd558178713054f3f267623b2 (commit)
       via  fd72f5d2c2a9a6a220e98af1c0230f1ae6e0a8d2 (commit)
      from  e3694e90b4789ccafaf022a29d9ce08ff11375c2 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
http://git.savannah.gnu.org/cgit/grep.git/commit/?id=f0d97db2a2104c5fd558178713054f3f267623b2


commit f0d97db2a2104c5fd558178713054f3f267623b2
Author: Paul Eggert <eggert@cs.ucla.edu>
Date:   Fri Aug 27 18:20:58 2021 -0700

    doc: document interval expression limitations
    
    * doc/grep.texi (Basic vs Extended, Performance):
    Document limitations of interval expressions (Bug#44538).

diff --git a/doc/grep.texi b/doc/grep.texi
index b92ecb7..e5b9fd8 100644
--- a/doc/grep.texi
+++ b/doc/grep.texi
@@ -1526,7 +1526,7 @@ before an interval expression's closing @samp{@}}, and an unmatched
 @code{\)} is invalid.
 
 Portable scripts should avoid the following constructs, as
-POSIX says they produce undefined results:
+POSIX says they produce unspecified results:
 
 @itemize @bullet
 @item
@@ -1541,6 +1541,8 @@ Empty alternatives (as in, e.g, @samp{a|}).
 Repetition operators that immediately follow empty expressions,
 unescaped @samp{$}, or other repetition operators.
 @item
+Interval expressions containing repetition counts greater than 255.
+@item
 A backslash escaping an ordinary character (e.g., @samp{\S}),
 unless it is a back-reference.
 @item
@@ -1965,6 +1967,17 @@ bracket expressions like @samp{[a-z]} and @samp{[[=a=]b]}, can be
 surprisingly inefficient due to difficulties in fast portable access to
 concepts like multi-character collating elements.
 
+@cindex interval expressions
+Interval expressions may be implemented internally via repetition.
+For example, @samp{^(a|bc)@{2,4@}$} might be implemented as
+@samp{^(a|bc)(a|bc)((a|bc)(a|bc)?)?$}.  A large repetition count may
+exhaust memory or greatly slow matching.  Even small counts can cause
+problems if cascaded; for example, @samp{grep -E
+".*@{10,@}@{10,@}@{10,@}@{10,@}@{10,@}"} is likely to overflow a
+stack.  Fortunately, regular expressions like these are typically
+artificial, and cascaded repetitions do not conform to POSIX so cannot
+be used in portable programs anyway.
+
 @cindex back-references
 A back-reference such as @samp{\1} can hurt performance significantly
 in some cases, since back-references cannot in general be implemented

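Note on the Performance paragraph added above: the rewrite of an interval
expression into plain repetition can be sketched in a few lines. This is
only an illustration under stated assumptions, not how grep or its regex
engine is actually implemented. The helper name expand_interval is
hypothetical, it handles only bounded intervals of the form {lo,hi}, and
it uses Python's re module rather than grep to compare the two forms.

```python
import re
from itertools import product


def expand_interval(atom: str, lo: int, hi: int) -> str:
    """Rewrite atom{lo,hi} using only concatenation and '?'.

    Hypothetical helper. Assumes `atom` is a single character or an
    already parenthesized group, so a trailing '?' covers the whole atom.
    """
    if not 0 <= lo <= hi:
        raise ValueError("need 0 <= lo <= hi")
    optional = ""
    for _ in range(hi - lo):
        # Nest from the inside out: x?, then (x x?)?, then (x (x x?)?)?, ...
        optional = f"({atom}{optional})?" if optional else f"{atom}?"
    return atom * lo + optional


interval = "^(a|bc){2,4}$"
expanded = "^" + expand_interval("(a|bc)", 2, 4) + "$"
print(expanded)  # ^(a|bc)(a|bc)((a|bc)(a|bc)?)?$

# Compare the two forms on every string over {a, b, c} of length 0..8.
agree = all(
    bool(re.search(interval, s)) == bool(re.search(expanded, s))
    for n in range(9)
    for s in map("".join, product("abc", repeat=n))
)
print(agree)

# The expanded pattern gets longer as the repetition count grows.
sizes = [len(expand_interval("(a|bc)", 0, n)) for n in (1, 10, 100)]
print(sizes)
```

Under this naive scheme the expanded pattern holds one copy of the atom
per repetition, so its size grows with the count. When intervals are
cascaded, each level copies the already expanded operand, so the number
of copies multiplies from level to level, which is consistent with the
warning about the cascaded {10,} example.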
http://git.savannah.gnu.org/cgit/grep.git/commit/?id=fd72f5d2c2a9a6a220e98af1c0230f1ae6e0a8d2



-----------------------------------------------------------------------

Summary of changes:
 doc/grep.texi | 15 ++++++++++++++-
 gnulib        |  2 +-
 src/system.h  |  4 ++--
 3 files changed, 17 insertions(+), 4 deletions(-)


hooks/post-receive
-- 
grep


