From 94d8eeeff4ae99cb12718dab7cf7fdc52de77b6e Mon Sep 17 00:00:00 2001
From: Paul Eggert
Date: Mon, 19 Jun 2023 11:09:00 -0700
Subject: [PATCH 3/3] =?UTF-8?q?Call=20them=20=E2=80=9Cbracket=20expression?=
 =?UTF-8?q?s=E2=80=9D=20more=20consistently?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Emacs comments and doc were inconsistent about the name used for
regexps like [a-z]. Sometimes it called them “character
alternatives”, sometimes “character sets”, sometimes “bracket
expressions”. Prefer “bracket expressions” as it is less confusing:
POSIX and most other programs’ doc uses “bracket expressions”,
“alternative” is also used in the Emacs documentation to talk about
...\|... in regexps, and “character set” normally has a different
meaning in Emacs.
---
 doc/emacs/search.texi        | 12 +++---
 doc/lispref/searching.texi   | 74 ++++++++++++++++++------------------
 lisp/emacs-lisp/lisp-mode.el |  2 +-
 lisp/textmodes/picture.el    |  2 +-
 4 files changed, 45 insertions(+), 45 deletions(-)

diff --git a/doc/emacs/search.texi b/doc/emacs/search.texi
index 45378d95f65..2a816221235 100644
--- a/doc/emacs/search.texi
+++ b/doc/emacs/search.texi
@@ -950,8 +950,8 @@ Regexps
 @dfn{special constructs} and the rest are @dfn{ordinary}. An ordinary
 character matches that same character and nothing else. The special
 characters are @samp{$^.*+?[\}. The character @samp{]} is special if
-it ends a character alternative (see below). The character @samp{-}
-is special inside a character alternative. Any other character
+it ends a bracket expression (see below). The character @samp{-}
+is special inside a bracket expression. Any other character
 appearing in a regular expression is ordinary, unless a @samp{\}
 precedes it. (When you use regular expressions in a Lisp program,
 each @samp{\} must be doubled, see the example near the end of this
@@ -1033,11 +1033,11 @@ Regexps
 a newline, it matches the whole string.
Since it @emph{can} match
 starting at the first @samp{a}, it does.

+@cindex bracket expression
 @cindex set of alternative characters, in regular expressions
 @cindex character set, in regular expressions
 @item @kbd{[ @dots{} ]}
-is a @dfn{set of alternative characters}, or a @dfn{character set},
-beginning with @samp{[} and terminated by @samp{]}.
+is a @dfn{bracket expression}, which matches one of a set of characters.

 In the simplest case, the characters between the two brackets are what
 this set can match. Thus, @samp{[ad]} matches either one @samp{a} or
@@ -1057,7 +1057,7 @@ Regexps
 @cindex character classes, in regular expressions
 You can also include certain special @dfn{character classes} in a
 character set. A @samp{[:} and balancing @samp{:]} enclose a
-character class inside a set of alternative characters. For instance,
+character class inside a bracket expression. For instance,
 @samp{[[:alnum:]]} matches any letter or digit.
 @xref{Char Classes,,, elisp, The Emacs Lisp Reference Manual}, for a
 list of character classes.
@@ -1125,7 +1125,7 @@ Regexps
 to depend on this behavior; it is better to quote the special
 character anyway, regardless of where it appears.

-As a @samp{\} is not special inside a set of alternative characters, it can
+As a @samp{\} is not special inside a bracket expression, it can
 never remove the special meaning of @samp{-}, @samp{^} or @samp{]}.
 You should not quote these characters when they have no special
 meaning. This would not clarify anything, since backslashes
diff --git a/doc/lispref/searching.texi b/doc/lispref/searching.texi
index 608abae762c..28230cea643 100644
--- a/doc/lispref/searching.texi
+++ b/doc/lispref/searching.texi
@@ -278,10 +278,10 @@ Syntax of Regexps
 and nothing else. The special characters are @samp{.}, @samp{*},
 @samp{+}, @samp{?}, @samp{[}, @samp{^}, @samp{$}, and @samp{\}; no new
 special characters will be defined in the future.
The character
-@samp{]} is special if it ends a character alternative (see later).
-The character @samp{-} is special inside a character alternative. A
+@samp{]} is special if it ends a bracket expression (see later).
+The character @samp{-} is special inside a bracket expression. A
 @samp{[:} and balancing @samp{:]} enclose a character class inside a
-character alternative. Any other character appearing in a regular
+bracket expression. Any other character appearing in a regular
 expression is ordinary, unless a @samp{\} precedes it.

 For example, @samp{f} is not a special character, so it is ordinary, and
@@ -374,19 +374,19 @@ Regexp Special
 permits the whole expression to match is @samp{d}.)

 @item @samp{[ @dots{} ]}
-@cindex character alternative (in regexp)
+@cindex bracket expression (in regexp)
 @cindex @samp{[} in regexp
 @cindex @samp{]} in regexp
-is a @dfn{character alternative}, which begins with @samp{[} and is
+is a @dfn{bracket expression}, which begins with @samp{[} and is
 terminated by @samp{]}. In the simplest case, the characters between
-the two brackets are what this character alternative can match.
+the two brackets are what this bracket expression can match.

 Thus, @samp{[ad]} matches either one @samp{a} or one @samp{d}, and
 @samp{[ad]*} matches any string composed of just @samp{a}s and @samp{d}s
 (including the empty string). It follows that @samp{c[ad]*r}
 matches @samp{cr}, @samp{car}, @samp{cdr}, @samp{caddaar}, etc.

-You can also include character ranges in a character alternative, by
+You can also include character ranges in a bracket expression, by
 writing the starting and ending characters with a @samp{-} between them.
 Thus, @samp{[a-z]} matches any lower-case @acronym{ASCII} letter.
 Ranges may be intermixed freely with individual characters, as in
@@ -395,7 +395,7 @@ Regexp Special
 range should not be the starting point of another one; for example,
 @samp{[a-m-z]} should be avoided.

-A character alternative can also specify named character classes
+A bracket expression can also specify named character classes
 (@pxref{Char Classes}). For example, @samp{[[:ascii:]]} matches any
 @acronym{ASCII} character. Using a character class is equivalent to
 mentioning each of the characters in that class; but the latter is not
@@ -404,9 +404,9 @@ Regexp Special
 lower or upper bound of a range.

 The usual regexp special characters are not special inside a
-character alternative. A completely different set of characters is
+bracket expression. A completely different set of characters is
 special: @samp{]}, @samp{-} and @samp{^}.
-To include @samp{]} in a character alternative, put it at the
+To include @samp{]} in a bracket expression, put it at the
 beginning. To include @samp{^}, put it anywhere but at the beginning.
 To include @samp{-}, put it at the end. Thus, @samp{[]^-]} matches
 all three of these special characters. You cannot use @samp{\} to
@@ -444,7 +444,7 @@ Regexp Special
 feature is intended for searching text in unibyte buffers and strings.
 @end enumerate

-Some kinds of character alternatives are not the best style even
+Some kinds of bracket expressions are not the best style even
 though they have a well-defined meaning in Emacs. They include:

 @enumerate
@@ -458,7 +458,7 @@ Regexp Special
 @samp{[ก-ฺ฿-๛]} is less clear than @samp{[\u0E01-\u0E3A\u0E3F-\u0E5B]}.

 @item
-Although a character alternative can include duplicates, it is better
+Although a bracket expression can include duplicates, it is better
 style to avoid them. For example, @samp{[XYa-yYb-zX]} is less clear
 than @samp{[XYa-z]}.

@@ -469,30 +469,30 @@ Regexp Special
 than @samp{[ij]}, and @samp{[i-k]} is less clear than @samp{[ijk]}.

 @item
-Although a @samp{-} can appear at the beginning of a character
-alternative or as the upper bound of a range, it is better style to
-put @samp{-} by itself at the end of a character alternative.
For
+Although a @samp{-} can appear at the beginning of a bracket
+expression or as the upper bound of a range, it is better style to
+put @samp{-} by itself at the end of a bracket expression. For
 example, although @samp{[-a-z]} is valid, @samp{[a-z-]} is better
 style; and although @samp{[*--]} is valid, @samp{[*+,-]} is clearer.
 @end enumerate

 @item @samp{[^ @dots{} ]}
 @cindex @samp{^} in regexp
-@samp{[^} begins a @dfn{complemented character alternative}. This
+@samp{[^} begins a @dfn{complemented bracket expression}. This
 matches any character except the ones specified. Thus,
 @samp{[^a-z0-9A-Z]} matches all characters @emph{except} ASCII letters and
 digits.

-@samp{^} is not special in a character alternative unless it is the first
+@samp{^} is not special in a bracket expression unless it is the first
 character. The character following the @samp{^} is treated as if it
 were first (in other words, @samp{-} and @samp{]} are not special there).

-A complemented character alternative can match a newline, unless newline is
+A complemented bracket expression can match a newline, unless newline is
 mentioned as one of the characters not to match. This is in contrast to
 the handling of regexps in programs such as @code{grep}.

-You can specify named character classes, just like in character
-alternatives. For instance, @samp{[^[:ascii:]]} matches any
+You can specify named character classes, just like in bracket
+expressions. For instance, @samp{[^[:ascii:]]} matches any
 non-@acronym{ASCII} character. @xref{Char Classes}.

 @item @samp{^}
@@ -556,7 +556,7 @@ Regexp Special
 For example, it is unwise to use @samp{\b*}, which can be omitted
 without changing the documented meaning of the regular expression.

-As a @samp{\} is not special inside a character alternative, it can
+As a @samp{\} is not special inside a bracket expression, it can
 never remove the special meaning of @samp{-}, @samp{^} or @samp{]}.
 You should not quote these characters when they have no special
 meaning.
This would not clarify anything, since backslashes
@@ -565,23 +565,23 @@ Regexp Special
 syntax), which matches any single character except a backslash.

 In practice, most @samp{]} that occur in regular expressions close a
-character alternative and hence are special. However, occasionally a
+bracket expression and hence are special. However, occasionally a
 regular expression may try to match a complex pattern of literal
 @samp{[} and @samp{]}. In such situations, it sometimes may be
 necessary to carefully parse the regexp from the start to determine
-which square brackets enclose a character alternative. For example,
-@samp{[^][]]} consists of the complemented character alternative
+which square brackets enclose a bracket expression. For example,
+@samp{[^][]]} consists of the complemented bracket expression
 @samp{[^][]} (which matches any single character that is not a square
 bracket), followed by a literal @samp{]}.

 The exact rules are that at the beginning of a regexp, @samp{[} is
 special and @samp{]} not. This lasts until the first unquoted
-@samp{[}, after which we are in a character alternative; @samp{[} is
+@samp{[}, after which we are in a bracket expression; @samp{[} is
 no longer special (except when it starts a character class) but @samp{]}
 is special, unless it immediately follows the special @samp{[} or that
 @samp{[} followed by a @samp{^}. This lasts until the next special
-@samp{]} that does not end a character class. This ends the character
-alternative and restores the ordinary syntax of regular expressions;
+@samp{]} that does not end a character class. This ends the bracket
+expression and restores the ordinary syntax of regular expressions;
 an unquoted @samp{[} is special again and a @samp{]} not.

 @node Char Classes
@@ -592,8 +592,8 @@ Char Classes
 @cindex alpha character class, regexp
 @cindex xdigit character class, regexp

- Below is a table of the classes you can use in a character
-alternative, and what they mean.
Note that the @samp{[} and @samp{]}
+ Below is a table of the classes you can use in a bracket
+expression, and what they mean. Note that the @samp{[} and @samp{]}
 characters that enclose the class name are part of the name, so a
 regular expression using these classes needs one more pair of
 brackets. For example, a regular expression matching a sequence of
@@ -920,7 +920,7 @@ Regexp Backslash
 @kindex invalid-regexp

 Not every string is a valid regular expression. For example, a string
-that ends inside a character alternative without a terminating @samp{]}
+that ends inside a bracket expression without a terminating @samp{]}
 is invalid, and so is a string that ends with a single @samp{\}. If
 an invalid regular expression is passed to any of the search functions,
 an @code{invalid-regexp} error is signaled.
@@ -957,7 +957,7 @@ Regexp Example

 @table @code
 @item [.?!]
-The first part of the pattern is a character alternative that matches
+The first part of the pattern is a bracket expression that matches
 any one of three characters: period, question mark, and exclamation
 mark. The match must begin with one of these three characters. (This
 is one point where the new default regexp used by Emacs differs from
@@ -969,7 +969,7 @@ Regexp Example
 marks, zero or more of them, that may follow the period, question mark
 or exclamation mark. The @code{\"} is Lisp syntax for a double-quote in
 a string. The @samp{*} at the end indicates that the immediately
-preceding regular expression (a character alternative, in this case) may be
+preceding regular expression (a bracket expression, in this case) may be
 repeated zero or more times.

 @item \\($\\|@ $\\|\t\\|@ @ \\)
@@ -1920,7 +1920,7 @@ Regexp Problems
 causing a match to fail early.

 @item
-Avoid or-patterns in favor of character alternatives: write
+Avoid or-patterns in favor of bracket expressions: write
 @samp{[ab]} instead of @samp{a\|b}.
Recall that @samp{\s-} and
 @samp{\sw} are equivalent to @samp{[[:space:]]} and
 @samp{[[:word:]]}, respectively.
@@ -3012,7 +3012,7 @@ POSIX Regexps
 @item
 In POSIX BREs, it is an implementation option whether @samp{^} is special
 after @samp{\(}; GNU @command{grep} treats it like Emacs does.
-In POSIX EREs, @samp{^} is always special outside of character alternatives,
+In POSIX EREs, @samp{^} is always special outside of bracket expressions,
 which means the ERE @samp{x^} never matches.
 In Emacs regular expressions, @samp{^} is special only at the
 beginning of the regular expression, or after @samp{\(}, @samp{\(?:}
@@ -3021,7 +3021,7 @@ POSIX Regexps
 @item
 In POSIX BREs, it is an implementation option whether @samp{$} is special
 before @samp{\)}; GNU @command{grep} treats it like Emacs does.
-In POSIX EREs, @samp{$} is always special outside of character alternatives,
+In POSIX EREs, @samp{$} is always special outside of bracket expressions,
 which means the ERE @samp{$x} never matches.
 In Emacs regular expressions, @samp{$} is special only at the
 end of the regular expression, or before @samp{\)} or @samp{\|}.
@@ -3049,8 +3049,8 @@ POSIX Regexps
 @samp{[:nonascii:]}, @samp{[:unibyte:]}, and @samp{[:word:]}.

 @item
-BRE and ERE alternatives can contain collating symbols and equivalence
-class expressions, e.g., @samp{[[.ch.]d[=a=]]}.
+BREs and EREs can contain collating symbols and equivalence
+class expressions within bracket expressions, e.g., @samp{[[.ch.]d[=a=]]}.
 Emacs regular expressions do not support this.

 @item
diff --git a/lisp/emacs-lisp/lisp-mode.el b/lisp/emacs-lisp/lisp-mode.el
index 9914ededb85..1990630608d 100644
--- a/lisp/emacs-lisp/lisp-mode.el
+++ b/lisp/emacs-lisp/lisp-mode.el
@@ -1453,7 +1453,7 @@ lisp-fill-paragraph
 ;; are buffer-local, but we avoid changing them so that they can be set
 ;; to make `forward-paragraph' and friends do something the user wants.
 ;;
- ;; `paragraph-start': The `(' in the character alternative and the
+ ;; `paragraph-start': The `(' in the bracket expression and the
 ;; left-singlequote plus `(' sequence after the \\| alternative prevent
 ;; sexps and backquoted sexps that follow a docstring from being filled
 ;; with the docstring. This setting has the consequence of inhibiting
diff --git a/lisp/textmodes/picture.el b/lisp/textmodes/picture.el
index 9aa9b72c513..f98c3963b6f 100644
--- a/lisp/textmodes/picture.el
+++ b/lisp/textmodes/picture.el
@@ -383,7 +383,7 @@ picture-tab-chars
 The syntax for this variable is like the syntax used inside of `[...]'
 in a regular expression--but without the `[' and the `]'.
 It is NOT a regular expression, and should follow the usual
-rules for the contents of a character alternative.
+rules for the contents of a bracket expression.
 It defines a set of \"interesting characters\" to look for when setting
 \(or searching for) tab stops, initially \"!-~\" (all printing characters).
 For example, suppose that you are editing a table which is formatted thus:
-- 
2.39.2