groff-commit

[groff] 09/80: [troff]: Allow more delimiters in compat mode.


From: G. Branden Robinson
Subject: [groff] 09/80: [troff]: Allow more delimiters in compat mode.
Date: Sat, 30 Nov 2024 04:02:12 -0500 (EST)

gbranden pushed a commit to branch master
in repository groff.

commit fb6fdbd9d174262ab57572bf095100459b589a66
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
AuthorDate: Tue Nov 26 15:06:20 2024 -0600

    [troff]: Allow more delimiters in compat mode.
    
    * src/roff/troff/input.cpp (token::is_usable_as_delimiter): If in
      compatibility mode, accept any ordinary character as a delimiter.
    
    * doc/groff.texi.in (Delimiters, Compatibility Mode):
    * man/groff_diff.7.man (Compatibility mode): Document it.
    
    * src/roff/groff/tests/check-delimiter-validity.sh: Test it.
    * src/roff/groff/groff.am (groff_TESTS): Run test.
    
    * NEWS: Add item.
---
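A quick way to try the change by hand is sketched below. It is not part of the commit: `make_input` is a hypothetical helper that builds the same one-line document the new test script feeds to groff, an `\l` escape sequence whose argument is delimited by the character under test. The sketch assumes a `groff` executable on the `PATH` and skips the formatting step otherwise; with this commit applied, the compatibility-mode run should draw the three-underscore line that the test script checks for.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): build a one-line roff
# document that uses character $1 as the delimiter of an \l escape.
make_input () {
  printf '\\l%c1n+2n\\&_%c\n' "$1" "$1"
}

# Show the generated input for a delimiter that is also a numeral.
make_input 9

# Only attempt formatting if a groff executable is available.
if command -v groff >/dev/null 2>&1
then
  make_input 9 | groff -T ascii | sed '/^$/d'     # normal mode
  make_input 9 | groff -C -T ascii | sed '/^$/d'  # compatibility mode
fi
```

The `-C` option is what the test script itself uses to enable compatibility mode.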
 ChangeLog                                        | 15 ++++++
 NEWS                                             |  5 ++
 doc/groff.texi.in                                | 42 ++++++++++++++-
 man/groff_diff.7.man                             | 35 +++++++++++++
 src/roff/groff/groff.am                          |  1 +
 src/roff/groff/tests/check-delimiter-validity.sh | 66 ++++++++++++++++++++++++
 src/roff/troff/input.cpp                         | 16 +++---
 7 files changed, 168 insertions(+), 12 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 632af271d..19b3be599 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,18 @@
+2024-11-25  G. Branden Robinson <g.branden.robinson@gmail.com>
+
+       * src/roff/troff/input.cpp (token::is_usable_as_delimiter): If
+       in compatibility mode, accept any ordinary character as a
+       delimiter.
+
+       * doc/groff.texi.in (Delimiters, Compatibility Mode):
+       * man/groff_diff.7.man (Compatibility mode): Document it.
+
+       * src/roff/groff/tests/\
+       check-delimiter-validity.sh: Test it.
+       * src/roff/groff/groff.am (groff_TESTS): Run test.
+
+       * NEWS: Add item.
+
 2024-11-25  G. Branden Robinson <g.branden.robinson@gmail.com>
 
        * src/roff/troff/input.cpp (is_char_usable_as_delimiter):
diff --git a/NEWS b/NEWS
index eda984064..2ea6b50b9 100644
--- a/NEWS
+++ b/NEWS
@@ -168,6 +168,11 @@ troff
    units, such as the `ps` request and `\s` escape sequence, now also
    accept `p` and `s`.
 
+*  In compatibility mode, GNU troff now accepts delimiters that it
+   rejects when not in compatibility mode--namely, ordinary characters
+   that are meaningful in numeric expressions (which are often
+   delimited).  This change improves compatibility with AT&T troff.
+
 eqn
 ---
 
diff --git a/doc/groff.texi.in b/doc/groff.texi.in
index 97ac73216..34648e739 100644
--- a/doc/groff.texi.in
+++ b/doc/groff.texi.in
@@ -7304,10 +7304,14 @@ the numerals @code{0}-@code{9} and the decimal point @code{.}
 @ifinfo
 @cindex <colon>, as delimiter
 @end ifinfo
-@cindex @code{|}, as delimiter
+@c @cindex @code{|}, as delimiter
 @cindex @code{(}, as delimiter
 @cindex @code{)}, as delimiter
-the (single-character) operators @samp{+-/*%<>=&:()|}
+the (single-character) operators @samp{+-/*%<>=&:()}@footnote{GNU
+@command{troff} accepts @samp{|} as a delimiter in spite of its
+meaningfulness in numeric expressions because it occasionally sees use
+in man pages.  Future @code{groff} releases may deprecate and
+subsequently withdraw such support.}
 
 @item
 @cindex space character, as delimiter
@@ -17760,6 +17764,40 @@ name space, so choose a register name that is unlikely to collide with
 other uses.
 @endDefreq
 
+@cindex additional delimiters accepted by @acronym{AT&T} @code{troff}
+@cindex delimiters, additional, accepted by @acronym{AT&T} @code{troff}
+In compatibility mode,
+GNU
+@command{troff}
+accepts several characters as delimiters that it ordinarily rejects,
+because they are meaningful in numeric expressions and therefore
+potentially ambiguous to the document maintainer.
+The set of additional delimiters is
+@samp{0}
+@samp{1}
+@samp{2}
+@samp{3}
+@samp{4}
+@samp{5}
+@samp{6}
+@samp{7}
+@samp{8}
+@samp{9}
+@samp{+}
+@samp{-}
+@samp{/}
+@samp{*}
+@samp{%}
+@samp{<}
+@samp{>}
+@samp{=}
+@samp{&}
+@samp{:}
+@samp{(}
+@samp{)},
+and
+@samp{.}.
+
 @cindex input level in delimited arguments
 @cindex interpolation depth in delimited arguments
 @cindex delimited arguments, incompatibilities with @acronym{AT&T} @code{troff}
diff --git a/man/groff_diff.7.man b/man/groff_diff.7.man
index df09e8950..a450e089a 100644
--- a/man/groff_diff.7.man
+++ b/man/groff_diff.7.man
@@ -5692,6 +5692,41 @@ turns compatibility mode
 while it interprets its argument list.
 .
 .
+.P
+In compatibility mode,
+GNU
+.I troff \" GNU
+accepts several characters as delimiters that it ordinarily rejects,
+because they are meaningful in numeric expressions and therefore
+potentially ambiguous to the document maintainer.
+.
+The set of additional delimiters is
+\[lq]0\[rq]
+\[lq]1\[rq]
+\[lq]2\[rq]
+\[lq]3\[rq]
+\[lq]4\[rq]
+\[lq]5\[rq]
+\[lq]6\[rq]
+\[lq]7\[rq]
+\[lq]8\[rq]
+\[lq]9\[rq]
+\[lq]+\[rq]
+\[lq]\-\[rq]
+\[lq]/\[rq]
+\[lq]*\[rq]
+\[lq]%\[rq]
+\[lq]<\[rq]
+\[lq]>\[rq]
+\[lq]=\[rq]
+\[lq]&\[rq]
+\[lq]:\[rq]
+\[lq](\[rq]
+\[lq])\[rq],
+and
+\[lq].\[rq].
+.
+.
 .\" ====================================================================
 .SH "Other differences"
 .\" ====================================================================
diff --git a/src/roff/groff/groff.am b/src/roff/groff/groff.am
index 47fcd8116..27e1809cf 100644
--- a/src/roff/groff/groff.am
+++ b/src/roff/groff/groff.am
@@ -44,6 +44,7 @@ groff_TESTS = \
   src/roff/groff/tests/backslash-s-works-with-single-digit-argument.sh \
   src/roff/groff/tests/break-zero-length-output-line-sanely.sh \
   src/roff/groff/tests/cf-request-early-does-not-fail.sh \
+  src/roff/groff/tests/check-delimiter-validity.sh \
   src/roff/groff/tests/current-language-and-environment-in-sync.sh \
   src/roff/groff/tests/degenerate-control-flow-works.sh \
   src/roff/groff/tests/detect-evil-link-time-optimizer.sh \
diff --git a/src/roff/groff/tests/check-delimiter-validity.sh b/src/roff/groff/tests/check-delimiter-validity.sh
new file mode 100755
index 000000000..82b4a74ae
--- /dev/null
+++ b/src/roff/groff/tests/check-delimiter-validity.sh
@@ -0,0 +1,66 @@
+#!/bin/sh
+#
+# Copyright (C) 2024 Free Software Foundation, Inc.
+#
+# This file is part of groff.
+#
+# groff is free software; you can redistribute it and/or modify it under
+# the terms of the GNU General Public License as published by the Free
+# Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+#
+# groff is distributed in the hope that it will be useful, but WITHOUT
+# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+# for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+#
+
+groff="${abs_top_builddir:-.}/test-groff"
+
+fail=
+
+wail () {
+  echo ...FAILED >&2
+  fail=YES
+}
+
+for c in A B C D E F G H I J K L M N O P Q R S T U V W X Y Z \
+         a b c d e f g h i j k l m n o p q r s t u v w x y z '|'
+do
+  echo "checking validity of '$c' as delimiter in normal mode" \
+    >&2
+  output=$(printf '\\l%c1n+2n\\&_%c\n' "$c" "$c" \
+    | "$groff" -T ascii | sed '/^$/d')
+  echo "$output"
+  echo "$output" | grep -Fqx ___ || wail
+done
+
+for c in 0 1 2 3 4 5 6 7 8 9 + - / '*' % '<' '>' = '&' : '(' ')' .
+do
+  echo "checking invalidity of '$c' as delimiter in normal mode" \
+    >&2
+  output=$(printf '\\l%c1n+2n\\&_%c\n' "$c" "$c" \
+    | "$groff" -T ascii | sed '/^$/d')
+  echo "$output"
+  echo "$output" | grep -qx 1n+2n_. || wail
+done
+
+# All of these work in DWB nroff.
+for c in A B C D E F G H I J K L M N O P Q R S T U V W X Y Z \
+         a b c d e f g h i j k l m n o p q r s t u v w x y z \
+         0 1 2 3 4 5 6 7 8 9 + - / '*' % '<' '>' = '&' : '(' ')' . '|'
+do
+  echo "checking validity of '$c' as delimiter in compatibility mode" \
+    >&2
+  output=$(printf '\\l%c1n+2n\\&_%c\n' "$c" "$c" \
+    | "$groff" -C -T ascii | sed '/^$/d')
+  echo "$output"
+  echo "$output" | grep -Fqx ___ || wail
+done
+
+test -z "$fail"
+
+# vim:set autoindent expandtab shiftwidth=2 tabstop=2 textwidth=72:
diff --git a/src/roff/troff/input.cpp b/src/roff/troff/input.cpp
index 38b26f1e4..474260a10 100644
--- a/src/roff/troff/input.cpp
+++ b/src/roff/troff/input.cpp
@@ -2585,17 +2585,9 @@ bool token::operator!=(const token &t)
 // doesn't tokenize it) and accepts a user-specified delimiter.
 static bool is_char_usable_as_delimiter(int c)
 {
+  if (csdigit(c))
+    return false;
   switch (c) {
-  case '0':
-  case '1':
-  case '2':
-  case '3':
-  case '4':
-  case '5':
-  case '6':
-  case '7':
-  case '8':
-  case '9':
   case '+':
   case '-':
   case '/':
@@ -2626,6 +2618,10 @@ bool token::is_usable_as_delimiter(bool report_error)
   bool is_valid = false;
   switch (type) {
   case TOKEN_CHAR:
+    // AT&T troff accepted any character as a delimiter, even perverse
+    // choices like `\l91n+2n\&*`.  See Savannah #66481.
+    if (want_att_compat)
+      return true;
     is_valid = is_char_usable_as_delimiter(c);
     if (!is_valid && report_error)
       error("character '%1' is not allowed as a delimiter",
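The rule this commit documents can be restated as a small shell sketch. This is an illustration only, not groff's implementation: `classify_delimiter` is a hypothetical function, and its rejected set is taken from the list of additional delimiters in the documentation changes above, since the hunk does not show the whole `switch` in `is_char_usable_as_delimiter`. It considers only ordinary characters, as the test script does.

```shell
#!/bin/sh
# Illustrative only: mirrors the documented rule, not groff's C++ code.
# classify_delimiter CHAR [MODE] prints "accepted" or "rejected".
# MODE is "compat" for compatibility mode; anything else means normal.
classify_delimiter () {
  if [ "${2:-normal}" = compat ]
  then
    # Compatibility mode accepts any ordinary character.
    echo accepted
    return 0
  fi
  case $1 in
    [0-9])
      # Numerals are meaningful in numeric expressions.
      echo rejected ;;
    '+' | '-' | '/' | '*' | '%' | '<' | '>' | '=' | '&' | ':')
      echo rejected ;;
    '(' | ')' | '.')
      echo rejected ;;
    *)
      # Letters and '|' remain usable in normal mode.
      echo accepted ;;
  esac
}

for c in a '|' 9 '*' .
do
  echo "'$c': normal=$(classify_delimiter "$c")" \
    "compat=$(classify_delimiter "$c" compat)"
done
```

Putting the mode check first matches the shape of the change to `token::is_usable_as_delimiter`, which returns early when `want_att_compat` is set and otherwise falls through to the per-character check.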


