Re: Issue in man page ascii.7
From: G. Branden Robinson
Subject: Re: Issue in man page ascii.7
Date: Mon, 5 Dec 2022 02:15:39 -0600
Hi Alex & Helge,
At 2022-12-04T13:53:41+0100, Alejandro Colomar wrote:
> On 12/4/22 10:07, Helge Kreutzmann wrote:
> > Without further ado, the following was found:
[...]
> > " 2 3 4 5 6 7 30 40 50 60 70 80 90 100 110 120\n"
> > " ------------- ---------------------------------\n"
> > "0: 0 @ P \\` p 0: ( 2 E<lt> F P Z d n x\n"
> > "1: ! 1 A Q a q 1: ) 3 = G Q [ e o y\n"
> > "2: \" 2 B R b r 2: * 4 E<gt> H R \\e f p z\n"
> > "3: # 3 C S c s 3: ! + 5 ? I S ] g q {\n"
> > "4: $ 4 D T d t 4: \" , 6 @ J T \\(ha h r |\n"
> > "5: % 5 E U e u 5: # - 7 A K U _ i s }\n"
> > "6: & 6 F V f v 6: $ . 8 B L V \\` j t \\(ti\n"
> > "7: \\(aq 7 G W g w 7: % / 9 C M W a k u DEL\n"
> > "8: ( 8 H X h x 8: & 0 : D N X b l v\n"
> > "9: ) 9 I Y i y 9: \\(aq 1 ; E O Y c m w\n"
> >
> > Issue: In the right table, please add \& markup for end of
> > sentence characters (? ! .) to get proper formatting in other
> > locales. Thanks!
>
> I'm not sure what's the intended change.
It would be something like this:
-3: # 3 C S c s 3: ! + 5 ? I S ] g q {\n"
+3: # 3 C S c s 3: !\& + 5 ?\& I S ] g q {\n"
-6: & 6 F V f v 6: $ . 8 B L V \\` j t \\(ti\n"
+6: & 6 F V f v 6: $ .\& 8 B L V \\` j t \\(ti\n"
[...]
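For anyone who would rather apply that edit mechanically than by hand, here is a minimal sketch in Python. It assumes it is given raw roff source lines (not PO-escaped strings), and the name `protect_sentence_ends` is made up for illustration. It appends `\&` after every `.`, `?`, or `!` that is followed by whitespace or ends the line, so it would also touch the left-hand table.

```python
import re

# A '.', '?' or '!' that is followed by whitespace or ends the line.
# The lookahead keeps the following whitespace out of the match.
_SENTENCE_END = re.compile(r"([.?!])(?=\s|$)")


def protect_sentence_ends(line: str) -> str:
    """Append the roff dummy character escape sequence (\\&) after each
    matching end-of-sentence character in a single source line."""
    # A callable replacement avoids any backslash processing that a
    # replacement template string would be subject to.
    return _SENTENCE_END.sub(lambda m: m.group(1) + "\\&", line)


if __name__ == "__main__":
    print(protect_sentence_ends("3: # 3 C S c s 3: ! + 5 ? I S ] g q {"))
```

Running it a second time over an already patched line changes nothing, because the end-of-sentence character is then followed by a backslash rather than by whitespace.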
> And since it's about formatting, please also CC the following in your
> patch:
>
> CC: "G. Branden Robinson" <g.branden.robinson@gmail.com>
> CC: <groff@gnu.org>
No need in this case; I saw it anyway. :) (Though please To: me as a
rule if my feedback/opinion is explicitly desired.)
> I can do this later, just for your background this was an old bug in
> Debian which manpage-l10n worked around:
>
> https://bugs.debian.org/692765
This can be done (I see now that my suggestion recapitulates one from
Colin Watson, in a ticket trail to which I later contributed), but
what's going on here is actually a GNU tbl(1) bug.
https://savannah.gnu.org/bugs/?61909
"The tbl(1)s from Heirloom Doctools and Unix Version 7 do not supplement
an ordinary table entry with inter-sentence space, but groff's tbl(1)
does.
This seems wrong. This bug is present in groff 1.22.4."
(N.B., the above says *ordinary* table entries. A text block will
witness the power of this fully armed and operational formatter^U
be formatted as was the text prior to the table, which in general means
it will be filled, supplemented with inter-sentence space, automatically
hyphenated, adjusted, and broken.)
I didn't bother to try figuring out how far back the bug dates, but it
seems like the sort that might be 30 years old.
Funnily enough it was just last week that I was whacking on tbl and my
sense of personal irritation overrode my list of groff 1.23 release
goals. So here's the good news.
commit d75543ee567db15b2ac93309a4763401933b8f2c
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
Date: Tue Nov 29 10:05:21 2022 -0600
[tbl]: Fix Savannah #61909.
* src/preproc/tbl/table.cpp (SAVED_INTER_WORD_SPACE_SIZE)
(SAVED_INTER_SENTENCE_SPACE_SIZE): Add new preprocessor macros.
(block_entry::do_divert): Restore saved inter-word and inter-sentence
space when formatting a text entry.
(table::init_output): When a table region begins, save the values of
inter-word and inter-sentence space. Add request to the reset macro
to restore saved inter-word and inter-sentence space when leaving
table region.
(table::do_top): Set inter-sentence space to be equal to inter-word
space. This way spaces are "literal" in ordinary table entries (but
not text blocks).
Fixes <https://savannah.gnu.org/bugs/?61909>.
It's possible this was overkill.[1]
However, because 1.23.0 isn't even to RC2 yet, let alone final release,
surely a lot of people will be using old groffs for quite some time. So
it might make sense to patch the page anyway. The dummy character
escape sequences (`\&`) will do no harm.
Regards,
Branden
[1] (groff insider stuff) The preprocessor symbols in the commit above,
and the registers they are used to create, might not be necessary; I
did notice that the "reset" macro for tables is constructed by the
preprocessor by calling a bunch of requests with register arguments
that _lack_ the extra layer of escaping that macro definitions often
use. Since the reset macro is defined _before_ the issue of
requests that change the troff environment, in principle it's not
necessary to use dedicated storage registers, nor a dedicated string
for saving the tab stops as was also done after groff 1.22.4.
However, before trying to optimize this down I have a bigger
question--why the hell doesn't GNU tbl set up an _environment_ for
table entries (and a separate one for text blocks)? As I understand
it, relieving the tedium of manipulating long sequences of formatter
parameters is exactly what environments are _for_. James Clark was
no fool, so I have to speculate that he re-implemented tbl very
early, maybe even before GNU troff itself. That would explain the
indirection of all the tbl-internal register names through
preprocessor macros, since AT&T troff identifiers were limited to
two characters but he surely had longer ones planned. Another
limitation of AT&T troff was that there were only three
environments, so he _couldn't_ add new environments--macro packages
frequently allocated all three to distinct purposes. Maybe at some
point there was a notion of keeping GNU tbl separately usable with
AT&T troff, but if so that notion was abandoned long ago.
And yet no refactoring of GNU tbl to use environments to (1) make
itself, and its output, more comprehensible and (2) reduce the
amount of document expansion when preprocessing by tbl ever took
place. Document expansion by preprocessors is not a concern for
storage or processing speed reasons in 2022, but it remains valuable
for troubleshooting the preprocessors themselves, so that a
developer doesn't have to wade through an ocean of irrelevancies to
locate the spots where bugs lurk. Of the 31 bugs that have _ever_
been filed against GNU tbl in Savannah, 18 of them have been fixed
since the groff 1.22.4 release. (3 are invalid, 1 was fixed in
1.22.4, and the remaining 9 are at large.)
https://savannah.gnu.org/bugs/index.php?go_report=Apply&group=groff&func=browse&set=custom&msort=0&report_id=225&advsrch=0&status_id=0&resolution_id=0&submitted_by=0&assigned_to=0&category_id=109&bug_group_id=0&severity=0&summary=&details=&sumORdet=0&history_search=0&history_field=0&history_event=modified&history_date_dayfd=5&history_date_monthfd=12&history_date_yearfd=2022&chunksz=50&spamscore=5&boxoptionwanted=1