bug-bash

Re: [PATCH] repair recent, ill-conceived man page changes


From: G. Branden Robinson
Subject: Re: [PATCH] repair recent, ill-conceived man page changes
Date: Wed, 11 Oct 2023 11:54:31 -0500

Hi Chet,

At 2023-10-11T10:22:44-0400, Chet Ramey wrote:
> On 10/11/23 5:08 AM, G. Branden Robinson wrote:
> > Please consider reverting the following recent changes to the bash
> > man page.  Bjarni should have run them by the groff list first,
> > because some of them are ill-considered.
> 
> OK. I'm trying to understand them myself; please take my comments in
> that spirit.

No worries.  My concern with some of the changes is that they risk
mystifying people who encounter them ("`kern`?  `ss`?  `lg`?  What are
those?") without delivering concrete benefit, typographical or
otherwise.  groff_man_style(7), as of groff 1.23.0, attempts to document
all of the *roff syntax a man page author is ever likely to need, and
strives _not_ to introduce any other *roff features or typesetting
concepts.[0]

> > +.\" suggested by Bjarni Ingi Gislason <bjarniig@simnet.is>
> > +.if n \{\
> > +.kern 0
> > +.ss 12 0
> > +.\}
> > 
> > The above change is half pointless and half intrusive.
> > 
> > A) No formatter for terminal output devices ("nroff mode", which is
> >    tested by "if n") performs kerning.  So that's a no-op.
> > 
> > B) The amount of intersentence spacing, for man pages, is a matter of
> >    the _reader's_ taste and should be left to them.  mandoc(1)
> >    ignores this request and I'm glad it does.  So that, too, is a
> >    no-op with that formatter.
> 
> Is his intent here to force French spacing instead of English spacing?

Yes, if you understand "French spacing" to mean "the space between
sentences is the same as the space between words".  Frustratingly,
"French spacing" has multiple incompatible meanings.[1]

> How does groff deal with input where the number of spaces after a
> period varies?

roff(7) and the groff Texinfo manual cover this--clearly, I hope.  If
not, blame me because the language is mine, and I'll try to improve it.

(groff 1.23.0; UTF-8 follows)

       A roff formatter attempts to detect boundaries between sentences,
       and supplies additional inter‐sentence space between them.  It
       flags certain characters (normally “!”, “?”, and “.”) as
       potentially ending a sentence.  When the formatter encounters one
       of these end‐of‐sentence characters at the end of an input line,
       or one of them is followed by two (unescaped) spaces on the same
       input line, it appends an inter‐word space followed by an inter‐
       sentence space in the output.  The dummy character escape
       sequence \& can be used after an end‐of‐sentence character to
       defeat end‐of‐sentence detection on a per‐instance basis.
       Normally, the occurrence of a visible non‐end‐of‐sentence
       character (as opposed to a space or tab) immediately after an
       end‐of‐sentence character cancels detection of the end of a
       sentence.  However, several characters are treated transparently
       after the occurrence of an end‐of‐sentence character.  That is, a
       roff does not cancel end‐of‐sentence detection when it processes
       them.  This is because such characters are often used as footnote
       markers or to close quotations and parentheticals.  The default
       set is ", ', ), ], *, \[dg], \[dd], \[rq], and \[cq].  The last
       four are examples of special characters, escape sequences whose
       purpose is to obtain glyphs that are not easily typed at the
       keyboard, or which have special meaning to the formatter (like
       \).

That reads a bit better with font style changes, so "man 7 roff" might
be preferable.

> My personal writing style has changed from two spaces to one over a
> number of years, and the man page reflects that.

For _input_, it's a good idea to either break lines at the ends of
sentences, or put two spaces after them.  This is so that the formatter
knows where the ends of the sentences are.  Like TeX, *roff is not smart
enough to know where the sentence boundary/ies are in "C. A. R. Hoare
next came to the U.S. Linux kernel developers have yet to absorb his
lessons."
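
To make the detection rules quoted above concrete, here is a minimal
Python sketch of end-of-sentence detection on a single input line.  It
illustrates the roff(7) text only and is not groff's implementation; the
names `PATTERN` and `sentence_ends` are made up for this example, and it
handles only the default end-of-sentence and transparent characters.

```python
import re

# Characters that may end a sentence (the defaults named in roff(7)).
END_OF_SENTENCE = r"[.!?]"

# Characters treated transparently after an end-of-sentence character:
# ", ', ), ], *, and the special characters \[dg], \[dd], \[rq], \[cq].
TRANSPARENT = r"""(?:["')\]*]|\\\[(?:dg|dd|rq|cq)\])"""

# A sentence ends where an end-of-sentence character, plus any run of
# transparent characters, is followed by two spaces or the end of the line.
PATTERN = re.compile(END_OF_SENTENCE + TRANSPARENT + r"*(?=  |$)")


def sentence_ends(line):
    """Return the offsets just past each sentence end detected in one line."""
    return [m.end() for m in PATTERN.finditer(line.rstrip("\n"))]


# The same text as one filled line with single spaces, and as two lines
# broken at the end of the first sentence.
filled = ("C. A. R. Hoare next came to the U.S. Linux kernel developers "
          "have yet to absorb his lessons.")
broken = ["C. A. R. Hoare next came to the U.S.",
          "Linux kernel developers have yet to absorb his lessons."]

print(len(sentence_ends(filled)))
print([len(sentence_ends(line)) for line in broken])

# A trailing \& defeats detection for that one instance.
print(sentence_ends("Ship it to Yoyodyne Inc.\\&"))
```

Fed the filled line, the sketch finds only the boundary at the very end
and misses the one after "U.S."; with a line break there, each input
line yields one boundary.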

For output, the amount of inter-sentence space is configurable; that
is what the `ss` request does.[2]  For man pages, I strongly urge all
authors to leave the issue alone so as to respect readers' preferences.
Since authors' preferences will differ, this is the only way to achieve
consistency.[3]

People can get pretty passionate about this, and complain of their
eyeballs being violated when the "wrong" amount of inter-sentence space
is employed in a document they're reading.  Some people bring this
passion even to man page _source_ documents, and the only recourse in
that event is to break input lines at the ends of sentences.  This has
also been Brian Kernighan's advice to troff users since the 1970s.[4]
Linux man-pages maintainer Alejandro Colomar calls this practice
"semantic newlines".  My opinion is that it is a Solomonic solution,
satisfying neither partisan camp, but also has a benefit of reducing the
amount of churn in diffs.  Incremental changes to documentation often
find boundaries at sentences.
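
The diff-churn point can be checked with a small experiment.  The Python
sketch below is only an illustration under assumed inputs: the
four-sentence sample text, the 40-column fill width, and the helper
names `churn` and `refill` are all invented for the example.  It edits
one sentence and counts the changed lines in a unified diff, once for
input kept at one sentence per line and once for input refilled to a
fixed width.

```python
import difflib
import textwrap


def churn(old_lines, new_lines):
    """Count added plus removed lines in a unified diff of two line lists."""
    diff = difflib.unified_diff(old_lines, new_lines, lineterm="")
    return sum(1 for d in diff
               if d.startswith(("+", "-"))
               and not d.startswith(("+++", "---")))


def refill(sentences):
    """Join sentences and refill them to 40 columns, as an editor might."""
    return textwrap.fill(" ".join(sentences), width=40).splitlines()


# Hypothetical sample text, one sentence per list element.
old = ["The formatter fills lines.",
       "It detects sentence ends by looking at the input.",
       "Authors should help it.",
       "Readers choose the spacing."]

# Revise only the first sentence.
new = ["The formatter fills and adjusts lines."] + old[1:]

semantic_churn = churn(old, new)
filled_churn = churn(refill(old), refill(new))
print(semantic_churn, filled_churn)
```

With this sample, the one-sentence-per-line input changes only the
edited line, while the refilled input rewraps the lines after the edit
as well.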

> > This change is pointless because no ligatures are defined for any of
> > the letter pairs in the text in any known formatter (the ligature
> > for "ct", like that for "st" [not seen here] is archaic in English
> > typography and seldom seen in digital fonts).
> 
> I assume he was interested in what formatters do with the `fi'. I
> couldn't see any discernable difference myself.

Right.  It will make no difference (1) when formatting for terminals;
(2) when formatting for a typesetter that doesn't support ligatures, or
when a font lacking them is used (Courier is a good example); or (3)
when copy-and-pasting from PDF to a shell prompt.  PDF has a feature--
which groff's gropdf(1) exercises--called "CMap" that decomposes
ligatures to their constituent letters when copied to the system
clipboard or other selection buffer.  (I assume the feature exists for
exactly this reason.)

Thus, in my opinion, that change was a lot of rigmarole for nothing.

> > Authorities differ on whether space should surround em dashes; from
> > what I have seen, a majority favor omitting them, and that is what I
> > do in the groff man pages, but I cannot say it is more than a matter
> > of taste.
> 
> I think it's cleaner with spaces, but it's clearly personal taste.

Sure.  Closing up spaces around em-dashes isn't quite _my_ preference,
either, in part because it can make *roff input a little uglier in some
edge cases (usually involving font alternation macros), prompting use of
the much-feared and mysterious `\c` escape sequence (or simple
resignation to subpar formatting, with the usual follow-up threats to
switch to Docbook or Markdown or whatever).

It has now been years since `\c` surprised me.  I _think_ I have it
documented adequately in groff 1.23.0.  I trust that someone will tell
me if I don't.

When I last surveyed the issue, the balance of authorities seemed to
disfavor spacing around em-dashes, I wanted consistency in the groff man
pages, and there were too many other issues where I sought revision or
reform and had more appetite for argument.  I'm an incorrigible
windmill-tilter, but I only have so many lances, you see...

Regards,
Branden

[0] I do perceive some gaps, like the absence of macros for "keeps" and
    for quotation; the latter is unreasonably hard to achieve
    attractively and portably at the same time.  Maybe some of these
    gaps will be filled in groff 1.24.  In compensation, the `SB` macro,
    a Sun extension that many people seem to believe came from Bell
    Labs, is now documented as deprecated in groff Git; I believe I
    solved the mystery of its origin and motivation.  It is unnecessary
    in modern implementations.

[1] The following may provoke laughter, a headache, or both.

    https://en.wikipedia.org/wiki/History_of_sentence_spacing#French_and_English_spacing

[2] The "Files" section of groff_man_style(7) illustrates how to tune
    this and other subjective parameters that man page authors
    misguidedly attempt to impose on others via their documents.  Note
    also the "Options" section.

[3] Not exactly true.  mandoc(1) takes a Henry Ford approach; you get
    the adjustment and hyphenation modes, inter-sentence spacing, and so
    forth that its maintainer thinks you should see, ignoring requests
    that would alter them.  If you don't like those defaults, tough![5]
    (This decision is understandable in context.)

[4] https://rhodesmill.org/brandon/2012/one-sentence-per-line/

[5] https://www.dourish.com/goodies/see-figure-1.html
