On synopsis grammar (was: Spaces in synopses of commands)
From: G. Branden Robinson
Subject: On synopsis grammar (was: Spaces in synopses of commands)
Date: Mon, 31 Jul 2023 05:58:52 -0500
[adding groff list so that more people can argue with me, since I once
again found a soapbox to mount]
At 2023-07-30T18:14:53+0200, Alejandro Colomar wrote:
> On 2023-07-30 18:13, G. Branden Robinson wrote:
> > I think this is a matter of achieving an accurate and unambiguous
> > synopsis grammar.
>
> Thanks; that kind of objective reasoning is what I wanted. Would you
> mind stating it in the commit message for posterity? :-)
I think I'll add it to the explanation of the example synopsis in
groff_man_style(7), too. ;-)
While I'd love for synopsis grammar to be _fully_ unambiguous, one
unfortunate case did arise in discussion with mandoc maintainer Ingo
Schwarze on the groff mailing list in the past year or two.
Consider:
foocmd [-abort] file ...
Is this a command that takes up to 5 different options -a, -b, -o, -r,
-t, or a command that takes one option called "abort"?
A program in the BSD tradition might suggest one answer and a program in
the X11 tradition another. I assume that this is not a new observation,
and is why the GNU project introduced (or adopted from some
now-forgotten progenitor) the double-dash long-option-name convention.
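Both readings of "-abort" can be reproduced with parsers from the Python standard library. This is only an illustrative sketch: "foocmd" and its options are the imaginary ones from the synopsis above, getopt stands in for the BSD-style reading, and argparse (which accepts single-dash long option names) stands in for the X11-style reading.

```python
import argparse
import getopt

# The argument vector a user might type after "foocmd".
argv = ["-abort", "file"]

# Reading 1: a cluster of five single-letter flags, as getopt(3) would
# parse it given the option string "abort".
opts, operands = getopt.getopt(argv, "abort")
cluster = [flag for flag, _ in opts]

# Reading 2: a single option whose name is "abort", spelled with one dash.
parser = argparse.ArgumentParser(prog="foocmd")
parser.add_argument("-abort", action="store_true")
parser.add_argument("file", nargs="+")
ns = parser.parse_args(argv)

print(cluster, operands)   # five flags, then the operand list
print(ns.abort, ns.file)   # one Boolean option, then the operand list
```

The same command line is accepted by both parsers, so the synopsis alone cannot tell a reader which program they are dealing with.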
While we could eliminate the ambiguity by insisting upon a practice of
setting each short option in its own set of optional-argument brackets,
that would come at a significant cost in visual clutter.
Consider the groff(1) command, already ornamented richly with options.
groff [-abcCeEgGijklNpRsStUVXzZ] [-d ctext] [-d string=text]
[-D fallback‐encoding] [-f font‐family] [-F font‐directory]
[-I inclusion‐directory] [-K input‐encoding]
[-L spooler‐argument] [-m macro‐package] [-M macro‐directory]
[-n page‐number] [-o page‐list] [-P postprocessor‐argument]
[-r cnumeric‐expression] [-r register=numeric‐expression]
[-T output‐device] [-w warning‐category] [-W warning‐category]
[file ...]
In a quest for zero ambiguity, we might say:
groff [-a] [-b] [-c] [-C] [-e] [-E] [-g] [-G] [-i] [-j] [-k] [-l]
[-N] [-p] [-R] [-s] [-S] [-t] [-U] [-V] [-X] [-z] [-Z]
[-d ctext] [-d string=text] [-D fallback‐encoding]
[-f font‐family] [-F font‐directory] [-I inclusion‐directory]
[-K input‐encoding] [-L spooler‐argument] [-m macro‐package]
[-M macro‐directory] [-n page‐number] [-o page‐list]
[-P postprocessor‐argument] [-r cnumeric‐expression]
[-r register=numeric‐expression] [-T output‐device]
[-w warning‐category] [-W warning‐category] [file ...]
And with that done, we might as well lexicographically order all the
options.
groff [-a] [-b] [-c] [-C] [-d ctext] [-d string=text]
[-D fallback‐encoding] [-e] [-E] [-f font‐family]
[-F font‐directory] [-g] [-G] [-i] [-I inclusion‐directory]
[-j] [-k] [-K input‐encoding] [-l] [-L spooler‐argument]
[-m macro‐package] [-M macro‐directory] [-n page‐number] [-N]
[-o page‐list] [-p] [-P postprocessor‐argument]
[-r cnumeric‐expression] [-r register=numeric‐expression]
[-R] [-s] [-S] [-t] [-T output‐device] [-U] [-V]
[-w warning‐category] [-W warning‐category] [-X] [-z] [-Z]
[file ...]
...but that doesn't seem like an improvement to me. Options that don't
take arguments are typically of Boolean sense. (Occasionally, as with
some applications of '-v', they model an incrementation operation of
some kind.) "Argumentful" options require further decision-making from
the user and it thus seems useful, to me, to segregate the two
categories. Some traditions evolve for good reasons. :)
As an aside, one might wonder why the groff(1) page uses such long
metasyntactic variable names in 1.23.0 when it did not in 1.22.4. After
years of working on groff's ~60 man pages, I came to adopt a handful of
principles.
1. A command should always offer a usage message via '--help',
presenting a (plain text) synopsis much like the above.
2. That synopsis, and the one in the corresponding man page, should
match.
3. A _usage_ message should be _useful_.
$ foo --barblegarg
foo: error: unrecognized option 'barblegarg'
foo: usage: foo [options] [files]
is so un-useful as to be user-hostile. A programmer who writes this
should be frank about their contempt for the user and drop such
"usage advice" entirely.[1]
Consider the novice user of groff. They might wonder, "is lowercase
'm' the flag letter for the macro package name and '-M' the one to
add a macro search directory, or the other way around?" Output like
I presented for it above answers such a question.
4. A usage message should not dump an _explanation_ of all options. A
person accustomed to the Unix command line philosophy of "no news is
good news" will rightly be dismayed when a command invocation they
expect to perform some task quietly and return to the shell prompt
instead spews a gout of text to the terminal. If many options are
supported, and/or their explanations demand much space to present,
the _actual problem_ with the command can easily scroll away. Yes,
maybe everybody has terminals with scrollback buffers these days, but
it's still rude. When something has gone wrong, a user's immediate
response should not be to pound on the keyboard some more, but to
pause, take a breath if necessary, and gather useful information
from the screen. If our "helpful" command hasn't left the most
important information _on_ the screen, that's harder to do.
5. It's okay to present a lengthy usage message, with much detail, if
a user explicitly requests "--help". But because lengthy runs of
text can get out of sync, I prefer to maintain such things in one
place--the command's man page.
6. Ideally, you'd store things like metasyntactic variable names for
command-line options in a data structure inside the command's
sources, and a mechanism, possibly an environment variable or an
otherwise "maintainer mode" command-line option, would dump a
well-formed synopsis in man(7) format[2] using this information to
the standard output. As part of package build, one could then apply
this output to a templated man page document to produce the shipping
page.
I first had this idea something like 25 years ago and I'm sure many
other people have, too, it being such an obvious application of the
DRY principle. I can only guess that it didn't happen because
getopt_long() is a GNU thing; GNU people (okay, let's be precise:
GNU Emacs people), historically, have held man pages beneath
contempt; and nobody else had both the traction and desire to get it
done. (Engineers paid to work on or adjacent to the Linux kernel
seem always to have struggled, either with themselves or their
managers, to justify expending more than a minimal effort on
documentation of any sort. Thus did both sides of GNU/Linux's white
picket fence become brownfields.)
7. One place we _don't_ need information-rich metasyntactic variable
names is where we're going to spend a lot of words explaining them
anyway. So after over-applying a principle of militant synchrony,
I found that "Options" sections of man pages[3] could get by
pedagogically just as well with short ones; they were easier to cope
with typographically as well, improving the regularity of formatting
(which is helpful to the reader, visually) and reducing the need for
*roff stunts in man page sources to achieve consistent indentation
in a series of tagged paragraphs.
Consider again groff(1) options. Here's the synopsis/usage message
again, abbreviated.
groff [-K input‐encoding] [-L spooler‐argument] [-T output‐device]
And here's the corresponding material from its man page's "Options"
section.
-K enc Set input encoding used by preconv(1) to enc; implies -k.
-L arg Pass arg to the print spooler program. If multiple args
are required, pass each with a separate -L option. groff
does not prefix an option dash to arg before passing it to
the spooler program.
-T dev Direct troff to format the input for the output device
dev. groff then calls an output driver to convert troff’s
output to a form appropriate for dev; see subsection
“Output devices” below.
(I haven't forgotten that you prefer two spaces between a tag and
the body of a tagged paragraph. I agree that it would look better.
I still intend to add a tunable parameter for that [defaulting to
2n], probably around the same time I do so for the base paragraph
indentation amount.)
We don't need a metasyntactic variable name as long as your arm when
explaining fully in adjacent text what the parameter means. At the
same time, replacing all such names in the foregoing example with
just "x" would be laconic to excess.
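Principle 6 can be sketched in a few lines. What follows is a hypothetical illustration, not anything groff does today: the Option class, the OPTIONS table, the imaginary command "foo", and both function names are assumptions made for the example. One table drives both the plain-text usage synopsis and a man(7) rendering that uses the SY and YS synopsis macros from groff's man package together with the standard RB, IR, and RI font-alternation macros. Argumentless flags are clustered and "argumentful" options listed separately, following the convention argued for earlier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Option:
    flag: str                      # option letter, without the dash
    metavar: Optional[str] = None  # metasyntactic variable name, if any


# Hypothetical option table for an imaginary command "foo".
OPTIONS = [
    Option("a"),
    Option("b"),
    Option("m", "macro-package"),
    Option("M", "macro-directory"),
]


def usage_synopsis(command, options, operand="file"):
    """Return a one-line plain-text synopsis for a usage message."""
    cluster = "".join(o.flag for o in options if o.metavar is None)
    parts = [command]
    if cluster:
        parts.append(f"[-{cluster}]")
    parts += [f"[-{o.flag} {o.metavar}]" for o in options if o.metavar]
    parts.append(f"[{operand} ...]")
    return " ".join(parts)


def man_synopsis(command, options, operand="file"):
    """Return the same synopsis as man(7) source lines."""
    cluster = "".join(o.flag for o in options if o.metavar is None)
    lines = [f".SY {command}"]
    if cluster:
        lines.append(rf".RB [ \-{cluster} ]")
    for o in options:
        if o.metavar:
            # \~ is an unbreakable space; \c continues onto the next line.
            lines.append(rf".RB [ \-{o.flag}\~\c")
            lines.append(f".IR {o.metavar} ]")
    lines.append(rf".RI [ {operand}\~ .\|.\|.]")
    lines.append(".YS")
    return "\n".join(lines)


print(usage_synopsis("foo", OPTIONS))
print(man_synopsis("foo", OPTIONS))
```

Because both renderings come from the same table, the usage message and the man page synopsis cannot drift apart (principle 2), and a build step could paste the second form into a templated page.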
The UNIX-HATERS Handbook is due for a second edition, isn't it? ;-)
Regards,
Branden
[1] "Use the source, Luke," a.k.a. "see Figure 1."
[2] or some JSONic thing easily transformed into man(7) or another
desired format
[3] Full disclosure: mandoc maintainer and mdoc(7) advocate Ingo
Schwarze opposes the existence of "Options" sections in man pages.
https://lists.gnu.org/archive/html/groff/2018-11/msg00031.html