Re: [BUG] italics run past where they should


From: G. Branden Robinson
Subject: Re: [BUG] italics run past where they should
Date: Thu, 21 Jul 2022 04:25:09 -0500

Hi Alex,

At 2022-07-20T16:58:34+0200, Alejandro Colomar wrote:
> I'm not sure if this is a groff(1) bug, or less(1), or who knows...

From your description I suspect a bug either in less(1) or your terminal
emulator.

> I've seen it sporadically, but when I tried to reproduce it, I didn't
> remember how I had triggered it, so I couldn't report it.  Now I can
> consistently reproduce it.

I _can't_ reproduce it.  I am using less 581.2 and XTerm #370 from
Debian bullseye.

> I'd expect the issue to be in less(1), because of how I trigger it,
> but it's weird, because I can't reproduce it with mandoc and less, so
> I attribute it to groff(1) for the moment.

Something to keep in mind is that grotty(1) (by default[1]) and
mandoc(1) take different approaches to terminal capabilities.  The
latter's maintainer, Ingo Schwarze, has on this mailing list declaimed a
distaste for ISO 6429 (a.k.a. ECMA-48) escape sequences, so mandoc(1)
produces bold and italics (actually, underlining) by overstriking, i.e.,
including backspace literals in its output.  VT100-ish terminal
emulators honor these but don't _interpret_ them--they do what is
commanded quite literally, destructively backspacing and replacing
character cell contents, with the result that neither bold nor
underlined characters appear as such.  So this styling information
disappears.
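
To make the overstriking convention concrete, here is a minimal Python
sketch.  It assumes the traditional encodings (character, backspace,
same character for bold; underscore, backspace, character for
underlining).  The function names are made up for illustration, and
naive_terminal() is only a toy model of destructive backspacing, not a
description of any real terminal emulator.

```python
BS = "\b"  # the backspace literal that overstriking relies on


def overstrike_bold(text: str) -> str:
    """Encode bold as: character, backspace, the same character again."""
    return "".join(f"{c}{BS}{c}" for c in text)


def overstrike_underline(text: str) -> str:
    """Encode underlining as: underscore, backspace, the character."""
    return "".join(f"_{BS}{c}" for c in text)


def naive_terminal(stream: str) -> str:
    """Toy model of destructive backspacing: the last glyph written to a
    character cell replaces whatever was there before."""
    cells = []
    col = 0
    for c in stream:
        if c == BS:
            col = max(col - 1, 0)
        else:
            if col < len(cells):
                cells[col] = c
            else:
                cells.append(c)
            col += 1
    return "".join(cells)
```

Feeding either encoding of "flags" through naive_terminal() returns
plain "flags": the glyphs survive, but nothing records that they were
meant to be bold or underlined.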

The less(1) program interprets these sequences _and translates them into
ECMA-48 escape sequences_, recovering the "graphic renditions" from the
input stream, as the standard would put it.
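
A rough Python sketch of that kind of translation follows.  It is not
less's actual algorithm, which handles many more cases; it only assumes
the two overstrike patterns described above and the ECMA-48 SGR
parameters 1/22 (bold on/off) and 4/24 (underline on/off).  The function
name is hypothetical.

```python
ESC = "\x1b"
# (switch on, switch off) SGR sequences for each rendition
SGR = {
    "bold": (f"{ESC}[1m", f"{ESC}[22m"),
    "underline": (f"{ESC}[4m", f"{ESC}[24m"),
}


def overstrike_to_sgr(stream: str) -> str:
    """Replace 'c BS c' with bold and '_ BS c' with underlined text,
    emitting an SGR sequence only when the rendition changes."""
    out = []
    active = None  # rendition currently switched on, if any
    i, n = 0, len(stream)
    while i < n:
        style, ch, step = None, stream[i], 1
        if i + 2 < n and stream[i + 1] == "\b":
            first, second = stream[i], stream[i + 2]
            if first == "_" and second != "_":
                style, ch, step = "underline", second, 3
            elif first == second:
                style, ch, step = "bold", second, 3
            # any other overstrike pair is passed through untouched
        if style != active:
            if active is not None:
                out.append(SGR[active][1])
            if style is not None:
                out.append(SGR[style][0])
            active = style
        out.append(ch)
        i += step
    if active is not None:
        out.append(SGR[active][1])  # never leave a rendition switched on
    return "".join(out)
```

For instance, the input "_\bf_\bl plain" comes out as ESC[4m, "fl",
ESC[24m, then " plain".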

less(1) also, however, _refuses_ by default to interpret those same
escape sequences, which it happily produces, when they occur on its
input stream.  Some people, like Ingo, claim this to be advantageous for
security reasons; I don't know if Mark Nudelman himself does.  I am
dubious.

Because people almost always view man pages on the terminal via a pager
program, they come to confuse terminal capabilities with pager
capabilities, and thus they quite wrongly insist that grotty(1) is
incorrect to emit ECMA-48 escape sequences even though every (practically
every?) terminal emulator in use on *nix systems honors the basic subset
of them that encodes renditions for bold and underlining.

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=312935

(This observation withstands even the fact that some terminal emulators
can't _render_ such styles; the Linux console driver on my system, for
example, won't show you bold, underscored, or bold underscored text, but
clearly _recognizes_ them because it renders each in a different color.)

> So, to trigger the bug, do the following potion:
> 
> # I reproduced it in clone(2), installed from my git tree,
> # but I also reproduced it with the clone(2) from my system,
> # which happens to be manpages-dev 5.13-1, so it should be easily
> # reproducible.
> 
> $ man clone
> 
> # Now have a look at the synopsis.
> # You'll notice (or actually not notice) no weird formatting,
> # because there isn't
> 
> # Then, within less(1), search for 'flags' with /flags
> 
> /flags
> 
> # This should change the underscoring of some words after the match.

groff does not re-render the page because you did a search.  groff and
grotty have exited (or blocked waiting to write to a pipe) by the time
the pager runs.

So my suspicion is that some state has gotten desynchronized in less's
idea of the screen contents, or in your terminal emulator's.

> # If you close and open again the man page,
> # you'll see the good formatting again.

This is consistent with my hypothesis.

> # If I run the following command, then I can't reproduce it,
> # which is why I suspect that it's a problem in groff(1).
> 
> $ man -w clone | xargs mandoc | less
> /flags

This, too, is consistent with my hypothesis.  To try to verify it, you
might re-render the page using groff (via man(1) is fine) with
GROFF_NO_SGR=1 in your environment.

If doing so makes the bug similarly go away for groff, then you know
that the problem is with the generation of ECMA-48 escape sequences by
grotty(1), or with the maintenance of screen buffer state by less(1) or
your terminal emulator.
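
The comparison can also be scripted.  The Python sketch below is only an
illustration under stated assumptions: the helper names are invented,
and render_man_page() assumes that man(1) writes the formatted page
straight to a pipe without starting a pager.  Only the GROFF_NO_SGR
variable itself comes from the discussion above.

```python
import os
import subprocess


def env_with_sgr_disabled(base=None):
    """Copy an environment and set GROFF_NO_SGR=1 in the copy."""
    env = dict(os.environ if base is None else base)
    env["GROFF_NO_SGR"] = "1"
    return env


def render_man_page(topic: str, no_sgr: bool) -> bytes:
    """Capture the bytes man(1) produces for a page (assumes no pager
    is started when standard output is a pipe)."""
    env = env_with_sgr_disabled() if no_sgr else dict(os.environ)
    result = subprocess.run(
        ["man", topic], env=env, capture_output=True, check=True
    )
    return result.stdout


def contains_sgr(data: bytes) -> bool:
    """True if the data contains ESC [ (a control sequence introducer)."""
    return b"\x1b[" in data
```

Calling contains_sgr(render_man_page("clone", no_sgr=True)) and the same
with no_sgr=False shows which kind of output the pager is being fed in
each case.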

I feel that the former is unlikely because I use grotty(1) many times a
day and it is quite simple in its production of ECMA-48 escape
sequences; it doesn't test the value of $TERM, terminal capabilities, or
anything like that.  It fires blindly, which can be criticized, but for
this scenario has the virtue of telling us that if it did have this sort
of problem, many other people would see many more defects in its output
constantly.

I do not rule out a defect in grotty's ECMA-48 sequence production; I
simply think such is an unlikely explanation for the problem you
describe.  If there is such a defect in grotty, it is more simply
established by examining a hex dump of the program's output.  ECMA-48
escape sequences are recondite but decipherable[2].  If there is such a
bug, I am strongly motivated to fix it.
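
As an aid to reading such a dump, here is a small Python sketch that
lists the SGR sequences in a byte string and reports which renditions
are still switched on at the end.  It covers only a handful of ECMA-48
SGR parameters, and the names list_sgr() and renditions_left_on() are
invented for this example; they belong to no existing tool.

```python
import re

# ESC [ <parameters> m is the SGR (Select Graphic Rendition) sequence.
SGR_RE = re.compile(rb"\x1b\[([0-9;]*)m")

MEANINGS = {
    0: "reset all renditions",
    1: "bold on",
    3: "italic on",
    4: "underline on",
    22: "normal intensity (bold off)",
    23: "italic off",
    24: "underline off",
}
TURNS_OFF = {22: 1, 23: 3, 24: 4}  # "off" parameter -> the "on" it cancels


def list_sgr(data: bytes):
    """Return (byte offset, parameters, description) per SGR sequence."""
    found = []
    for m in SGR_RE.finditer(data):
        raw = m.group(1).decode("ascii")
        # An empty parameter means the default, 0.
        params = [int(p) if p else 0 for p in raw.split(";")]
        desc = ", ".join(MEANINGS.get(p, f"parameter {p}") for p in params)
        found.append((m.start(), params, desc))
    return found


def renditions_left_on(data: bytes) -> set:
    """Track bold/italic/underline through the data and return the set
    of 'on' parameters that were never cancelled."""
    state = set()
    for _, params, _ in list_sgr(data):
        for p in params:
            if p == 0:
                state.clear()
            elif p in (1, 3, 4):
                state.add(p)
            elif p in TURNS_OFF:
                state.discard(TURNS_OFF[p])
    return state
```

A non-empty result from renditions_left_on() on a page's full output
would point at the producer; an empty one suggests looking at the pager
or the terminal emulator instead.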

Regards,
Branden

[1] Debian switches this default around and (IMO, confusingly) adds
    another environment variable for it, GROFF_SGR.  (When I
    corresponded with Gavin Smith of Texinfo about similar issues, my
    ignorance of this downstream change made him think me quite stupid
    :-O ).  I have my system configured to restore the upstream default.
    Among other things, doing so enables me to view man pages in the
    terminal with a true italic style (well, oblique at any rate) by
    passing grotty "-P -i".

[2] https://www.ecma-international.org/wp-content/uploads/ECMA-48_5th_edition_june_1991.pdf
