bug-groff

[bug #45502] [troff] .if, .ie, .el parsing incompatible with Unix V7, DW


From: G. Branden Robinson
Subject: [bug #45502] [troff] .if, .ie, .el parsing incompatible with Unix V7, DWB, and Heirloom Doctools troff
Date: Fri, 5 Apr 2024 19:24:17 -0400 (EDT)

Follow-up Comment #16, bug #45502 (group groff):

[comment #15:]
> [comment #12:]
> > Wow, it's actually GNU _troff_ that aggressively reads through
> > the newline.
> 
> That _was_ Carsten's original complaint.

Sometimes I don't evaluate the truth value of a proposition until I've
inspected the machine that interprets it.  😅

> I reiterate my question of comment #2: "Does strictly enforcing the V7 Unix
> troff syntax offer any compatibility benefit?  That is, are there correctly
> formed historical constructions that would be parsed incorrectly under groff
> as a result of this change?"

We obviously have several *models* thereof in this ticket's history.

Whether/how those correspond to *roff documents written in anger in the past
50 years, I'm sorry I cannot say.
 
> I realize this would be parsed _differently_:
> 
> .if 0
> A
> 
> But you have to squint pretty hard to see this as a "correctly formed
> historical construction": although AT&T troff _allowed_ an empty .if
> predicate,

There is a terminological hazard here.  You are using "predicate" in the
sentential sense; I am using it in the (formally) logical sense.

What is empty in your example above, in the terminology I am employing, is the
"branch".  And, strictly, even that is not empty under AT&T _troff_ grammar as
implemented.  The branch, which is never taken due to the false _predicate_,
is a bare newline.

But, yes, the interpretation of the above differs between AT&T and GNU
_troff_s to date.

> CSTR#54 section 16 does not specify this as legal syntax,

It does say:


Any spaces between the condition and the beginning of anything are skipped
over.


Without too much of a head tilt, I can interpret this as implying that such
spaces can be omitted in the first place.  The traditional parsing then shakes
out.
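
As a small illustration of that sentence from CSTR #54 (my own sketch, not an
example taken from the manual or from this ticket), the two conditionals below
differ only in the number of spaces before the branch and should format the
same text:

```roff
.\" Spaces between the condition and the branch are skipped,
.\" so both lines format "Hello." as ordinary text.
.if 1 Hello.
.if 1     Hello.
```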

> and it has no practical application.

I disagree here too.  It's possible someone might want to conditionally apply
a word break.


.nr AP 1 \" use AP Style Guide recommendations
.\" ...
Be sure to check your voice\c
.if \n(AP
mail.


There are indeed more straightforward ways to skin that particular cat.  But
the foregoing is not insane as I read CSTR #54.
 
> If this ever appears in any code intended for AT&T troff, it's probably the
> result of a coder who began writing a conditional then got distracted by a
> squirrel.

Probably.  I think in practice what GNU _troff_ users have likely done is add
brace escape sequences until (1) they got the desired output and (2) the
formatter quit mewling at them with "unbalanced `el` request" diagnostics.

> In the GNU age, on the other hand, coders might have written the above
> deliberately, noticing that it worked despite not being strictly documented.
> And it's worked for at least two decades, and possibly all the way back to the
> Clarkian era.
> 
> So it seems to me this proposal breaks back GNU compatibility to achieve
> fealty to an AT&T construction that offered no real-world application.  Might
> a better solution be to document the difference as a GNU syntactical
> extension?

Now that I've gone through the logic and measured the behavior of multiple
implementations, I think GNU _troff_'s historical behavior here is an ugly and
_unintended_ syntactical extension.

I would point you to the following source comment from Clark.

https://git.savannah.gnu.org/cgit/groff.git/tree/troff/input.c?h=1.02#n3746

Since he intended to admit the input


.if 0\{


(observe the absence of a trailing backslash escaping the newline)

I find it implausible that he didn't mean to handle the same input _without_
the brace escape sequence.

He simply didn't write enough test cases for his formatter, saith I.

I think matching AT&T _troff_ behavior here produces a more consistent
grammar.  Yes, it will be less comfortable for people who read all languages
in the expectation that they are C.

The branch portion of a control flow request in *roff is read _as if it were
on an input line by itself_.
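
A minimal sketch of what that rule implies, assuming the AT&T reading argued
for above; the input lines are illustrative and not drawn from any real
document:

```roff
.\" The branch is a control line: ".br" is run as a request.
.if 1 .br
.\" The branch is a text line: "Hello." is formatted as text.
.if 1 Hello.
.\" The branch is the bare newline that ends the ".if" line.  Under
.\" the AT&T reading, "A" is therefore ordinary text outside the
.\" conditional and is formatted whatever the predicate's value.
.if 0
A
```

As I understand this thread, GNU _troff_ to date instead reads through the
newline in the last case, takes "A" as the branch, and so skips it.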


    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?45502>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/



