bug-groff

[bug #63074] [troff] support construction of arbitrary byte sequences in


From: Deri James
Subject: [bug #63074] [troff] support construction of arbitrary byte sequences in device control commands
Date: Tue, 9 Jan 2024 16:34:14 -0500 (EST)

Follow-up Comment #22, bug#63074 (group groff):

Whew, rather a lot to cover!

First, the original "bug" was "fixed" by including -f U-T in the command.

Next it became a wish to include non-Latin characters in the bookmarks. This is
now working on my branch, waiting for Branden's integration.

Then it became a discussion of Branden's "for" iterator being used as a
replacement for stringhex, of using it to send arbitrary bytes in device
control commands, and of his recent discovery that you can already do this. My
statement in 2022 (see comment #11) was:-

"If I dropped the .asciify from pdf.tmac it would mean all the \[uXXXX]
strings would reach the post processor gropdf, which could then assemble a
UTF-16 string from the hex numbers."

That is exactly what I have done in the new pdf.tmac/gropdf.
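
To make the idea concrete, here is a minimal sketch of how a postprocessor
could assemble a UTF-16 string from \[uXXXX] escapes. It is written in Python
purely for illustration and is not the gropdf code (gropdf is written in
Perl); the function names are hypothetical. It assumes each escape carries 4
to 6 hex digits per code point, optionally joined by underscores, and it emits
a PDF hex string with the UTF-16BE byte order mark.

```python
import re

# Matches \[uXXXX] and composite forms such as \[uXXXX_YYYY].
# Assumption: 4 to 6 hex digits per code point.
_UNI_ESCAPE = re.compile(r"\\\[u([0-9A-Fa-f]{4,6}(?:_[0-9A-Fa-f]{4,6})*)\]")


def decode_groff_unicode(text: str) -> str:
    """Replace every \\[uXXXX] escape with the character(s) it names."""

    def _replace(match: re.Match) -> str:
        return "".join(chr(int(cp, 16)) for cp in match.group(1).split("_"))

    return _UNI_ESCAPE.sub(_replace, text)


def to_pdf_utf16_hex(text: str) -> str:
    """Return a PDF hex string: UTF-16BE with a leading byte order mark."""
    decoded = decode_groff_unicode(text)
    return "<" + ("\ufeff" + decoded).encode("utf-16-be").hex().upper() + ">"
```

Text outside the escapes passes through unchanged, so a bookmark that mixes
plain ASCII and \[uXXXX] escapes is handled by the same call.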

I think Branden has not fully grasped the reason why stringhex is required.
The problem lies in the original pdfmark API. If you look at pdfmark.pdf, you
will see that the sections describing .pdfhref M and .pdfhref L both refer to
a "dest-name" and "descriptive text", and say that if a dest-name is not given,
the first word in the description is used as the dest-name.

The macros create a string like:-

.ds pdf:look(\\*[dest-name]) descriptive text
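
The naming rule can be sketched as follows. This is a Python illustration
only, not the macro code; the function name is hypothetical, and it assumes
that "first word" means the first whitespace-delimited token of a non-empty
description.

```python
def lookup_string_name(description: str, dest_name: str = "") -> str:
    """Build the name of the lookup string for a destination.

    Assumption: when no dest-name is given, the first whitespace-delimited
    word of the (non-empty) description is used in its place.
    """
    name = dest_name or description.split()[0]
    return f"pdf:look({name})"
```

Note that nothing here filters the first word, so whatever escapes it contains
end up inside the string name.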

Since descriptive text can include any groff escape, this means that dest-name
may also include any groff escape occurring in the first word. The reason it
creates these string registers is to support mom features such as:-

.HEADING 1 NAMED Гуляйпольщина "Гуляйпольщина"
Гуляйпольщина (укр. Гуляйпольщина) или
Махновщина, также Вольная
Территория — повстанческий район в
Северном Приазовье в период
Гражданской войны 1918—1921 гг.
.PP
And so it goes on.
.PDF_LINK Гуляйпольщина PREFIX ( SUFFIX ) "see: +"

Here the "+" is replaced by the contents of the string register
pdf:look(Гуляйпольщина), which would actually be a string of
\[uXXXX] nodes, and so would generate an error. This is what stringhex is for:
to hide the contents so that groff does not see them as a sequence of nodes.
The ideal solution would be to allow string registers to have an attribute
(say "glass") which signals that groff should never try to interpret their
contents, i.e. operate as if the escape mechanism were turned off just for the
contents of that register. There would also need to be a way of turning that
attribute on and off, or an escape which sets the attribute for the enclosed
string.
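
The hiding idea can be illustrated with a short Python sketch. This is not the
stringhex request itself, and the encoding the request actually uses may
differ; the function names are hypothetical, and UTF-8 bytes written as hex
digits are an assumption chosen for simplicity. The point is only that the
hidden form contains nothing but hex digits, so no escape character is left
for groff to interpret.

```python
def hide_in_hex(text: str) -> str:
    """Encode text as upper-case hex digits (assumed: hex of UTF-8 bytes)."""
    return text.encode("utf-8").hex().upper()


def reveal_from_hex(hex_text: str) -> str:
    """Invert hide_in_hex, recovering the original text."""
    return bytes.fromhex(hex_text).decode("utf-8")
```

Because the round trip is lossless, the original text, escapes included, can
be recovered at the point where it is safe to interpret it.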

I don't know if this is helpful, but I hope it explains why stringhex is being
used.


    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?63074>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/



