Re: pdfmark: need a method to sanitize text in document outlines
From: G. Branden Robinson
Subject: Re: pdfmark: need a method to sanitize text in document outlines
Date: Wed, 4 Aug 2021 16:37:45 +1000
User-agent: NeoMutt/20180716
Hi, Keith!
At 2021-08-02T16:30:47+0100, Keith Marshall wrote:
> I don't recall noticing this before, (probably an oversight on my
> part), but if I regenerate pdfmark.pdf today, from pdfmark.ms, I see
> several document outline entries similar to:
>
> The F[C]pdfmarkF[] Operator
>
> In this, the (unwanted) F[C] and F[] appear to be artefacts from the
> likes of:
>
> .NH 2
> .XN The \F[C]pdfmark\F[] Operator
>
> where XN is a locally defined macro which emits its entire argument
> list as the text for the numbered heading, while also constructing a
> table of contents entry, and a document outline entry, from the same
> arguments.
>
> Clearly, the formatting escape sequences need to be filtered out of (a
> copy of) the argument list, before passing it to the pdfbookmark
> macro.
[...]
I don't have a solution for you, but I think I do have a similar problem.
I wanted to implement the missing SG ms(7) macro (available in V7 Unix
troff but not groff), but got stuck because authors are stored in a
diversion instead of being passed as macro arguments. That means they
can do all kinds of complicated things. So if you try to pop them again
for .SG purposes they'll come out _exactly_ as the diversion had
prepared them, complete with unhelpful,
possibly-redundant-in-their-original-context centering requests and so
forth.
Probably there's nothing that can be done about this in ms(7) for
historical reasons; I suppose the advice would be "don't do that with
your authors, then!" if you want to use the SG macro. (Or just don't
use SG, which seems to suit most groff ms users, given the paucity of
documented complaints about its absence.)
Maybe what we need is something stronger than 'asciify' (a regrettable
name in the age of Unicode)--maybe 'stringify': something that will
discard everything from a stored string or diversion that isn't an
ordinary or special character. But we'll still, I think, face a problem
our Texinfo manual warns about.

    @code{asciify} cannot return all items in a diversion back to
    their source equivalent; nodes such as those produced by the
    @code{\N} escape will remain nodes, so the result cannot be
    guaranteed to be a pure string.  @xref{Copy Mode}.

(Hmm--that semicolon should be a colon.)
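
To make the "stringify" idea a bit more concrete, here is a rough sketch
in Python rather than troff, purely as an illustration. The function
name stringify and everything else in it are hypothetical; nothing like
this exists in groff. It assumes the text is still available as
unformatted source (as with Keith's XN arguments), and it strips only
the \f, \F, \m, and \M escapes in their one-character, \X(xx, and
\X[...] forms. Size changes, escaped backslashes, and anything already
formatted into a diversion are out of scope, which is exactly the hard
case the manual warns about.

```python
import re

# Hypothetical helper, not part of groff: remove font, family, and
# color selection escapes from unformatted troff source text.
_FORMATTING_ESCAPE = re.compile(
    r"""\\                # the escape character
        [fFmM]            # font, family, stroke color, fill color
        (?: \[[^\]]*\]    # long form:   \f[name], \F[]
          | \(..          # two-char:    \f(CW
          | .             # one-char:    \fB
        )
    """,
    re.VERBOSE,
)


def stringify(text: str) -> str:
    """Return text with the escapes above removed.

    Ordinary characters and special characters such as \\[em] or
    \\(em are left untouched.
    """
    return _FORMATTING_ESCAPE.sub("", text)


if __name__ == "__main__":
    print(stringify(r"The \F[C]pdfmark\F[] Operator"))
```

Applied to the heading from Keith's example, this leaves only the text
one would want in the outline entry. A real solution would have to live
in the formatter or the macro package, and would need to cope with far
more than these four escapes.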
I don't feel I understand these issues completely, so I might be off
base.
Regards,
Branden