groff

Re: BOM can ruin your happy groffing experience


From: Oliver Corff
Subject: Re: BOM can ruin your happy groffing experience
Date: Tue, 21 Nov 2023 17:33:55 +0100
User-agent: Mozilla Thunderbird

Hi David,

Thank you for your insight!

You are right: currently I do not need Chinese in the translation; it is
all kept within .ig .. blocks.

However, there may well come a case where I have to quote single
Chinese words or phrases in my translation.

Best regards,

Oliver.


On 21/11/2023 17:26, Dave Kemper wrote:
On 11/21/23, Oliver Corff <oliver.corff@email.de> wrote:
> So the first line effectively was:
>
> <feff>.ig
>
> No wonder it did not work. Would it be meaningful to (optionally) tell
> groff to jump over or throw away BOMs it encounters at the beginning of
> a file? Or should sanity and awareness be left with the astute user?

The problem with this sensible idea is that groff expects input in the
ISO 8859-1 (a.k.a. Latin-1) encoding, and the bytes of a BOM (FE FF in
UTF-16BE, EF BB BF in UTF-8) are all valid Latin-1 characters (albeit
ones unlikely to appear as the first bytes of a Latin-1 document).
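
To illustrate why a BOM cannot be told apart from ordinary Latin-1
text, here is a minimal Python sketch (Python is used only for
illustration; nothing here is part of groff). It decodes the BOM bytes
as Latin-1 and shows that they come out as printable characters. The
variable names are made up for this example.

```python
import codecs

# U+FEFF serialized as a byte order mark in two encodings.
utf8_bom = codecs.BOM_UTF8          # b'\xef\xbb\xbf'
utf16be_bom = codecs.BOM_UTF16_BE   # b'\xfe\xff'

# Latin-1 assigns a character to every byte value 0x00-0xFF,
# so decoding any byte string as Latin-1 never fails.
utf8_bom_as_latin1 = utf8_bom.decode("latin-1")
utf16be_bom_as_latin1 = utf16be_bom.decode("latin-1")

print(repr(utf8_bom_as_latin1))
print(repr(utf16be_bom_as_latin1))
```

Read as Latin-1, the bytes are indistinguishable from text a document
could legitimately start with.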

Giving groff the -k option may act as an ersatz ignore-the-BOM option;
this will run the preconv preprocessor, which is BOM-aware, before
running groff itself.  But if your input is otherwise in Latin-1, this
won't work, because the BOM will make preconv decide the input is
UTF-8.  If your groff input is limited to ASCII, it'll be fine,
because in the ASCII range Latin-1 and UTF-8 look identical.  (The
Chinese characters being only inside .ig blocks, I'm presuming it
doesn't matter for your purposes how these are encoded when they hit
groff.)
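
The claim that Latin-1 and UTF-8 agree in the ASCII range can be
checked with a small Python sketch. The helper name
same_bytes_in_both is hypothetical, and the sample strings are
arbitrary; this says nothing about how preconv itself is implemented.

```python
def same_bytes_in_both(text: str) -> bool:
    """Return True if text encodes to identical bytes in Latin-1 and UTF-8.

    Only meaningful for characters up to U+00FF; anything beyond that
    cannot be encoded as Latin-1 at all and raises UnicodeEncodeError.
    """
    return text.encode("latin-1") == text.encode("utf-8")


ascii_line = ".ig"            # pure ASCII
accented_line = "caf\u00e9"   # contains U+00E9, outside ASCII

print(same_bytes_in_both(ascii_line))
print(same_bytes_in_both(accented_line))
```

Pure-ASCII input is byte-for-byte the same either way, whereas a
character such as U+00E9 is one byte in Latin-1 and two bytes in UTF-8.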

If groff itself had a command-line option specifically to tell it to
skip a leading BOM, that would still require you to know the BOM was
there to know the option was needed, which wouldn't have saved you the
hassle of debugging your problem.  And once you know the BOM is there,
you can create an alias that runs a simple sed (e.g., with GNU sed and
a UTF-8 BOM, sed '1s/^\xEF\xBB\xBF//') before running groff.
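
For systems where the sed escape syntax is not available, the same
filtering step could be sketched in Python. This assumes the file
carries a UTF-8 BOM (bytes EF BB BF); the function names
strip_leading_bom and copy_without_bom are invented for this example.

```python
import codecs
from typing import BinaryIO


def strip_leading_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def copy_without_bom(src: BinaryIO, dst: BinaryIO) -> None:
    """Copy src to dst in binary mode, dropping a leading UTF-8 BOM."""
    dst.write(strip_leading_bom(src.read()))
```

A small script could call copy_without_bom(sys.stdin.buffer,
sys.stdout.buffer) and have its output piped into groff. Only a BOM at
the very start of the input is removed; the same byte sequence
elsewhere is left alone.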

--
Dr. Oliver Corff
Wittelsbacherstr. 5A
10707 Berlin
GERMANY
Tel.: +49-30-85727260
mailto:oliver.corff@email.de



