Re: [Groff] Broken chars
From: Werner LEMBERG
Subject: Re: [Groff] Broken chars
Date: Sun, 20 Mar 2005 08:48:33 +0100 (CET)
> I have observed that groff changes certain characters in manpages.
> [...]
>
> For example, in perlcheat.1, the $| is changed into '$' and a
> vertical bar if the locale is UTF-8.
>
> real: | (U+0x7C)
> output: │ (U+0x2502)
>
> This also happens to a number of other characters, including the backward
> apostrophe (accent grave) ` (0x60) which is transformed into ‘
> (U+0x2018). This is very bad for copy+paste, and if your screen
> font does not have all the UTF8 characters (especially the case on
> bare 80x25 tty1 terminal), it does not even show any apostrophe, but
> a block to indicate that 0x2018 is not available in this font.
>
> Is this a (big) bug in groff, or intention?
In
/usr/[local/]share/groff/<version>/font/devutf8/R
you can see which output codes are used for which input characters.
Looking into perlcheat.1, you can find this (converted on my platform
with Pod::Man v 1.37):
.tr \(*W-|\(bv\*(Tr
The .tr request translates characters. In this particular case, it
translates `|' to `\(bv'. `bv' is equivalent to `braceex' in PS
output, and is by default mapped to U+23AA. I have no idea why you
get U+2502 instead. And I have no idea why Pod::Man uses `bv' at all.
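
As a minimal sketch of what such a translation does (the two text lines are made up for illustration and are not taken from perlcheat.1):

```roff
.\" Translate `|' to the `bv' glyph, as the Pod::Man line above does.
.tr |\(bv
After this request, $| is printed with the bv glyph.
.br
.\" Map `|' back to itself to undo the translation.
.tr ||
Now $| is printed with a plain vertical bar again.
```

Formatting this with `groff -Tutf8' should make the difference between the two lines visible.
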
Regarding the grave accent mapped to U+0x2018, here is the comment
from groff_char(7):
` the ISO Latin-1 `Grave Accent' (code 96) prints as <U+2018>,
a left single quotation mark; the original character can be
obtained with `\`'.
' the ISO Latin-1 `Apostrophe' (code 39) prints as <U+2019>, a
right single quotation mark; the original character can be
obtained with `\(aq'.
For typesetting this is the right choice, since those two characters
are normally used this way, as in TeX.
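
A short sketch of the difference in a groff input file, based on the groff_char(7) text quoted above (the sample text is an assumption chosen for illustration):

```roff
.\" Typographic quotes: on the utf8 device these print as U+2018 and U+2019.
`quoted text'
.br
.\" Literal ASCII characters, e.g. for shell code meant for copy and paste:
.\" \` gives U+0060 and \(aq gives U+0027.
echo \`date\` \(aqsingle quoted\(aq
```
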
Distributions can override this. For example, in my SuSE 9.1, I have
this in /usr/share/groff/site-tmac/tmac.andocdb:
.if '\*[.T]'utf8' \{\
. char \- \N'45'
. char - \N'45'
. char ' \N'39'
.\}
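
By analogy, the grave accent could probably be handled the same way. The following is only a sketch: it is not taken from any distribution, and it assumes that \N'96' selects U+0060 in the devutf8 fonts in the same way that \N'39' selects U+0027 above.

```roff
.\" Hypothetical addition to a site-local macro file, by analogy with the
.\" SuSE lines above: make a plain ` print as U+0060 on the utf8 device.
.if '\*[.T]'utf8' \{\
.  char ` \N'96'
.\}
```
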
To summarize:
. Mapping `|' to the `bv' entity is strange. If you use a plain `|'
in a troff input file, you actually get a plain `|'! This looks
like a bug in Pod::Man.
. The ` and ' characters in groff input files always indicate left
and right single quotation marks. U+0060 and U+0027 can be
accessed as \` and \(aq. Ideally, Pod::Man would handle this too
in its `verbatim' mode by translating those characters
temporarily. Otherwise, as shown above, this can be changed in
the configuration file of the man macros.
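
One way such a temporary translation could look in generated output is sketched below. This is an assumption about how a converter might do it, not what Pod::Man actually emits; the perl one-liner is only sample text.

```roff
.\" Start of a verbatim block: make ` and ' print as U+0060 and U+0027.
.tr `\(ga'\(aq
.nf
perl -e 'print `date`'
.fi
.\" End of the verbatim block: restore the default meaning of ` and '.
.tr ``''
```

Outside the block, ` and ' behave as quotation marks again, so running text keeps its typographic quotes.
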
Werner
PS: Why the heck is `perlcheat.man' and all other non-program man
pages of perl in man section 1?