Re: [groff] hyphen, minus sign and hyphen-minus


From: Pali Rohár
Subject: Re: [groff] hyphen, minus sign and hyphen-minus
Date: Mon, 28 May 2018 15:16:53 +0200
User-agent: NeoMutt/20170113 (1.7.2)

On Monday 28 May 2018 02:48:09 Ingo Schwarze wrote:
> Hi Pali,
> 
> Pali Rohar wrote on Sun, May 27, 2018 at 11:52:44PM +0200:
> 
> > Now I looked deeply at man -Tps output and basically \- sequence is
> > written as character 0xAD (\255 in octet) into output postscript file.
> > Therefore it is SOFT HYPHEN (U+00AD),
> 
> No, that is not a "soft hyphen".  Glyph numbers in fonts used for
> PostScript output have nothing to do with Unicode code points.
> Look at the file font/devps/TR for examples:
> 
> PS name      TR#   Unicode
> -------      ---   -------
> asciicircum  0x00  U+005E
> asciitilde   0x01  U+007E
> Scaron       0x02  U+0053 U+030C
> Zcaron       0x03  U+005A U+030C
> scaron       0x04  U+0073 U+030C
> zcaron       0x05  U+007A U+030C
> Ydieresis    0x06  U+0059 U+0308
> trademark    0x07  U+2122
> quotesingle  0x08  U+0027
> Euro         0x09  U+20AC
> hyphen       0x2d  U+2010
> circumflex   0x5e  U+02C6
> quoteleft    0x60  U+2018
> tilde        0x7e  U+02DC
> bullet       0x83  U+2022
> florin       0x84  U+0192
> minus        0xad  U+2212
> 
> and so on and so forth, it's completely different all over the place.

What I am saying is that I generated a PostScript file via man -Tps and
then looked into the generated file.

In the PostScript file, at the place where the command line switch
--something should be, there was F2(\255... or F2<ad... IIRC, \255 is the
glyph code written in octal and <ad> is the same code in hex. Octal 0255
and hex 0xAD are both decimal 173, so both refer to the same glyph.
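
The arithmetic is easy to double-check. A minimal Python sketch, only to
confirm that the two notations name the same code (it does not read any
real PostScript file):

```python
# "\255" in a PostScript string is an octal escape, "<ad>" is a hex string.
octal_code = int("255", 8)   # octal 255
hex_code = int("ad", 16)     # hex AD

print(octal_code, hex_code)
assert octal_code == hex_code == 173
```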

Now I see that the PostScript file also contains an encoding vector
definition, /ENC0 [ ... ], and at position 173 is the name /minus.
According to Adobe, the name /minus represents Unicode code point U+2212.

So you are right, it is not a soft hyphen. I forgot to look at the
encoding vector in the resulting PostScript file.

That also answers my question of why the ps2pdf converter generates a PDF
file in which U+2212 code points are used for switches. ps2pdf did it
correctly by looking at the attached encoding vector /ENC0.
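
The lookup a converter has to do can be sketched roughly like this in
Python. This is only an illustration under assumptions: encoding_names is
a hypothetical helper, the regular expressions are a simplification and
not a PostScript parser, and sample_ps is a made-up toy vector, not the
real output of man -Tps.

```python
import re

def encoding_names(ps_text, vector="ENC0"):
    """Return the glyph names listed in the named encoding array.

    Hypothetical helper: it only handles a flat '/NAME [ /a /b ... ]' array.
    """
    match = re.search(r"/" + re.escape(vector) + r"\s*\[(.*?)\]", ps_text, re.S)
    if match is None:
        raise ValueError("no encoding vector named " + vector)
    return re.findall(r"/([^\s/\[\]]+)", match.group(1))

# Toy 256-entry vector, invented for this example.
names = [".notdef"] * 256
names[45] = "hyphen"
names[173] = "minus"
sample_ps = "/ENC0 [" + " ".join("/" + n for n in names) + "] def"

vector = encoding_names(sample_ps)
print(vector[0o255], vector[0xAD])  # the glyph name behind \255 and <ad>
```

With a vector like this one, code 173 resolves to the name /minus, and
the Unicode code point is then taken from the glyph name, not from the
number 0xAD.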

So the problem is for sure in grops, which generates that PS file with
the attached encoding vector. Unicode's hyphen-minus has code point U+002D
and, according to Adobe's glyphlist.txt, U+002D is assigned to the glyph
name /hyphen.

So man -Tps (or grops) can be fixed. It only needs to generate the
correct encoding vector and use the proper glyph name /hyphen for \- when
generating output from a manpage.

> > so it is incorrect for command line switch.
> 
> It is not incorrect.  The TR font does not contain a glyph for
> hyphen-minus, so plain minus is used as a fallback.

The font/devps/TR file has this line in its "charset" section:

\-      564,286 0       173     minus

Shouldn't this number be 45 instead of 173?
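
To make clear which field I mean, here is a small Python sketch that
splits that charset line into fields. The field order (name, metrics,
type, code, entity name) is how I read the font description format, and
parse_charset_line is a hypothetical helper, not anything groff ships.

```python
def parse_charset_line(line):
    """Split one charset line into its whitespace-separated fields."""
    fields = line.split()
    name, metrics, glyph_type, code = fields[:4]
    entity = fields[4] if len(fields) > 4 else None
    return {
        "name": name,
        "metrics": metrics,
        "type": int(glyph_type),
        # Accepts decimal or 0x-prefixed hex; C-style octal with a
        # leading 0 is not handled by this sketch.
        "code": int(code, 0),
        "entity": entity,
    }

entry = parse_charset_line(r"\-      564,286 0       173     minus")
print(entry["name"], entry["code"], hex(entry["code"]), entry["entity"])

# The code I would have expected for hyphen-minus:
print(ord("-"), hex(ord("-")))
```

So \- currently points at code 173 (0xad, /minus), while 45 (0x2d) is the
position listed for /hyphen in your table above.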

-- 
Pali Rohár
address@hidden


