Re: [Groff] Some thoughts on glyphs
From: Alejandro López-Valencia
Subject: Re: [Groff] Some thoughts on glyphs
Date: Tue, 27 Aug 2002 07:35:35 -0500
----- Original Message -----
From: "Werner LEMBERG" <address@hidden>
To: <address@hidden>
Sent: Monday, August 26, 2002 5:32 AM
Subject: Re: [Groff] Some thoughts on glyphs
>
> Dear friends,
>
>
> in April I suggested to extend the \[...] escape to support composite
> glyphs.
>
> I reexamined my old letter and found some deficiencies, so here my
> new proposal. Please comment.
>
[snip]
[scheme explanation; seems consistent and parsimonious to me, no problems
with it. My mind boggles, but Unicode always makes me dizzy, anyway.]
> I've completely dropped the idea that groff does something like
> `\z\[ho]A' automatically if `\[A ho]' is not defined. Here a revised
> version how a latin2 input encoding could be implemented, assuming
> standard PS fonts:
>
> .\" The rather generic .composite calls could be in a file
> .\" `glyph.tmac' which is always loaded at start-up of groff.
> .
> .composite ho u0328
> .composite ah u030C
> .composite aa u0301
> ...
Should glyph.tmac map every Unicode range, or only Latin Extended-A, Latin
Extended-B and related extensions (perhaps Cyrillic too), plus the ranges
needed for typesetting European and Slavic languages (math symbols,
dingbats, etc.)? The CJKV ranges could then live in optional files that are
loaded on demand (you already said that complex in-context typesetting, as
in Arabic and most Hindi scripts, is out of scope). Perhaps the set of
Unicode ranges loaded by default could be a runtime configuration flag, with
a sensible default that can be changed through a "configure" variable before
compilation, like the paper size and the PostScript spooler flags?
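To make the on-demand idea concrete, here is a minimal sketch. The register
name `cjk' and the file `glyph-cjk.tmac' are hypothetical names I made up for
illustration; only the .if and .mso requests and the -r command-line option
are existing groff features.

```
.\" Sketch only: `cjk' and `glyph-cjk.tmac' are hypothetical names.
.\" The small default set of .composite calls stays in glyph.tmac;
.\" the large CJKV definitions are read only when asked for, e.g.
.\"   groff -rcjk=1 file.roff
.\" An unset register interpolates as 0, so the default is "not loaded".
.if \n[cjk] .mso glyph-cjk.tmac
```

A compile-time default could then be nothing more than a different initial
value for that register in a site-wide start-up file.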
That is, the CJKV ranges are huge and would slow down startup considerably,
but I believe they will become a necessary part of the system. See, for
example, Jie Zhang's question today about Simplified Chinese typesetting,
which takes us, I think, to the triroff extensions I mentioned a long time
ago.
What I like about your proposal, and about the input encoding mapping
mechanism in it, is that someone with enough determination could create
input encoding mappings as large as Big-5 to UTF-8 or Shift-JIS to UTF-8
(would UTF-8 be the internal encoding?).
> .de latin2-tr
> . trin \\$1\\$1
> . if !c\\$2 \
> . if (\\n[.$] == 3) \
> . char \\$2 \\$3
> . if !c\\$1 \
> . trin \\$1\\$2
> ..
> .
> .latin2-tr \[char161] "\[A ho]" "\o'A\[ho]'"
> .latin2-tr \[char162] \[ab]
> .latin2-tr \[char163] \[/L]
> .latin2-tr \[char164] \[Cs]
> .latin2-tr \[char165] "\[L ah]" "\o'L\[ah]'"
> .latin2-tr \[char166] "\[S aa]" "\o'S\[aa]'"
> ...
Speaking of the input-encoding-to-Unicode mappings: I think they should also
be configurable at runtime, with a default determined at compilation time.
I see this as an advantage for people who actually use an input encoding
other than Latin-1 (I can think of most of the other ISO-8859 encodings,
Windows code pages, KOI-8, MacOS encodings under OS X, CJKV encodings, and
non-standardized encodings such as those for Georgian and Armenian).
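
As a rough sketch of what runtime selection could look like: the string name
`enc' is hypothetical, and the sketch assumes one macro file per input
encoding named after it (latin1.tmac, latin2.tmac, ...). The -d option, the
`d' conditional, and the .ds and .mso requests are existing groff features;
the rest is my assumption.

```
.\" Sketch only: the string `enc' is a hypothetical name.
.\" Select the input encoding at run time, e.g.
.\"   groff -denc=latin2 file.roff
.\" If the user gives nothing, fall back to a default (which could be
.\" the value chosen at compilation time).
.if !d enc .ds enc latin1
.mso \*[enc].tmac
```

With this, adding support for another encoding would only mean dropping one
more file into the macro directory.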