[Groff] Unicode, EBCDIC, Latin-2, JIS for groff
From: Werner LEMBERG
Subject: [Groff] Unicode, EBCDIC, Latin-2, JIS for groff
Date: Fri, 10 Mar 2000 18:39:00 GMT
It's amazing to see that people are interested in having Unicode
or EBCDIC input within gtroff.
Other people want Latin-2, others again want Japanese...
How best to handle this?
My suggestion is to extend gtroff so that it can handle arbitrary
31-bit characters (this covers ISO 10646). Values with the 32nd
bit set (i.e. negative numbers when stored as signed 32-bit integers)
can then be used for special gtroff `characters' like `ESCAPE_c'.
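
To make the proposed split of the 32-bit range concrete, here is a minimal
sketch. It is only an illustration: the function names and the value chosen
for `ESCAPE_c` are assumptions and are not taken from the gtroff sources.

```python
# Sketch of the proposed code space, assuming signed 32-bit storage.
# All names and values here are hypothetical, not gtroff's actual ones.

MAX_UCS = 0x7FFFFFFF  # largest 31-bit value (ISO 10646 range)

# Hypothetical internal code for a special gtroff `character'.
ESCAPE_c = -1


def to_int32(word):
    """Interpret an unsigned 32-bit word as a signed 32-bit integer."""
    word &= 0xFFFFFFFF
    if word & 0x80000000:          # 32nd bit set: value is negative
        return word - 0x100000000
    return word


def is_special(code):
    """Special gtroff `characters' live in the negative half."""
    return code < 0


def is_character(code):
    """Ordinary input characters are non-negative 31-bit values."""
    return 0 <= code <= MAX_UCS
```

With this layout a single sign test separates real characters from internal
tokens, so no ISO 10646 code point has to be sacrificed for gtroff's own use.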
It should use Unicode (i.e. ISO 10646) as its internal encoding and
nothing else.
Question: How far along is the Unicode input project?
Additionally, I suggest using UTF-8 exclusively as the external
encoding if, say, the command line option `-u' is given.
Groff should then come with a character set conversion tool (as a
preprocessor; maybe with heuristics to recognize the proper encoding?)
to map everything (e.g. Latin-2, JIS, and the EBCDIC charsets) to
Unicode in UTF-8 representation.
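
The core of such a conversion preprocessor could look roughly like the
sketch below. It is not an existing groff tool: the charset labels and the
function name are made up for illustration, it relies on Python's standard
codecs, and it picks `cp037` as just one of the many EBCDIC code pages. The
encoding heuristics mentioned above are left out; the caller has to name the
input charset.

```python
# Sketch of a charset-to-UTF-8 conversion step (hypothetical, not part
# of groff).  The keys are illustrative labels; the values are names of
# codecs shipped with Python's standard library.
CODECS = {
    "latin2": "iso8859_2",   # ISO 8859-2
    "jis": "iso2022_jp",     # 7-bit JIS with escape sequences
    "ebcdic": "cp037",       # assumption: one EBCDIC code page of many
}


def to_utf8(data, charset):
    """Decode bytes in the named charset and re-encode them as UTF-8."""
    text = data.decode(CODECS[charset])   # bytes -> Unicode code points
    return text.encode("utf-8")           # Unicode -> UTF-8 bytes
```

A preprocessor built around this would read the document, call `to_utf8`
with the charset given by the user, and pipe the result into gtroff.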
On the output side, I think that no essential changes are necessary
(except better support for very large fonts, since gtroff's font handling
mechanism isn't very efficient here). Of course, grops, for example,
should be extended to support CID-keyed PS fonts.
Comments please.
Werner