glibc UCS4_to_BIG5-HKSCS mapping should contain compatibility points
From: Anthony Fok
Subject: glibc UCS4_to_BIG5-HKSCS mapping should contain compatibility points
Date: Sun, 17 Feb 2002 04:26:07 +0800
User-agent: Mutt/1.3.25i
Hello all,
During recent testing with a Unicode HKSCS test document provided by the
Information Technology Services Department (ITSD) of the Hong Kong
Government, we discovered that iconvdata/big5hkscs.c did not contain
mappings for compatibility codepoints, which are important for backward
compatibility with GCCS (Government Common Character Set) and
ISO 10646 v1.0 (ISO/IEC 10646-1:1993). These codepoints are listed
in Annex III of the Hong Kong Supplementary Character Set (HKSCS) standard:

    Annex III  List of Compatibility Points and their Corresponding Characters
               in ISO 10646 v2.0 (ISO/IEC 10646-1:2000)

For example, ITSD's test document contains the simplified Chinese character
for "Horse", U+9A6C, and its compatibility codepoint, U+F404. Both of these
should map to BIG5-HKSCS 0x89C6. Of course, since U+F404 is only for
backward compatibility, when converting to Unicode, BIG5-HKSCS 0x89C6 should
only map to U+9A6C. I.e., converting back and forth, we should have this:
U+F404 -> B5+89C6 -> U+9A6C -> B5+89C6 -> U+9A6C -> B5+89C6... etc.
The attached patch adds these compatibility codepoints, in the
UCS4_to_BIG5-HKSCS direction only, of course.
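
To illustrate the intended one-way behaviour, here is a minimal Python
sketch. It is not the glibc code or the attached patch: the table names
and helper functions are hypothetical, and the tables hold only the single
pair quoted above. The point is that the compatibility codepoint appears
only in the encode (UCS4 to BIG5-HKSCS) table, so decoding always yields
the standard codepoint.

```python
# Hypothetical miniature tables, holding only the pair cited in this message.
# Decode direction (BIG5-HKSCS -> UCS4): standard codepoints only.
BIG5HKSCS_TO_UCS4 = {0x89C6: 0x9A6C}

# Compatibility points, used in the encode direction only.
COMPAT_UCS4_TO_BIG5HKSCS = {0xF404: 0x89C6}

# Encode direction (UCS4 -> BIG5-HKSCS): the decode table inverted,
# plus the compatibility points.
UCS4_TO_BIG5HKSCS = {ucs: b5 for b5, ucs in BIG5HKSCS_TO_UCS4.items()}
UCS4_TO_BIG5HKSCS.update(COMPAT_UCS4_TO_BIG5HKSCS)


def encode(ucs):
    """Map a UCS4 codepoint to its BIG5-HKSCS code."""
    return UCS4_TO_BIG5HKSCS[ucs]


def decode(b5):
    """Map a BIG5-HKSCS code back to its standard UCS4 codepoint."""
    return BIG5HKSCS_TO_UCS4[b5]


def round_trip(ucs, hops=3):
    """Return the chain of conversions starting from a UCS4 codepoint."""
    trail = [f"U+{ucs:04X}"]
    for _ in range(hops):
        b5 = encode(ucs)
        ucs = decode(b5)
        trail += [f"B5+{b5:04X}", f"U+{ucs:04X}"]
    return " -> ".join(trail)


if __name__ == "__main__":
    print(round_trip(0xF404, hops=2))
```

Starting from U+F404, the chain settles on U+9A6C after the first
conversion and never returns to the compatibility codepoint.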
In fact, these mappings from the Unicode Private Use Area (PUA)
U+E000..U+F848[1] to BIG5 End-User Defined Characters (EUDC) have been
listed in MS Code Page 950 for a long time. They are also specified in
the HKSCS standard (User-Defined Areas 1, 2, 3 and two Vendor-Defined
Areas), together with how they are used while characters await inclusion
in the official Unicode / ISO 10646 standard.
[1] There is no mapping for U+F849..U+F8FF between BIG5-HKSCS and
Unicode.
Comments and discussions are welcome.
Best regards,
Anthony Fok
on behalf of ThizLinux Laboratory Ltd., Hong Kong
--
Anthony Fok Tung-Ling
ThizLinux Laboratory <address@hidden> http://www.thizlinux.com/
Debian Chinese Project <address@hidden> http://www.debian.org/intl/zh/
Come visit Our Lady of Victory Camp! http://www.olvc.ab.ca/