glibc UCS4_to_BIG5-HKSCS mapping should contain compatibility points

From: Anthony Fok
Subject: glibc UCS4_to_BIG5-HKSCS mapping should contain compatibility points
Date: Sun, 17 Feb 2002 04:29:56 +0800
User-agent: Mutt/1.3.25i

Hello all,

 [Note: Oops, I forgot to attach the patch in my last mail.  :-)
        Here it is again.]

During recent testing with a Unicode HKSCS test document provided by the
Information Technology Services Department (ITSD) of the Hong Kong
Government, we discovered that iconvdata/big5hkscs.c did not contain
mappings for compatibility codepoints, which are important for backward
compatibility with GCCS (Government Common Character Set) and 
ISO 10646 v1.0 (ISO/IEC 10646-1:1993).  These codepoints are listed
in Annex III of the Hong Kong Supplementary Character Set (HKSCS) standard:

  Annex III  List of Compatibility Points and their Corresponding Characters
             in ISO 10646 v2.0 (ISO/IEC 10646-1:2000)

For example, ITSD's test document contains the simplified Chinese character
for "Horse": U+9A6C, and compatibility codepoint U+F404.  Both of these
should map to BIG5-HKSCS 0x89C6.  Of course, since U+F404 is only for
backward compatibility, when converting to Unicode, BIG5-HKSCS 0x89C6 should
only map to U+9A6C.  I.e., converting back and forth, we should have this:

   U+F404 -> B5+89C6 -> U+9A6C -> B5+89C6 -> U+9A6C -> B5+89C6... etc.

The attached patch adds these compatibility codepoints, in the
UCS4_to_BIG5-HKSCS direction only, of course.
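
To make the asymmetry concrete, here is a minimal sketch in Python (not the
actual glibc C tables).  The dictionary and function names are hypothetical,
and only the single "Horse" entry from the example above is filled in; the
real patch covers the whole of Annex III.

```python
# Hypothetical one-entry tables for illustration only.
# Encoding direction (UCS4 -> BIG5-HKSCS): both code points are accepted.
UCS4_TO_BIG5HKSCS = {
    0x9A6C: 0x89C6,  # canonical code point for simplified "Horse"
    0xF404: 0x89C6,  # compatibility point, accepted on input only
}

# Decoding direction (BIG5-HKSCS -> UCS4): only the canonical code point.
BIG5HKSCS_TO_UCS4 = {
    0x89C6: 0x9A6C,
}


def to_big5hkscs(cp):
    """Map a Unicode code point to its BIG5-HKSCS code."""
    return UCS4_TO_BIG5HKSCS[cp]


def to_ucs4(b5):
    """Map a BIG5-HKSCS code back to the canonical Unicode code point."""
    return BIG5HKSCS_TO_UCS4[b5]


def round_trip(cp, times=2):
    """Return the chain of values seen while converting back and forth."""
    chain = [f"U+{cp:04X}"]
    for _ in range(times):
        b5 = to_big5hkscs(cp)
        chain.append(f"B5+{b5:04X}")
        cp = to_ucs4(b5)
        chain.append(f"U+{cp:04X}")
    return chain


print(" -> ".join(round_trip(0xF404)))
# U+F404 -> B5+89C6 -> U+9A6C -> B5+89C6 -> U+9A6C
```

The compatibility point appears only as a key of the encoding table, so it
can never be produced when decoding.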

In fact, these mappings from the Unicode Private Use Area (PUA)
U+E000..U+F848[1] to BIG5 End-User Defined Characters (EUDC) have been
listed in MS Code Page 950 for a long time.  The HKSCS standard also
specifies them (User-Defined Areas 1, 2, 3 and two Vendor-Defined
Areas), along with how they are used while characters await inclusion
in the official Unicode / ISO 10646 standard.
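
As a rough illustration of the ranges involved, the sketch below classifies
a code point against the PUA block U+E000..U+F8FF and the sub-range
U+E000..U+F848 named above.  The function name and return strings are
hypothetical, and "mapped range" only means the code point lies inside the
range this message describes, not that every individual code point in it
has a BIG5-HKSCS assignment.

```python
PUA_FIRST = 0xE000    # start of the BMP Private Use Area
PUA_LAST = 0xF8FF     # end of the BMP Private Use Area
MAPPED_LAST = 0xF848  # last PUA code point in the mapped range, per this message


def pua_status(cp):
    """Classify a code point relative to the PUA ranges discussed here."""
    if not PUA_FIRST <= cp <= PUA_LAST:
        return "not PUA"
    if cp <= MAPPED_LAST:
        return "mapped range"
    return "unmapped range"


for cp in (0x9A6C, 0xF404, 0xF849):
    print(f"U+{cp:04X}: {pua_status(cp)}")
```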

[1] There is no mapping for U+F849..U+F8FF between BIG5-HKSCS and

Comments and discussions are welcome.

Best regards,

Anthony Fok
on behalf of ThizLinux Laboratory Ltd., Hong Kong

Anthony Fok Tung-Ling
ThizLinux Laboratory   <address@hidden> http://www.thizlinux.com/
Debian Chinese Project <address@hidden>       http://www.debian.org/intl/zh/
Come visit Our Lady of Victory Camp!           http://www.olvc.ab.ca/

Attachment: glibc-2.2.5-big5hkscs.patch.bz2
Description: Binary data

Attachment: pgpv9HUAtAoWQ.pgp
Description: PGP signature
