bug-gnu-utils
recode ISO-8859-11..UTF-8


From: Kevin Rodgers
Subject: recode ISO-8859-11..UTF-8
Date: Fri, 18 Jun 2004 11:32:16 -0600
User-agent: Mozilla/5.0 (X11; U; SunOS i86pc; en-US; rv:0.9.4.1) Gecko/20020406 Netscape6/6.2.2

I made a small file containing just the upper 96 ISO-8859 characters
(bytes 0xA0 through 0xFF, each preceded by an ASCII space, 16 to a line,
for readability), and converted it with
recode --strict ISO-8859-$n..UTF-8 for n=1-11,13-16.
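
For anyone who wants to reproduce the input, here is a minimal sketch of
how such a test file could be generated. The function name and the output
file name are hypothetical; only the layout (space before each byte, 16
per line) comes from the description above.

```python
def build_test_bytes():
    """Return bytes 0xA0..0xFF, each preceded by a space, 16 per line."""
    lines = []
    for row_start in range(0xA0, 0x100, 16):
        # One row: " \xA0 \xA1 ... " for 16 consecutive byte values.
        row = b"".join(b" " + bytes([value])
                       for value in range(row_start, row_start + 16))
        lines.append(row + b"\n")
    return b"".join(lines)


if __name__ == "__main__":
    # "upper96.txt" is an assumed name, not the file used in the report.
    with open("upper96.txt", "wb") as handle:
        handle.write(build_test_bytes())
```

The same file can then be fed to recode once per character set, since the
raw bytes are identical for every ISO-8859 part.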

As expected, the --strict option reported untranslatable input for
ISO-8859-3, ISO-8859-6, ISO-8859-7, ISO-8859-8, and ISO-8859-11 because
of the unassigned code points in those character sets (see
http://en.wikipedia.org/wiki/ISO_8859).  When I replaced the unassigned
characters for each character set with a space, recode --strict reported
no errors.
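
The unassigned positions can also be located and blanked out
programmatically. The sketch below uses Python's own ISO-8859 codec
tables as a stand-in; that is an assumption, since those tables are not
guaranteed to agree with recode's in every position. Both function names
are hypothetical.

```python
def unassigned_high_bytes(codec_name):
    """List byte values in 0xA0..0xFF that the codec refuses to decode."""
    missing = []
    for value in range(0xA0, 0x100):
        try:
            bytes([value]).decode(codec_name)
        except UnicodeDecodeError:
            missing.append(value)
    return missing


def blank_out_unassigned(data, codec_name):
    """Replace every unassigned high byte in data with an ASCII space."""
    missing = set(unassigned_high_bytes(codec_name))
    return bytes(0x20 if value in missing else value for value in data)


if __name__ == "__main__":
    # Python names the codecs iso8859_1 ... iso8859_16 (there is no part 12).
    for n in list(range(1, 12)) + list(range(13, 17)):
        holes = unassigned_high_bytes("iso8859_%d" % n)
        print("ISO-8859-%d:" % n, " ".join("0x%02X" % v for v in holes))
```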

Now I'm viewing the resulting UTF-8 output files with Emacs 21.3
(invoked as emacs -q -fn fontset-standard), and everything looks as
expected with the exception of the Thai file that was generated with
recode --strict ISO-8859-11..UTF-8.  For every single (non-whitespace)
character, `C-u C-x =' reports

    charset: latin-iso8859-1
             (Right-Hand Part of Latin Alphabet 1 (ISO/IEC 8859-1): ISO-IR-100)
   category: l:Latin

and of course the displayed glyphs don't look anything like those on the
Wikipedia page.
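
One way to tell whether the problem lies in the converted file or in the
display is to inspect the code points in the UTF-8 output directly. The
sketch below is only a diagnostic aid under stated assumptions: it treats
U+0E00..U+0E7F as the Thai block and U+00A0..U+00FF as the Latin-1 upper
half, and the function name is hypothetical. A correct ISO-8859-11
conversion should land in the Thai block, while an output that was
effectively converted as ISO-8859-1 would land in the Latin-1 range.

```python
def classify_utf8(data):
    """Count non-whitespace characters by Unicode range in UTF-8 bytes."""
    counts = {"thai": 0, "latin1": 0, "other": 0}
    for char in data.decode("utf-8"):
        # str.isspace() is also true for U+00A0, so the no-break space
        # at position 0xA0 is skipped along with spaces and newlines.
        if char.isspace():
            continue
        code_point = ord(char)
        if 0x0E00 <= code_point <= 0x0E7F:
            counts["thai"] += 1
        elif 0x00A0 <= code_point <= 0x00FF:
            counts["latin1"] += 1
        else:
            counts["other"] += 1
    return counts


if __name__ == "__main__":
    import sys
    # Usage: python classify.py converted-file.utf8
    with open(sys.argv[1], "rb") as handle:
        print(classify_utf8(handle.read()))
```

If the counts fall under "thai", the file itself is fine and the question
shifts to how the editor decodes or displays it.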

--
Kevin Rodgers




