bug-glibc

Re: non-ASCII characters in locale.alias file


From: Tomohiro KUBOTA
Subject: Re: non-ASCII characters in locale.alias file
Date: Wed, 23 Jan 2002 10:37:19 +0900
User-agent: Wanderlust/2.8.1 (Something) SEMI/1.14.3 (Ushinoya) FLIM/1.14.3 (Unebigoryōmae) APEL/10.3 Emacs/20.7 (i386-debian-linux-gnu) MULE/4.1 (AOI)

Hi,

At 22 Jan 2002 16:06:47 -0800,
Ulrich Drepper wrote:

> > Well, this is the reason why we can safely remove 'français'
> > from locale.alias file.  If I can use LANG=french, all people in
> > the world can use LANG=french.
> 
> Nothing can be removed.  What exists might be in use.

That is why I proposed, in my original message, documenting how to
add these two ISO-8859-1 locale names to the locale.alias file.  It
is a compromise for people who _really_ need such discouraged LANG
values (for compatibility when interoperating with some odd systems)
and who _really_ know what they are doing.


> > You mean, users should not edit /etc/locale.alias ?
> 
> Certainly not.  Do something similar to what RH does with the
> /etc/sysconfig/i18n and /etc/profile.d/lang.{sh,csh} files.  This is
> the only sane way to deal with it.

I am not familiar with these files.  Are they used to specify the
encoding of files in /etc and so on?


> > If you really think usage of ISO-8859-1 here is not a bad idea,
> 
> You misunderstood what I said.  The file contains byte sequences,
> separated by newlines and white spaces.  The encoding is unimportant.

If no encoding is assumed for a byte sequence, it is called "binary
data", not "text data".  Do you mean /etc/locale.alias is a binary
file?  (If so, it is natural that we cannot edit a binary file with
text editors.)

Anyway, the 8-bit characters are ISO-8859-1, even if you regard them
as a mere byte sequence.  If it were a mere byte sequence, why would
it look like a human word?  The file strongly presents itself as a
text file.
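
As a rough illustration of this point, here is a small Python sketch
(not anything taken from glibc).  It assumes the alias is stored as the
bytes "fran" + 0xE7 + "ais"; the helper name try_decode is made up for
this example.  The same bytes are a readable word under ISO-8859-1 but
are not valid UTF-8 at all:

```python
# The alias name as raw bytes; 0xE7 is the single 8-bit byte in question.
raw = b"fran\xe7ais"


def try_decode(data: bytes, encoding: str) -> str:
    """Decode data, or return a marker if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return "<invalid>"


# Under ISO-8859-1 every byte maps to a character, so this always succeeds.
as_latin1 = try_decode(raw, "iso-8859-1")

# Under UTF-8, 0xE7 starts a multibyte sequence, but "a" is not a valid
# continuation byte, so decoding fails.
as_utf8 = try_decode(raw, "utf-8")

print("ISO-8859-1:", as_latin1)
print("UTF-8:     ", as_utf8)
```

A reader or editor working in a UTF-8 (or other multibyte) locale
therefore cannot treat this line as text without first guessing the
encoding.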


> In fact the file doesn't have one encoding.  Yes, this might lead to
> editing problems but that's exactly why the file should be used as is
> and not touched again.

In the old days, when speakers of European languages could ignore
multibyte users (or simply assumed they could), using ISO-8859-1 for
global purposes was not a problem.  Today, however, such usage is
clearly no longer valid, because it conflicts with the multibyte
encodings used by CJK users and with UTF-8, another multibyte
encoding.

People must abandon the mistaken idea that ISO-8859-1 is the whole
world.  It is like an ancient person who thought his or her village
was the only world that existed.

I think anyone is free to develop a local ISO-8859-1 fork of GNU
libc, just as we speakers of multibyte languages developed many local
patches to support multibyte encodings.  However, even in the
multibyte world, more and more people have come to recognize that the
"local patch" approach is harmful and that we should take the "i18n"
approach instead.  The original version of GNU libc, on the other
hand, should be internationalized and should not be biased toward
ISO-8859-1.  Well-trained developers (like the GNU libc developers)
should not show such a narrow-minded bias (like the ancient person),
though there are many developers today who do not think about people
outside the ISO-8859-1 world.  Multibyte users are annoyed by such
developers every single day.  Please do not help increase their
number by encouraging the use of ISO-8859-1 in /etc files.


One compromise would be to use some escape notation (for example,
fran\xe7ais and bokm\xe5l in /etc/locale.alias), though this solves
only part of the problem.  (It makes the /etc/locale.alias file sane,
but it does not discourage new users from using these locale aliases.)
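
To make the escape idea concrete, here is a minimal Python sketch of
how a reader of the file could turn such escapes back into bytes.  It
is only an illustration under stated assumptions: glibc is not known to
support this notation, the function names unescape_alias and
parse_alias_line are hypothetical, and the target locale name in the
sample line is used purely as an example.

```python
import re

# Matches a literal backslash, "x", and exactly two hex digits.
_ESCAPE = re.compile(rb"\\x([0-9A-Fa-f]{2})")


def unescape_alias(name: bytes) -> bytes:
    """Replace each \\xNN escape with the single byte it denotes."""
    return _ESCAPE.sub(
        lambda m: bytes([int(m.group(1).decode("ascii"), 16)]), name
    )


def parse_alias_line(line: bytes):
    """Parse 'alias<whitespace>locale'; return None for blank, comment,
    or incomplete lines."""
    stripped = line.strip()
    if not stripped or stripped.startswith(b"#"):
        return None
    fields = stripped.split()
    if len(fields) < 2:
        return None
    # Only the alias (first field) may carry escapes in this sketch.
    return unescape_alias(fields[0]), fields[1]


# The file itself stays pure ASCII; the 8-bit byte appears only after parsing.
sample = rb"fran\xe7ais   fr_FR.ISO-8859-1"
print(parse_alias_line(sample))
```

With this approach the file can be edited safely in any locale, while
the alias that is finally compared against LANG is still the original
ISO-8859-1 byte sequence.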

---
Tomohiro KUBOTA <address@hidden>
http://www.debian.or.jp/~kubota/
"Introduction to I18N"  http://www.debian.org/doc/manuals/intro-i18n/


