bug-gnu-utils

Re: non-ASCII characters in Automake source files


From: Paul Eggert
Subject: Re: non-ASCII characters in Automake source files
Date: 22 May 2003 17:01:46 -0700
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3

[Bruno, the context here is the following messages:
 http://sources.redhat.com/ml/bug-automake/2003/msg00203.html
 http://sources.redhat.com/ml/bug-automake/2003/msg00204.html
]

Alexandre Duret-Lutz <address@hidden> writes:

> If latin-1 characters in Automake sources ever turn out to cause
> problems, I'd rather convert them to unicode as Bruno did in Gettext.

I agree that UTF-8 is the way to go one of these days -- I assume
that's what you meant by "unicode".  But I'm not sure we're there yet.
For one thing, the latest stable version of Emacs (21.3) still doesn't
do the right thing out-of-the-box for UTF-8, at least not on my
platform (Debian GNU/Linux 3.0r1, which is the latest stable Debian
release).  So UTF-8 is still not convenient for developers like me.

gettext is a good package to convert to UTF-8 first, for obvious
reasons.  I looked at gettext-0.12 for ideas about how things should
be done in this area, and I found that it has some files in UTF-8 and
others in Latin-1.  I don't know how to distinguish between the two
encodings reliably and in general, and I don't know how Bruno edits
them.  Or perhaps he is converting them from Latin-1 to UTF-8 as he
runs across them?

I attempted to search for Latin-1 systematically in gettext-0.12, and
found the following files.

config/elisp-comp (this will be fixed by the change you just made to Automake)
NEWS
gettext-runtime/intl/locale.alias
gettext-runtime/man/help2man
gettext-tools/man/help2man
gettext-tools/misc/po-mode.el
gettext-tools/misc/po-compat.el
gettext-tools/misc/gettext.perl
gettext-tools/src/x-java.l
gettext-tools/projects/GNOME/teams.html
gettext-tools/tests/lang-*
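
A systematic search of this kind can be sketched roughly as follows.
This is only a heuristic, not the method actually used for the list
above: it assumes that a file containing bytes >= 0x80 that do not
decode as UTF-8 is in some 8-bit encoding such as Latin-1.  It cannot
prove that the encoding is Latin-1, it also flags binary files, and a
short Latin-1 file can occasionally happen to be valid UTF-8.  The
names classify and scan are hypothetical.

```python
from pathlib import Path


def classify(data: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'other' for a byte string.

    'other' only means "has non-ASCII bytes and is not valid UTF-8";
    Latin-1 is a likely candidate, but this check cannot confirm it.
    """
    if data.isascii():
        return "ascii"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "other"
    return "utf-8"


def scan(root: str) -> dict[str, str]:
    """Map each non-ASCII file under root to 'utf-8' or 'other'."""
    results = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            kind = classify(path.read_bytes())
            if kind != "ascii":
                results[str(path)] = kind
    return results
```

Calling scan("gettext-0.12") on an unpacked source tree would return
the non-ASCII files, with the "other" entries being the ones to
inspect by hand.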

The Latin-1 character in gettext-tools/src/x-java.l is a typo: it is a
narrow space in line 180 that "flex" interprets as part of the regular
expression, which is not what was intended.  This sort of thing is a
downside of allowing non-ASCII in source code.
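
A stray character like this is easy to miss in an editor, but simple
to locate mechanically.  The sketch below reports the line and column
of every byte >= 0x80 in a file's contents; the function name is
hypothetical, and the 0xA0 byte in the usage comment is only an
illustrative sample, not a claim about which byte x-java.l contains.

```python
def non_ascii_positions(data: bytes):
    """Yield (line, column, byte) for each byte >= 0x80, 1-based.

    Columns count bytes, not characters, so a multi-byte UTF-8
    sequence is reported once per byte.
    """
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte >= 0x80:
                yield lineno, col, byte


# Illustrative use on an in-memory sample:
#   for lineno, col, byte in non_ascii_positions(b"ab\nc\xa0d\n"):
#       print(f"line {lineno}, column {col}: 0x{byte:02X}")
```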

Bruno has more experience with UTF-8 in source code, since he's
converted most of gettext.  I'll CC: this message to bug-gnu-gettext,
in case he has suggestions for making it easier to include non-ASCII
characters in source files.



