Re: [PATCH] I18n flag for msgfmt
From: Bruno Haible
Subject: Re: [PATCH] I18n flag for msgfmt
Date: Wed, 7 Jan 2004 13:22:44 +0100
User-agent: KMail/1.5
Hello Behdad,
> Well, this "I" thing is something that should be used in message
> translations, not original messages themselves, for a few
> reasons:
>
> 1. There are currently tens of thousands of message strings out
> there that use "%d" or "%f" ...
>
> 2. If applications use "%Id" in source, then when I for example
> run them with my default LANG=fa_IR, but I don't have Persian
> translations for this specific application, I don't like to get
> Persian numerals, but original English ones. So "%Id" should not
> be used in source code, but in Persian translations.
>
> 3. The border between which numbers should be written with local
> digits, which with latin digits, is not quite clear. For example
> in Persian we write every number with Persian digits, but I can
> see how we may write a price with US dollar currency sign with
> Latin digits. Or Arab people may have their own desires about
> which numbers they would like to see in their local digits, which
> not. So the decision better be left to each translation team,
> instead of the (usually) western developer which all this "I"
> thing is a non-issue for him.
Thanks for explaining. I wasn't aware of all this.
> So my solution:
>
> 1. msgfmt does not err about "I" flag, as it's a valid (and
> should be encouraged) usage.
>
> 2. gettext library, silently remove the "I" flag if the
> underlying libc does not support that. It may not be the best
> idea in the world, but having gettext already handling things
> like "<PRId64>", I find it the suitable place to handle the "I"
> flag, which is both necessary and sufficient for solving local
> digits in software translation.
You're right. I'll employ this solution in the next gettext release,
with only a minor modification: The translators will write
"%<OUTDIGITS>d"
instead of
"%Id"
This is because
- The %I flag is not backed by a standard. If some future C standard
reserves %I for a different purpose, glibc would have to change,
and msgfmt would have to emit a warning, and all Persian .po files
would have to be changed... (This already happened with %L a few
years ago.)
- "<OUTDIGITS>" does not mean "I". It means "I" or "", depending on
the system.
- There might be more places where the same technique is needed,
and the <...> syntax is somewhat consistent.
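To illustrate the second point, here is a minimal sketch of how "<OUTDIGITS>" could be expanded to "I" or "" depending on the system. The function name expand_outdigits and its signature are hypothetical and are not taken from gettext. The sketch also simplifies by replacing the marker anywhere in the string, not only inside a format directive.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT gettext's actual implementation.
   Copies src into dst (capacity dstsize), replacing every occurrence of
   "<OUTDIGITS>" with flag. The caller would pass "I" on a libc that
   supports the flag and "" on one that does not.
   Simplification: the marker is replaced wherever it occurs, not only
   directly after a '%'.
   Returns 0 on success, -1 if the result does not fit in dst. */
int expand_outdigits(char *dst, size_t dstsize,
                     const char *src, const char *flag)
{
    static const char marker[] = "<OUTDIGITS>";
    const size_t marker_len = sizeof marker - 1;
    const size_t flag_len = strlen(flag);
    size_t used = 0;

    while (*src != '\0') {
        if (strncmp(src, marker, marker_len) == 0) {
            /* Keep one byte free for the terminating NUL. */
            if (used + flag_len >= dstsize)
                return -1;
            memcpy(dst + used, flag, flag_len);
            used += flag_len;
            src += marker_len;
        } else {
            if (used + 1 >= dstsize)
                return -1;
            dst[used++] = *src++;
        }
    }
    if (dstsize == 0)
        return -1;
    dst[used] = '\0';
    return 0;
}
```

With flag set to "I", the translation "%<OUTDIGITS>d files" becomes "%Id files"; with flag set to "", it becomes "%d files", which any printf accepts.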
> Note: As a consequence, software developers should call gettext
> on all of their singleton "%d" format strings too. Perhaps, with
> some context, to let the translators decide on if it needs the
> "I" flag or not.
We have a problem with the context. First, we should make KBabel and
similar translation tools show the source references. Only then will
it make sense to mark all "%d" strings with _().
> PS. Got a question now: The current locale model is missing a
> few features that we need in Persian locale. One I can recall is
> decimal separator (and thousands separator?) for local digits is
> different from ones for Latin digits. In the current model,
> there is one and only one decimal separator per locale, so we
> have set it to the Arabic Decimal Separator character, but it
> means that we get Latin digits with Arabic decimal separator
> which is not the way people write Latin float numbers in Iran.
> The question is where should we start the fix? Bottom-up, adding
> support in glibc first, or no way, we should go the hard way,
> add it in POSIX first, glibc later?
I'm not aware of any new POSIX revision currently in preparation.
Therefore I'd recommend reporting the issue to libc-alpha at sources
dot redhat dot com. The people who can implement this are listening
there (most likely Ulrich, Petter Reinholdtsen, or me - it would involve
glibc/locale/ and glibc/stdio-common/vfprintf.c).
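For reference, the "one decimal separator per locale" limitation is visible in the standard C interface: localeconv() exposes a single decimal_point string, with no separate value per digit set. A minimal sketch follows; the wrapper name decimal_point_for is hypothetical.

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical wrapper: switch LC_NUMERIC to the named locale and return
   that locale's decimal separator, or NULL if the locale is unavailable.
   The returned string belongs to the C library and may be overwritten by
   later setlocale() or localeconv() calls. */
const char *decimal_point_for(const char *locale_name)
{
    if (setlocale(LC_NUMERIC, locale_name) == NULL)
        return NULL;
    /* struct lconv carries exactly one decimal_point per locale. */
    return localeconv()->decimal_point;
}
```

In the "C" locale this returns "."; what it returns for fa_IR depends on the locale data installed on the system.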
Bruno