bug-gettext

[bug-gettext] C locale assumptions


From: Bodhi Creation
Subject: [bug-gettext] C locale assumptions
Date: Tue, 8 Jan 2019 13:31:33 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0

Hello GNU gettext maintainers, I've recently found an "unwanted feature" with how gettext handles translations.

When a user requests several translations using the `LANGUAGE` environment variable, gettext searches for translations in the order given.  If it cannot find one, it falls back to the untranslated identifier string, as in the C locale.  This may seem like the obvious approach, but it often produces undesirable results.  More often than not, no translation is available for the language the identifiers themselves are written in.  In that case, even when the identifiers are in one of the user's preferred languages, gettext skips past that language and looks for the next preferred translation.  It should instead provide the gettext identifiers themselves, which are in a more preferred language!

For example, if Alice is a native English speaker who also speaks French, she may wish to set `LANGUAGE="en:fr"`.  Because most software uses English gettext identifiers and does not provide a 1:1 translation for `en`, the messages will display in French (if an `fr` translation is available) instead of in English.  This forces Alice to abandon the order-of-preference feature and simply set `LANGUAGE="en"`.  That works reasonably well for Alice, unless some piece of software uses identifiers that are neither English nor French and has a translation for `fr` but none for `en`.  In that case, messages will display in neither of her two languages, even though a French translation is available.  Alice's case is fairly common, but there are more complicated and more harmful cases as well.  For instance, if Haruto's preference were `LANGUAGE="ja:en:fr"`, messages would often display neither in Japanese nor in English, but in French instead.
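
To make the behavior concrete, here is a minimal Python sketch of the lookup order described above.  It is a simplified model, not gettext's actual implementation: the `lookup` function, the dictionary-based catalogs, and the sample message are all made up for illustration.

```python
def lookup(msgid, language, catalogs):
    """Return the first available translation in preference order.

    `language` is a colon-separated list such as "en:fr".
    `catalogs` maps a language code to a {msgid: msgstr} dictionary
    (a stand-in for installed translation catalogs).
    """
    for lang in language.split(":"):
        catalog = catalogs.get(lang)
        if catalog is not None and msgid in catalog:
            return catalog[msgid]
    # No requested language has a translation: use the identifier itself.
    return msgid


# Hypothetical program: English identifiers, only a French catalog shipped.
catalogs = {"fr": {"Open file": "Ouvrir le fichier"}}

alice = lookup("Open file", "en:fr", catalogs)
alice_en_only = lookup("Open file", "en", catalogs)
haruto = lookup("Open file", "ja:en:fr", catalogs)

print(alice)          # French, although English was preferred
print(alice_en_only)  # the English identifier, via the fallback
print(haruto)         # French, although English ranks above it
```

In this model, `en` is skipped only because no `en` catalog exists, even though the identifier is already English.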

I see two solutions to this problem:

  1. Encourage gettext users to provide a 1:1 translation for all messages in the language of the identifiers.  By increasing documentation about this issue and this solution, hopefully, more developers will work to prevent it.  There might also be some way to warn users at compile time that they're missing a 1:1 translation.  Could a tool to make it easy to automatically generate a 1:1 translation be developed?
  2. Provide a way for gettext to know at compile time which locale it should assume all the identifiers are in.  If a requested language has no translation and that language is the assumed locale for the identifiers, stop searching and provide the identifier itself.  This should be a variable-like macro that can be defined on the command line.  Additionally, a new compile-time warning should be emitted if an assumed locale is not specified.  This will allow users and distributions to fix the developer's slip-up just by defining the macro in `CFLAGS`.
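
As a rough illustration of both options, here is a small Python sketch.  It models the lookup only, not gettext's real code: `identity_catalog`, `lookup`, and the `source_lang` parameter (standing in for the proposed compile-time macro) are hypothetical names, and catalogs are plain dictionaries.

```python
def identity_catalog(msgids):
    """Option 1: a 1:1 catalog whose translations equal the identifiers."""
    return {msgid: msgid for msgid in msgids}


def lookup(msgid, language, catalogs, source_lang=None):
    """Preference-ordered lookup with an optional assumed identifier locale.

    Option 2: when a requested language has no translation and it is the
    language the identifiers are written in, stop and return the identifier.
    """
    for lang in language.split(":"):
        catalog = catalogs.get(lang)
        if catalog is not None and msgid in catalog:
            return catalog[msgid]
        if lang == source_lang:
            return msgid  # identifiers are already in this language
    return msgid


# Hypothetical program: English identifiers, only a French catalog shipped.
catalogs = {"fr": {"Open file": "Ouvrir le fichier"}}

# Current behavior: French wins even though English is preferred.
current = lookup("Open file", "en:fr", catalogs)

# Option 1: ship an identity catalog for the identifier language.
with_en = dict(catalogs, en=identity_catalog(["Open file"]))
option1 = lookup("Open file", "en:fr", with_en)

# Option 2: tell the lookup which language the identifiers are in.
option2 = lookup("Open file", "ja:en:fr", catalogs, source_lang="en")

# Identifiers in another language (say German) still fall through to French.
german_ids = lookup("Datei öffnen", "en:fr",
                    {"fr": {"Datei öffnen": "Ouvrir le fichier"}},
                    source_lang="de")

print(current, option1, option2, german_ids, sep="\n")
```

Under these assumptions the two options give the same result for Alice and Haruto, and neither changes the outcome for programs whose identifiers are in a language the user did not request.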

I don't see how either of these options breaks compatibility with existing applications, nor do they conflict with each other.  They also don't seem like they would need any major changes, though I'm not familiar with the code base.  As such, I think gettext could move forward with both simultaneously.

Thanks for your time and your input.

 — Izzy


Isabell Cowan
Phone: 651-321-2161
