Re: Internationalize Emacs's messages (swahili)

emacs-devel

From:	Daniel Brooks
Subject:	Re: Internationalize Emacs's messages (swahili)
Date:	Sat, 26 Dec 2020 01:07:50 -0800
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Daniel Brooks <db48x@db48x.net>
>> Cc: Zhu Zihao <all_but_last@163.com>,  dimech@gmx.com,  abrochard@gmx.com,
>>   rms@gnu.org,  bugs@gnu.support,  emacs-devel@gnu.org
>> Date: Fri, 25 Dec 2020 18:03:21 -0800
>> 
>> My personal opinion is that gettext is too limited. It works for simple
>> things, but provides no help at all for complex things.
>
> That is in no way specific to Emacs, is it?

Right, it isn't specific to Emacs at all.

>> I think that the most productive way to think about translation is that
>> each coherent message that we present to a user (whether it's via the
>> message function or not) should explicitly be the result of calling a
>> function written by the translator. gettext only allows the translator
>> to supply strings, so it falls down in complex situations.
>
> The advantage of the translation infrastructure based on gettext is
> that the translators don't have to be programmers, they only need to
> be experts in correct use of technical terminology in their
> languages.  Even with that significant advantage, it is hard to find
> translators for many languages.  Your suggestion would make that job
> much harder, with the net result that more messages for more programs
> will remain untranslated: a classic example where the best is a sworn
> enemy of the good.

Yes, the simplicity of gettext is a big point in its favor. In the
common case, Fluent is not much more complicated. Here's a file from the
en-US locale from Firefox:
https://searchfox.org/mozilla-central/source/browser/locales/en-US/browser/aboutDialog.ftl

Most of the complications here come from the html that is embedded
inside the localized text:

    update-failed-main = Update failed. <a data-l10n-name="failed-link-main">Download the latest version</a>

Another language might put different text before or after the link, so
the anchor tag has to be part of the localized text. However, the
JavaScript that displays this text will add the href to the anchor
first.
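
To make that step concrete, here is a rough sketch in Python. It is
not how Firefox does it; the function name add_href and the example
URL are invented for illustration. The translator controls the text
and where the anchor sits, and the program only fills in the href:

```python
import html
import re


def add_href(localized: str, links: dict[str, str]) -> str:
    """Add href attributes to <a data-l10n-name="..."> tags.

    `links` maps a data-l10n-name value to the URL supplied by the
    program. Anchors whose name is not in `links` are left unchanged.
    """
    def fill(match: re.Match) -> str:
        name = match.group(1)
        url = links.get(name)
        if url is None:
            return match.group(0)
        # Escape the URL so it is safe inside a quoted attribute.
        return f'<a data-l10n-name="{name}" href="{html.escape(url, quote=True)}">'

    return re.sub(r'<a\s+data-l10n-name="([^"]+)">', fill, localized)


message = ('Update failed. <a data-l10n-name="failed-link-main">'
           'Download the latest version</a>')
print(add_href(message, {"failed-link-main": "https://example.org/download"}))
```

The translated string never contains the URL, so a translation cannot
break the link by accident.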

> (Disclosure: I'm the team leader for translators to the Hebrew
> language, as part of the GNU Translation Project.  I'm talking from
> personal experience here.)
>

> This problem was solved in gettext long ago, and is being widely used
> in existing translations.  See the node "Plural Forms" in the GNU
> gettext manual.  Emacs has the ngettext function in preparation for
> the day when we will be able to have translatable message strings.

Yes, I am aware of ngettext, and I could have picked a different
example. Consider the example from projectfluent.org, where the output
should change based on the user's gender:

    shared-photos =
        {$userName} {$photoCount ->
            [one] added a new photo
           *[other] added {$photoCount} new photos
        } to {$userGender ->
            [male] his stream
            [female] her stream
           *[other] their stream
        }.

This one produces messages like "Anne added 3 new photos to her
stream.", which vary based on the three inputs passed in. ngettext
handles plurals, but it doesn't generalize to any other type of
variation we might want. I can't think of any reason why Emacs would
care about gender, but maybe BBDB could.
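
This is what I mean by a message being the result of calling a
function written by the translator. As a sketch only, the English
message above behaves roughly like the Python function below. The
function name and the plain "count == 1" plural test are my own
simplifications; a real implementation would take the plural category
from CLDR rules for the locale.

```python
def shared_photos(user_name: str, photo_count: int, user_gender: str) -> str:
    """Hand-written English equivalent of the shared-photos message."""
    # Plural selector: English only distinguishes "one" from "other".
    if photo_count == 1:
        added = "added a new photo"
    else:
        added = f"added {photo_count} new photos"

    # Gender selector; anything unrecognised falls back to the
    # default variant, like *[other] in the Fluent source.
    streams = {"male": "his stream", "female": "her stream"}
    stream = streams.get(user_gender, "their stream")

    return f"{user_name} {added} to {stream}."


print(shared_photos("Anne", 3, "female"))
# Anne added 3 new photos to her stream.
```

A translator for another language would write a differently shaped
function behind the same three arguments, and the calling code would
not change.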

Another example from projectfluent.org illustrates that individual
translations can add variations that are purely for their own use.

The English translation has "-sync-brand-name = Firefox Account", which
just assigns static text to a variable that will be used
frequently.

The Italian translation changes it to this:

    -sync-brand-name = {$first ->
       *[uppercase] Account Firefox
        [lowercase] account Firefox
    }

which serves the same purpose but lets the translator put this text at
both the beginning and end of a sentence.

Meanwhile, the Polish translator has changed it to this:

    -sync-brand-name = {$case ->
       *[nominative] Konto Firefox
        [genitive] Konta Firefox
        [accusative] Kontem Firefox
    }

which lets them choose the correct declension when needed:

    sync-signedout-title = Zaloguj do {-sync-brand-name(case: "genitive")}
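
In function terms, the Polish term has simply grown a parameter that
only the Polish messages pass. A hedged Python sketch of the same
thing, with names I made up and the variant strings copied from the
snippet above:

```python
# Variants of the Polish -sync-brand-name term, copied from above.
SYNC_BRAND_NAME = {
    "nominative": "Konto Firefox",
    "genitive": "Konta Firefox",
    "accusative": "Kontem Firefox",
}
DEFAULT_CASE = "nominative"  # the variant marked with * in the Fluent source


def sync_brand_name(case: str = DEFAULT_CASE) -> str:
    """Return the brand name in the requested case, or the default variant."""
    return SYNC_BRAND_NAME.get(case, SYNC_BRAND_NAME[DEFAULT_CASE])


def sync_signedout_title() -> str:
    """Polish sync-signedout-title: the message picks the case it needs."""
    return f"Zaloguj do {sync_brand_name(case='genitive')}"


print(sync_signedout_title())
# Zaloguj do Konta Firefox
```

The application only ever asks for sync-signedout-title; the case
parameter never leaves the Polish translation.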

All three translations can do their own thing, without needing to ask
the UI implementer to change anything and without coordinating with each
other first. They can also add these features gradually, as they refine
the translation. For example, they can start with a form that partly
dodges the grammar like "New emails: 42" at first, then later refine it
to "Found 42 new emails" once they get better coverage.

>> I recommend taking a look at Project Fluent
>> <https://www.projectfluent.org/>. It's a free-software implementation of
>> exactly the system that I've described. Translators write functions in a
>> syntax that is similar in some ways to both Javascript and an ini file,
>> which could be easily compiled into Elisp. (It's the successor to the
>> l20n project, which you might also have heard of.)
>
> How many translated languages for how many programs does this project
> have?

The main one that I know of is Firefox, which by my count has 96
translations. (See
<https://www.mozilla.org/en-US/firefox/all/#product-desktop-release>.)

> Anyway, the hard problems in translating some of the Emacs UI are
> elsewhere, as can be seen from the discussions to which I pointed.  We
> need to solve those first, and only after that worry about the issues
> you mention (if they are real).

I think that a system like Fluent moves most of the problems into the
translations, where they are more tractable (because each translation
only has to solve its own problems). Note that most of Firefox's
translations are maintained by volunteers. They don't even have to send
patches or commit files to version control; they use a web page to view
and edit the translation, as well as to preview the results live. The
same tools can be used for Emacs.

I will continue to peruse these previous threads that you've pointed
out, but I'm not aware of anything that would be harder than just going
through the code factoring out the text. There aren't any clever macros
that can help with that, just hard work.

db48x
