bug-gnu-utils

Re: format of #. reference


From: Bruno Haible
Subject: Re: format of #. reference
Date: Fri, 2 Apr 2010 20:45:04 +0100
User-agent: KMail/1.9.9

Hello,

Gabriel R. wrote in
<http://lists.gnu.org/archive/html/bug-gnu-utils/2010-03/msg00036.html>:

> This is feedback regarding the documentation at 
> http://www.gnu.org/software/gettext/manual/gettext.html#PO-Files
> 
> The format of the reference fields mentions that "Comment lines starting
> with #: contain references to the program's source code." However, .PO file
> are used for more than source code strings. I.e. Drupal's
> internationalization module exports POs with content using references in the
> format _module:id:place_. There are hundreds of translators, thousands of
> sites and millions of users benefiting from these POs.      
>
> It would be worth extending this definition to cover these cases.

This request makes no sense to me.

The purpose of the source references is to allow a PO file editor to
show the textual context of some message to a translator, in order to
give the translator some hints about the message that the programmer
did not give. At least some version of KBabel actually does this.

The source references are defined as file names and line numbers. This way,
every PO file editor knows how to open the file and highlight a particular
line. (The PO file editor also needs to know where to find the source
package: some configuration is needed. But this is a different issue.)
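
To illustrate how little a PO file editor needs in order to use such
references, here is a minimal sketch in Python. It is not code from gettext
or from any existing editor; the names SourceRef and parse_reference_comment
are made up for this example, and it assumes file names without spaces.

```python
from typing import List, NamedTuple, Optional


class SourceRef(NamedTuple):
    path: str
    line: Optional[int]  # None when the reference carries no line number


def parse_reference_comment(comment: str) -> List[SourceRef]:
    """Parse one '#:' comment line into file/line references.

    Assumption: references are separated by whitespace and file names
    contain no spaces.
    """
    if not comment.startswith("#:"):
        raise ValueError("not a reference comment: %r" % comment)
    refs = []
    for token in comment[2:].split():
        path, sep, number = token.rpartition(":")
        if sep and number.isdigit():
            refs.append(SourceRef(path, int(number)))
        else:
            # No trailing ':<digits>', so treat the whole token as a file name.
            refs.append(SourceRef(token, None))
    return refs
```

With the (path, line) pair in hand, opening the file and highlighting the
line is all that remains for the editor to do.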

If some POT file extractor program is unable to provide line numbers and
instead writes module:id:place, how is a PO file editor supposed to
highlight that location? Should it know about Drupal modules, and ids?
Then someone else will claim that PO file editors should also know
about VLC plug-ins and whatever way there is to structure source code
semantically. This makes no sense to me. A source code reference
should be given at the simplest possible syntactic level, and that
is a reference to a file and line (and possibly column).

On the Drupal site I did not find any example of PO files with the
syntax that you mention. But I found two other annoying and gratuitous
divergences:

   1) In <http://drupal.org/files/issues/modules-aef_image.pot> you have
      a file with mostly correct references, except that the first two
      have a line number of 0. This is still better than nothing, but
      can be improved: The referenced aef_image.info file is in
      <http://git.drupalfr.org/cgi-bin/gitweb.cgi?p=contributions-new-date/aef_image.git;a=blob;f=aef_image.info;h=73b013769fe63fb7313215be020527d002599653;hb=HEAD>
      which clearly is a text file with line numbers. The second message
      is at line number 3, so why not give that?

   2) In the potx-7.x-1.x-dev.tar.gz from <http://drupal.org/project/potx>
      there is a file potx.pot with a message like this:

        #: potx.module:76;29
        msgid "Extract"
        msgstr ""

      This apparently denotes lines 76 and 29 in file potx.module. This is
      a gratuitous incompatibility with the PO file format. It should be
      written as two separate references, each with one line number:

        #: potx.module:76
        #: potx.module:29
        msgid "Extract"
        msgstr ""
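
The conversion in case 2) is mechanical. The following Python sketch shows
one way an extractor or a clean-up script could do it. The function names
are hypothetical, and the "file:N;M" syntax is only my reading of what the
potx output means.

```python
import re
from typing import List

# Matches 'path:N;M[;...]', i.e. one file name followed by several
# semicolon-separated line numbers (the assumed potx syntax).
_MULTI = re.compile(r"^(?P<path>.+):(?P<lines>\d+(?:;\d+)+)$")


def split_multi_line_reference(token: str) -> List[str]:
    """Turn 'potx.module:76;29' into ['potx.module:76', 'potx.module:29']."""
    m = _MULTI.match(token)
    if m is None:
        return [token]  # already a plain reference; leave it untouched
    return ["%s:%s" % (m.group("path"), n) for n in m.group("lines").split(";")]


def normalise_reference_comment(comment: str) -> List[str]:
    """Rewrite one '#:' line as one '#:' line per file/line reference."""
    out = []
    for token in comment[2:].split():
        for ref in split_multi_line_reference(token):
            out.append("#: " + ref)
    return out
```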

There are cases when a line number is simply unavailable, such as when converting
from GUI descriptions that are not usually viewed in textual form, or when
converting from a third-party format that only had file references (such as
RST). But Drupal appears not to be in this camp?

> The best PO editor, and OSS
> at that, is Qt Linguist. The brave guys developing it took your definition
> and examples a strict specification. They went as far as making the editor
> replace the end of the #. lines with :0 to resemble line numbers.

This is quite understandable. The gettext tools don't add :0 there, though.

> This effectively corrupts the files, making Linguist useless for Drupal
> internationalization.

Why are files that have been modified in comments considered "corrupt"?
As source code can be restructured (lines inserted in files, or files
renamed), it is quite clear that source references are only an informative
part of a message. When matching two files, a tool should look at the msgids
and basically ignore the source references.
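
As a rough sketch of what "look at the msgids and ignore the source
references" can mean, here is a small Python example. It is not how msgmerge
or any other gettext tool is implemented. It assumes that entries are
separated by blank lines, that each msgid fits on one line, and that msgctxt
is not used; the function names are invented for the example.

```python
from typing import Dict, Iterable, List


def entries_by_msgid(po_lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each raw 'msgid ...' line to the non-comment lines of its entry.

    All lines starting with '#' are dropped, which includes the '#:'
    source references (and, in this simplification, obsolete '#~' entries).
    """
    entries = {}
    block = []
    for line in list(po_lines) + [""]:  # trailing "" flushes the last entry
        line = line.strip()
        if line.startswith("#"):
            continue
        if line:
            block.append(line)
        elif block:
            key = next((l for l in block if l.startswith("msgid ")), None)
            if key is not None:
                entries[key] = block
            block = []
    return entries


def same_messages(a: Iterable[str], b: Iterable[str]) -> bool:
    """True if both files carry the same messages, whatever their comments."""
    return entries_by_msgid(a) == entries_by_msgid(b)
```

Under this comparison, a file whose references were rewritten by an editor
still matches the original, because only msgid and msgstr are compared.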

> I reported the bug on their open tracker, but they 
> pointed out the gettext documentation as the reason.

If you want me to jump into that discussion, can you provide a pointer?

Bruno



