Re: [Po4a-dev]HTML module (first revision)
From: Martin Quinson
Subject: Re: [Po4a-dev]HTML module (first revision)
Date: Fri, 14 Feb 2003 16:31:54 +0100
User-agent: Mutt/1.5.3i
On Thu, Feb 13, 2003 at 03:03:22PM +0100, Laurent Hausermann wrote:
> Hi all,
>
> I have developed an HTML module for po4a. It still has some bugs and it's not
> perfect, but I think it's a good starting point.
>
> It uses HTML::TokeParser ( apt-get install libhtml-parser-perl )
>
> I sent the whole diff to Martin Quinson, not to this list (I can send it to
> any of you if you email me).
Ok, I committed this to CVS, so that others can see it.
This module isn't ready for release yet in my opinion. Here are my objections:
* The parser you used doesn't allow retrieving the line number. Why not
use the HTML::Parser module, which seems somewhat more powerful?
* The sentence:
a wonderful wife named "<a
href="mailto://Armelle.Quinson.fr">Armelle</a>",
and a marvelous little boy <a href="Tristan.html">Tristan</a>
(yup, it's part of my homepage ;) is changed to:
# type: td
#: FIXME:0
#, no-wrap
msgid "a wonderful wife named \""
msgstr ""
# type: a
#: FIXME:0
#, no-wrap
msgid "Armelle"
msgstr ""
# type: td
#: FIXME:0
#, no-wrap
msgid "\", and a marvelous little boy"
msgstr ""
# type: a
#: FIXME:0 FIXME:0
#, no-wrap
msgid "Tristan"
msgstr ""
That is to say that sentences are broken into subparts, which is BAD (see
http://www.ens-lyon.fr/~mquinson/l10n.html for a rationale).
* Your version doesn't put the entry type in the po file, which prevents
gettextization (see po4a(7) for more details). I quickly hacked in
support for that in the version in CVS, but it's not perfect yet.
I suggest that:
- you move to a parser that allows you to retrieve the line number (or
explain to me that I'm an idiot and that this parser does allow you to
retrieve the line number, and how)
- you look at the sgml module to see how we handle the fact that some tags
delimit a paragraph (like <p>), and should be translated, and that some
other tags shouldn't be touched because they don't delimit a sentence
(like <b>, <i> and so on)
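To make the two suggestions concrete, here is a minimal sketch of the idea. It
is written in Python with the standard html.parser module, not in Perl, and it
is not how the sgml module or HTML::Parser actually work. The names BLOCK_TAGS,
ParagraphExtractor and extract_units are hypothetical, the tag list is an
assumption, and nested block tags are not handled. It only shows that a block
tag can open a translation unit, that inline tags can be kept verbatim inside
it, and that the parser can report the line where the unit starts.

```python
from html.parser import HTMLParser

# Assumption: an illustrative set of paragraph-delimiting tags,
# not po4a's real configuration.
BLOCK_TAGS = {"p", "td", "th", "li", "h1", "h2", "h3", "div"}


class ParagraphExtractor(HTMLParser):
    """Collect one translation unit per block tag, inline markup included."""

    def __init__(self):
        super().__init__()
        self.units = []    # list of (tag, line, text)
        self._tag = None   # block tag currently open, if any
        self._line = 0     # line where that block tag started
        self._buf = []     # raw pieces of the current unit

    def _flush(self):
        # Join the pieces and collapse runs of whitespace to one space.
        text = " ".join("".join(self._buf).split())
        if self._tag is not None and text:
            self.units.append((self._tag, self._line, text))
        self._tag, self._buf = None, []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._flush()
            self._tag = tag
            self._line = self.getpos()[0]  # getpos() returns (line, column)
        elif self._tag is not None:
            # Inline tag such as <a> or <b>: keep it inside the unit.
            self._buf.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self._flush()
        elif self._tag is not None:
            self._buf.append(f"</{tag}>")

    def handle_data(self, data):
        if self._tag is not None:
            self._buf.append(data)


def extract_units(html):
    """Return the (tag, line, text) units found in an HTML string."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.units


SAMPLE = """<table><tr>
<td>a wonderful wife named "<a
href="mailto://Armelle.Quinson.fr">Armelle</a>",
and a marvelous little boy <a href="Tristan.html">Tristan</a></td>
</tr></table>
"""

# Print each unit in a PO-like layout; "sample.html" is a made-up file name.
for tag, line, text in extract_units(SAMPLE):
    escaped = text.replace('"', '\\"')
    print(f"# type: {tag}")
    print(f"#: sample.html:{line}")
    print(f'msgid "{escaped}"')
    print('msgstr ""')
```

With this approach the sentence from my homepage ends up as a single entry of
type td, with both <a> tags left inside the msgid, instead of four fragments.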
Sorry, but I really can't release this module as is...
Anyway, thanks for your contribution, it IS a good start.
Bye, Mt.
--
Source is provided to this software because we believe users have the right
to know exactly what a program is going to do before they run it.