auctex-devel
Re: [AUCTeX-devel] Why AUCTeX passes file via \input?


From: Ryszard Kubiak
Subject: Re: [AUCTeX-devel] Why AUCTeX passes file via \input?
Date: Mon, 26 May 2008 00:04:39 +0200
User-agent: Thunderbird 2.0.0.14 (X11/20080505)

Hi David,

> I never understood the Polish preoccupation with translation files.  The
> rest of the world uses the inputenc package which also happens to work
> with verbatim environments.  If we were talking about plain TeX, this
> would be somewhat different.

Here is why. I enclose two versions of a LaTeX document with Polish
diacritical characters. Both are encoded in ISO Latin-2; one of them
uses the inputenc package, while the other relies on the
--translate-file=il2-pl.tcx option.

You may run pdflatex on them and see the results for yourself. Both
produce proper, equivalent PDFs, yet they are not equivalent when it
comes to the messages they write to output files, in particular
the ones TeX places in its log file.

With the testInputenc.tex file, the Latin-2 encoded text:

  Pójdź, kiń-że tę chmurność w głąb flaszy.

is written out in the standard LaTeX notation:

  P\'ojd\'z, ki\'n-\.ze t\k e chmurno\'s\'c w g\IeC {\l }\k ab flaszy.

Such output is a problem for automatic postprocessing.
Imagine you build an index and want to sort it alphabetically.
Output encoded in the LaTeX convention forces you to write
a special sorting program; the standard sorting programs are useless
because they do not know such a fancy encoding. Not to mention
the poor readability of such text, which is so far from the diacritics
we write and want to see in the output.
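
To make the sorting problem concrete, here is a minimal sketch in Python. The index entries and the helper polish_key are hypothetical, made up for illustration; the only point is that a generic sort applied to the LaTeX notation puts every accented entry first, because the backslash sorts before lowercase letters, whereas the native characters can be ordered by a simple alphabet table.

```python
# Polish alphabet in dictionary order (q, v, x included for completeness).
POLISH_ALPHABET = "aąbcćdeęfghijklłmnńoópqrsśtuvwxyzźż"
RANK = {ch: i for i, ch in enumerate(POLISH_ALPHABET)}

def polish_key(word):
    """Sort key: position of each letter in the Polish alphabet."""
    # Characters outside the alphabet are ranked after all letters.
    return [RANK.get(ch, len(RANK)) for ch in word.lower()]

# Hypothetical index entries, once as native characters and once in
# the LaTeX notation that ends up in the written file.
native = ["żaba", "sad", "ślad", "łoś", "las"]
licr = [r"\.zaba", "sad", r"\'slad", r"\l o\'s", "las"]

native_sorted = sorted(native, key=polish_key)
licr_sorted = sorted(licr)  # what a generic sort tool would do

print(native_sorted)
print(licr_sorted)
```

With native characters the entries come out as las, łoś, sad, ślad, żaba. The LaTeX-encoded list comes out with the three backslash entries in front of "las" and "sad", which is not an alphabetical order any reader would expect.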

The situation gets worse inside TeX's log file, where even the LaTeX
encoding is not honoured; the text gets reconverted to something like:

Underfull \hbox (badness 10000) in paragraph at lines 9--9
[] \OT4/cmr/m/n/12 Pójdš, kiŤ-ťe tŚ chmur-noą˘ w gŞĄb fla-szy.
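
One plausible reading of the garbled line above, offered as an assumption rather than something established in this thread: the log prints the raw slot codes of the OT4 font encoding, and a Latin-2 viewer then displays those bytes as different letters. The sketch below uses OT4 slot numbers recalled from ot4enc.def; treat that table as an assumption. The function name as_seen_in_log is hypothetical.

```python
# Assumed OT4 font slots for the Polish letters in the sample sentence
# (recalled from ot4enc.def; an assumption, not a verified table).
OT4_SLOT = {
    "ą": 0xA1, "ć": 0xA2, "ę": 0xA6, "ł": 0xAA, "ń": 0xAB,
    "ś": 0xB1, "ź": 0xB9, "ż": 0xBB, "ó": 0xF3,
}

def as_seen_in_log(text):
    """Replace each letter by its assumed OT4 slot byte, then show the bytes as Latin-2."""
    raw = bytes(OT4_SLOT.get(ch, ord(ch)) for ch in text)
    return raw.decode("iso8859_2")

sentence = "Pójdź, kiń-że tę chmurność w głąb flaszy."
garbled = as_seen_in_log(sentence)
print(garbled)
```

Under these assumptions the result agrees with the log line, apart from the hyphenation marks TeX adds: only ó keeps its shape, because it happens to occupy the same position in both tables.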

The --translate-file option was introduced so that
the input encoding becomes transparent at all levels
of processing the input (you may inspect the results of
processing the testTCX file). This is important not only in
professional applications. For non-professional TeX users, log
files are far more readable when they refer to the input in
its original form and not to something double-encoded using
mysterious tricks.

The usefulness of --translate-file has been thoroughly
discussed within the TeX community. As has been said many times,
one CANNOT achieve transparency with respect to encoding
without changing TeX's engine. The --translate-file option does
change it; LaTeX's conventions do not.

Regards,
Rysiek





\documentclass[12pt]{article}
\usepackage[latin2]{inputenc}
\usepackage{polski}
\begin{document}
Pójdź, kiń-że tę chmurność w głąb flaszy.
\openout0="testInputenc.out"
\write0{Pójdź, kiń-że tę chmurność w głąb flaszy.}

\showhyphens{P\'ojd\'z, ki\'n-\.ze t\k e chmurno\'s\'c w g\IeC {\l }\k ab 
flaszy.}
P\'ojd\'z, ki\'n-\.ze t\k e chmurno\'s\'c w g\IeC {\l }\k ab flaszy.
\end{document}
%& --translate-file=il2-pl
\documentclass[12pt]{article}
\usepackage{polski}
\begin{document}
Pójdź, kiń-że tę chmurność w głąb flaszy.
\openout0="testTCX.out"
\write0{Pójdź, kiń-że tę chmurność w głąb flaszy.}
\showhyphens{Pójdź, kiń-że tę chmurność w głąb flaszy.}
\end{document}
