Re: different encodings for input and output file names and command line
From: Gavin Smith
Subject: Re: different encodings for input and output file names and command line
Date: Mon, 28 Feb 2022 16:30:04 +0000
On Mon, Feb 28, 2022 at 10:29:02AM +0100, Patrice Dumas wrote:
> Hello,
>
> First, some tests with locales different from the @documentencoding made
> me realize that it would make sense to have a different encoding of
> file names for output than for input. Indeed, it may make sense for the
> input file names (@include, @verbatiminclude) to match the document encoding
> when extracted from an archive, for example. But even in that case
> encoding the output names using the @documentencoding is very dubious,
> especially since we often use a different output encoding for the file
> content than the @documentencoding.
>
> Another use case points even more strongly in this direction. In the init
> files (latex2html, tex4ht, highlight syntax), some commands are launched
> from texi2any. It seems to me natural and less error prone to encode
> those command lines in the locale encoding. But this also forces the
> file names referenced in those command lines to be in the locale
> encoding.
>
> In any case, handling all the different situations makes it necessary
> to be able to decouple the input file name encoding from the output
> file name encoding, consistent with what we do with document
> encodings. I will write the corresponding code anyway and allow either
> the locale or the @documentencoding to be used in both cases, selected
> by customization variables. The question that remains is what to
> use for the defaults and for the command lines.
>
> My proposal is:
>
> * input file name encoding:
> My preference would be the locale, but Gavin's proposal to use
> @documentencoding also has merit, so let's stick to @documentencoding
> except on Windows, where the locale is used.
> * output file name encoding:
> Use the locale in the default case.
> * command lines called from texi2any:
> Always use the encoding already used for messages, which defaults to
> the locale encoding.
>
> Opinions, ideas?
Yes, I agree with all of this. When creating files, it would make sense to
use the locale encoding, or UTF-8 where the output format requires it
(as you said in your other mail).
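
To make the proposed defaults concrete, here is a rough sketch of how I read them. It is in Python rather than texi2any's Perl, and every name in it (pick_encodings, encode_file_name, the dictionary keys, the message_encoding parameter) is made up for illustration and does not correspond to any existing texi2any function or customization variable. It also ignores the case where the output format requires UTF-8.

```python
import locale
import sys


def pick_encodings(document_encoding, platform=sys.platform,
                   locale_encoding=None, message_encoding=None):
    """Return the default encodings for the three cases in the proposal.

    All names here are hypothetical; this only mirrors the list above.
    """
    if locale_encoding is None:
        # Encoding of the user's locale, without changing the process locale.
        locale_encoding = locale.getpreferredencoding(False)
    if message_encoding is None:
        # Messages default to the locale encoding.
        message_encoding = locale_encoding

    on_windows = platform.startswith("win")
    if on_windows or not document_encoding:
        # Windows (or no @documentencoding at all): fall back to the locale.
        input_encoding = locale_encoding
    else:
        input_encoding = document_encoding

    return {
        "input_file_names": input_encoding,
        "output_file_names": locale_encoding,
        "command_lines": message_encoding,
    }


def encode_file_name(name, encoding):
    """Encode a decoded file name to the bytes handed to the file system."""
    return name.encode(encoding)
```

With a Latin-1 document in a UTF-8 locale on GNU/Linux, this would look up an @include name as Latin-1 bytes but create output files and build command lines with UTF-8 bytes, which is the split being discussed.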