bug-texinfo

Re: Encoding customization variable names


From: Gavin Smith
Subject: Re: Encoding customization variable names
Date: Fri, 22 Jul 2022 17:55:01 +0100

On Fri, Jul 22, 2022 at 03:07:59PM +0100, Gavin Smith wrote:
> I think it's better if variables don't have to be set in combination.
> I feel that we could design a better interface.  Here's my attempt...
> 
> Options to allow:
> * Use document encoding
> * Use locale encoding
> * Specify encoding explicitly

...

> I'm going to make a start by stripping out the LOCALE_ prefix and then
> have a look to see if something else is needed to give these variables
> priority (from the user's perspective).
> 

One problem with the current implementation, as I see it, is that
DOC_ENCODING_FOR_INPUT_FILE_NAME controls the effect of
INPUT_FILE_NAME_ENCODING (formerly LOCALE_INPUT_FILE_NAME_ENCODING).
If DOC_ENCODING_FOR_INPUT_FILE_NAME is set to 1 then INPUT_FILE_NAME_ENCODING
has no effect, even if it has been set explicitly by the user on the
command line.

This is also confusing in texi2any.pl, where (LOCALE_)INPUT_FILE_NAME_ENCODING
is defined but is ineffectual in the default case.
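
To illustrate the behaviour described above, here is a simplified model
in Python.  It is not the actual Perl code: the function name and the
dictionary-based configuration are invented for illustration, and only
the two variable names come from the program.

```python
from typing import Optional


def current_input_file_name_encoding(
    config: dict, document_encoding: Optional[str]
) -> Optional[str]:
    """Simplified model of the behaviour described in this message."""
    # When the flag is set, the document encoding is used and the
    # explicit variable is never consulted.
    if config.get("DOC_ENCODING_FOR_INPUT_FILE_NAME"):
        return document_encoding
    # Only when the flag is unset does INPUT_FILE_NAME_ENCODING matter.
    return config.get("INPUT_FILE_NAME_ENCODING")


# The user asks for UTF-8 explicitly, but the flag is set, so the
# request is silently ignored in favour of the document encoding.
config = {
    "DOC_ENCODING_FOR_INPUT_FILE_NAME": 1,
    "INPUT_FILE_NAME_ENCODING": "utf-8",
}
ignored = current_input_file_name_encoding(config, "iso-8859-1")
```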

As I understand it, the configuration is "finished" by the time the
parser starts, so it is not possible to set INPUT_FILE_NAME_ENCODING or
other config variables from the document.  The configuration, whatever
it is, needs to be in place before the document starts being parsed.

My current idea is to save the locale encoding, perhaps in a hidden or
undocumented customization variable.  In
Texinfo::Convert::Converter::encoded_input_file_name and similar functions,
the value of INPUT_FILE_NAME_ENCODING should always take priority over both
the locale encoding and the document encoding.  If INPUT_FILE_NAME_ENCODING
is not given, then either the locale or document encoding should be used
according to the value of DOC_ENCODING_FOR_INPUT_FILE_NAME.  I think this
would be quite clear.
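
As a rough sketch of that priority order, written in Python rather than
the Perl of the actual converter: the function and parameter names are
invented for illustration, and the saved locale encoding is simply passed
in as an argument here instead of living in a hidden customization
variable.

```python
from typing import Optional


def resolve_input_file_name_encoding(
    input_file_name_encoding: Optional[str],
    doc_encoding_for_input_file_name: bool,
    document_encoding: Optional[str],
    locale_encoding: Optional[str],
) -> Optional[str]:
    """Pick the encoding for input file names (illustrative only)."""
    # 1. An explicitly given INPUT_FILE_NAME_ENCODING always wins.
    if input_file_name_encoding:
        return input_file_name_encoding
    # 2. Otherwise DOC_ENCODING_FOR_INPUT_FILE_NAME chooses between
    #    the document encoding and the saved locale encoding.
    if doc_encoding_for_input_file_name:
        return document_encoding
    return locale_encoding


# Explicit value given: it is used regardless of the flag.
explicit = resolve_input_file_name_encoding(
    "utf-8", True, "iso-8859-1", "iso-8859-15"
)
# No explicit value: the flag selects document or locale encoding.
from_document = resolve_input_file_name_encoding(
    None, True, "iso-8859-1", "iso-8859-15"
)
from_locale = resolve_input_file_name_encoding(
    None, False, "iso-8859-1", "iso-8859-15"
)
```

With this ordering, no variable has to be set in combination with
another for an explicit choice to take effect.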

I had thought about automatically setting DOC_ENCODING_FOR_INPUT_FILE_NAME if
INPUT_FILE_NAME_ENCODING was given, but this would be unlike anything else
in the program's configuration and would lead to inconsistencies depending
on where the configuration came from (command line, defaults, init files...),
so is probably best avoided.  Likewise with altering the priority of
DOC_ENCODING_FOR_INPUT_FILE_NAME and INPUT_FILE_NAME_ENCODING depending on
how/where they were set.


