Re: simplifying configuration of encoded characters/entities output
From: Gavin Smith
Subject: Re: simplifying configuration of encoded characters/entities output
Date: Wed, 29 Dec 2021 15:50:50 +0000
User-agent: Mutt/1.9.4 (2018-02-28)
On Wed, Dec 29, 2021 at 01:35:05PM +0100, Patrice Dumas wrote:
> Here is my proposal for HTML
> * remove FALLBACK_TO_NUMERIC_ENTITY, always setting it for HTML (and
> never for TexinfoXML, or always set, not sure about it, and probably
> does not matter much).
> * remove ENABLE_ENCODING_USE_ENTITY
> * if ENABLE_ENCODING is set, try to output unicode points encoded
> characters for every output, be it accents like @'e, @-commands like
> @l{} or dashes and quotes.
I'm happy with this.
I couldn't find much information online about whether using the
entities or using raw UTF-8 was better.
I did find this page:
https://docs.microsoft.com/en-us/troubleshoot/browsers/wrong-character-set-for-html-page
and I do remember seeing that some old browsers gave you the choice of
which encoding to use for a page. Hence, using entities seems like
a more reliable way of specifying a character, in case the page encoding
is set/detected incorrectly by some old browser.
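
To illustrate the failure mode, here is a minimal Python sketch (not
Texinfo code; the sample word and the Latin-1 misdetection are
assumptions chosen for the example). It shows raw UTF-8 bytes turning
into mojibake when decoded with the wrong charset, while an ASCII-only
numeric entity comes through intact.

```python
import html

text = "caf\u00e9"  # "café", sample word chosen for illustration

# Raw UTF-8 output, then read back by a browser that wrongly assumes Latin-1.
raw_bytes = text.encode("utf-8")
misread_raw = raw_bytes.decode("latin-1")

# Entity output is pure ASCII, so the same wrong guess does no harm.
entity_bytes = text.encode("ascii", "xmlcharrefreplace")
misread_entity = html.unescape(entity_bytes.decode("latin-1"))

print(repr(misread_raw))     # the accented letter is garbled
print(repr(misread_entity))  # the original text is recovered
```

This only covers encodings that are ASCII-compatible, which is the
usual case for the misdetections I have in mind.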
If a document has a lot of non-ASCII characters (e.g. if it's written
in Chinese), then the behaviour you describe for ENABLE_ENCODING would
be better.
Agreed that the choice for TexinfoXML doesn't matter.
>
> That would mean 3 possibilities for HTML
> * default, use named entities if possible, fallback to numeric entities
> * --enable-encoding triggers outputting encoded characters
> * with USE_NUMERIC_ENTITY output numeric entities
>
>
> Note that in most if not all cases, the actual output would still be
> guarded by the OUTPUT_ENCODING_NAME value, such that the conversions
> with ENABLE_ENCODING set are only done when they are known to be
> possible.
>
> Opinions, ideas?
>
> --
> Pat
>
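
The three HTML possibilities quoted above, together with the
OUTPUT_ENCODING_NAME guard, could be sketched roughly as follows. This
is a Python illustration, not the actual Texinfo converter: the function
and parameter names are hypothetical, the precedence of ENABLE_ENCODING
over USE_NUMERIC_ENTITY when both are set is my assumption, and Python's
HTML 4 entity table stands in for whatever table the converter uses.

```python
from html.entities import codepoint2name


def can_encode(ch, encoding):
    """Return True if ch is representable in the given output encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def render_char(ch, enable_encoding=False, use_numeric_entity=False,
                output_encoding="utf-8"):
    """Render one character for HTML (hypothetical helper).

    Escaping of markup characters such as & and < is out of scope here.
    """
    cp = ord(ch)
    if cp < 128:
        return ch  # ASCII is passed through unchanged
    # --enable-encoding: emit the character itself, but only when the
    # output encoding is known to be able to represent it.
    if enable_encoding and can_encode(ch, output_encoding):
        return ch
    # Default: named entity if one exists.
    if not use_numeric_entity and cp in codepoint2name:
        return "&" + codepoint2name[cp] + ";"
    # USE_NUMERIC_ENTITY, or fallback when no named entity is available.
    return "&#%d;" % cp


for ch in ("\u00e9", "\u0142"):  # e-acute as from @'e, l-stroke as from @l{}
    print(render_char(ch),
          render_char(ch, use_numeric_entity=True),
          render_char(ch, enable_encoding=True, output_encoding="latin-1"))
```

With a Latin-1 output encoding, the l-stroke cannot be written directly,
so the sketch falls back to an entity even though encoding is enabled.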