Re: [Libcdio-devel] How tolerant to be towards CD-TEXT character set mis
From: Thomas Schmitt
Subject: Re: [Libcdio-devel] How tolerant to be towards CD-TEXT character set mislabeling ?
Date: Sun, 28 Apr 2019 13:40:01 +0200
Hi,
Ludolf Holzheid wrote:
> I would vote for even switching to CP1252, which further extends
> ISO-8859-1.
A good point, provided that it is really true that each ISO-8859-1
character has the same byte code in CP1252.
The following documents confirm this:
https://www.i18nqa.com/debug/table-iso8859-1-vs-windows-1252.html
https://en.wikipedia.org/wiki/ISO/IEC_8859-1#Windows-1252
Afaik, no ISO-8859-X with X != 1 is a true superset of ISO-8859-1.
So let's consider replacing "ASCII" and "ISO-8859-1" by "CP1252" for
utmost reader tolerance:
  case CDTEXT_CHARCODE_ISO_8859_1:
    /* default */
    /* ISO-8859-1 is a subset of CP1252. If non-ISO-8859-1 characters
     * are present against CD-TEXT specification, CP1252 gives more hope
     * for a readable result than telling iconv to be picky.
     */
    charset = (char *) "CP1252";
    break;
  case CDTEXT_CHARCODE_ASCII:
    /* ASCII is a subset of ISO-8859-1. Some CDs announce it but then
     * have 8-bit characters in their text. Trying CP1252 gives
     * more hope for a readable result than telling iconv to be picky.
     */
    charset = (char *) "CP1252";
    break;
But unlike "ISO-8859-1", which has already been tested extensively,
libcdio has never converted CD-TEXT from CP1252 so far.
So there should be some extra testing.
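A first sanity check of the conversion could be done outside of libcdio.
The following is only a sketch: cp1252_to_utf8() is a hypothetical helper,
not a libcdio function, and it assumes that the local iconv knows the
charset name "CP1252" and that UTF-8 is the desired output.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libcdio: convert a NUL-terminated
 * CP1252 byte string to UTF-8.
 * Returns 0 on success, -1 on failure.
 */
int cp1252_to_utf8(const char *in, char *out, size_t out_size)
{
  iconv_t cd;
  char *in_pt = (char *) in;
  char *out_pt = out;
  size_t in_left = strlen(in);
  size_t out_left;
  size_t ret;

  if (out_size == 0)
    return -1;
  out_left = out_size - 1;  /* reserve room for the terminating NUL */

  cd = iconv_open("UTF-8", "CP1252");
  if (cd == (iconv_t) -1)
    return -1;              /* this iconv does not offer the conversion */
  ret = iconv(cd, &in_pt, &in_left, &out_pt, &out_left);
  iconv_close(cd);
  if (ret == (size_t) -1)
    return -1;              /* unconvertible byte or output buffer too small */
  *out_pt = '\0';
  return 0;
}
```

Byte 0x92 is a handy test case: CP1252 defines it as the right single
quotation mark (U+2019), whereas in ISO-8859-1 it falls into the range of
C1 control codes. Note that CP1252 leaves five byte values undefined
(0x81, 0x8D, 0x8F, 0x90, 0x9D), so iconv may still refuse some input.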
To Serge (the bug reporter):
Will you find time and maybe a few more audio CDs to test it ?
Have a nice day :)
Thomas