Am 19.10.19, 07:04, Andrew Janke <address@hidden> schrieb:
Hi, Octave and io maintainers,
I'm confused by the Unicode support in the io package. In particular,
the functions unicode2utf8 and utf82unicode, and the "encode_utf"
options in some of the ods/xls read/write functions.
What is the encoding that utf82unicode/unicode2utf8 are calling
"unicode" here? It looks like it's doing a single-byte encoding,
treating each byte as an unsigned int 0-255, and treating those 0-255
values directly as Unicode code point values. That's not any of the
standard Unicode encodings. (But I think it is exactly the same as
Latin-1/ISO 8859-1.)
As I understand it, since about Octave 4.4, Octave's internal encoding
(that is, how it interprets Octave char arrays) is either UTF-8 or an
opaque array of bytes; it's never in the "system code page" or some
other locale-specific encoding.
Is this UTF-8 support in io still relevant/correct? Maybe it should be
deprecated or renamed/removed? Since Octave now supports UTF-8, I think
you'd want to just leave UTF-8 text as is in all cases.
Cheers,
Andrew