Re: Unicode support in io Forge package

Andrew,

IIRC, the interface uses UTF-16. The conversion function really only works for Latin-1 encoded input.
There really is no UTF-8 in this. TBH, I chose the name before I got a sufficient grasp of that encoding mess.

Forge packages usually target a wider range of Octave versions. I don't know whether this workaround can be safely removed without losing support for Latin-1 in older Octave versions supported by io.
I didn't re-read the code, but I believe that "unicode2native" is used if it is available.

Markus

PS: Sorry for top-posting. My mobile phone app doesn't allow otherwise.
--
This message was sent from my Android mobile phone with GMX Mail.

On 19.10.19, 07:04, Andrew Janke <address@hidden> wrote:

Hi, Octave and io maintainers,

I'm confused by the Unicode support in the io package. In particular,
the functions unicode2utf8 and utf82unicode, and the "encode_utf"
options in some of the ods/xls read/write functions.

What is the encoding that utf82unicode/unicode2utf8 are calling
"unicode" here? It looks like it's doing a single-byte encoding,
treating each byte as an unsigned int 0-255, and treating those 0-255
values directly as Unicode code point values. That's not any of the
standard Unicode encodings. (But I think it is exactly the same as
Latin-1/ISO 8859-1.)
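
To make that concrete, here is a minimal sketch in Python (not Octave, and not the io package's code) of the mapping described above. The helper name bytes_as_code_points is hypothetical and only stands in for the observed behaviour: each byte value 0-255 is used directly as a code point, which gives the same result as decoding the bytes as Latin-1.

```python
def bytes_as_code_points(raw: bytes) -> str:
    """Treat each byte value 0-255 directly as a Unicode code point.

    Hypothetical helper for illustration; not taken from the io package.
    """
    return "".join(chr(b) for b in raw)


# For every possible byte value, the mapping matches Latin-1 decoding.
all_bytes = bytes(range(256))
matches_latin1 = bytes_as_code_points(all_bytes) == all_bytes.decode("latin-1")

# Example: 0xFC is U+00FC (u with diaeresis) in Latin-1.
text = bytes_as_code_points(b"M\xfctzel")

# Encoding that text as UTF-8 turns the single byte 0xFC into two bytes.
utf8 = text.encode("utf-8")
```

So a conversion built on this mapping only gives correct results when the input bytes really are Latin-1.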

As I understand it, since about Octave 4.4, Octave's internal encoding
(that is, how it interprets Octave char arrays) is either UTF-8 or an
opaque array of bytes; it's never in the "system code page" or some
other locale-specific encoding.

Is this UTF-8 support in io still relevant/correct? Maybe it should be
deprecated or renamed/removed? Since Octave now supports UTF-8, I think
you'd want to just leave UTF-8 text as is in all cases.
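
A small Python sketch (again only an illustration, not the io code) of why leaving UTF-8 text alone matters: if text that is already UTF-8 is run through a Latin-1-to-UTF-8 style conversion, every non-ASCII character is encoded a second time. The variable names are made up for this example.

```python
# "Mützel" already encoded as UTF-8: the u-umlaut is the two bytes C3 BC.
already_utf8 = b"M\xc3\xbctzel"

# Wrong assumption: interpret those bytes as Latin-1, then encode as UTF-8.
# Each of the two bytes becomes its own code point (U+00C3 and U+00BC).
double_encoded = already_utf8.decode("latin-1").encode("utf-8")

# Reading the result back as UTF-8 gives two garbage characters
# where the single u-umlaut used to be.
garbled = double_encoded.decode("utf-8")

# Passing the bytes through unchanged preserves the original text.
untouched = already_utf8.decode("utf-8")
```

Here untouched has 6 characters while garbled has 7, because the single non-ASCII character was split into two.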

Cheers,
Andrew

From:	Markus Mützel
Subject:	Re: Unicode support in io Forge package
Date:	Sat, 19 Oct 2019 08:16:42 +0200