[Octave-bug-tracker] [bug #57107] regexp functions fail on ISO-8859 input

From:	Markus Mützel
Subject:	[Octave-bug-tracker] [bug #57107] regexp functions fail on ISO-8859 input
Date:	Wed, 23 Oct 2019 16:43:44 -0400 (EDT)
User-agent:	Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0

Follow-up Comment #3, bug #57107 (project octave):

There is at least one modern OS that still uses 8-bit encodings by default:
Windows 10 and its predecessors.
With a Western locale, the default encoding might well be ISO-8859-1 (or
ANSI/CP1252).

But I now see that this bug is marked as affecting GNU/Linux. So it will most
probably be necessary to specify the encoding when fopen'ing a file for
reading strings.
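
To illustrate what "specifying the encoding when opening a file" buys you, here is a minimal sketch in Python rather than Octave. It is not Octave's API; the helper name `read_text` and the sample file are made up for the example. The point is only that the same bytes decode cleanly once the 8-bit encoding is named, and fail when they are assumed to be UTF-8.

```python
import re
import tempfile
from pathlib import Path


def read_text(path, encoding):
    """Read a whole file as text, decoding its bytes with the given encoding.

    Hypothetical helper for this sketch; not part of Octave or this bug report.
    """
    with open(path, "r", encoding=encoding) as fh:
        return fh.read()


# Hypothetical sample: "Müller" stored as ISO-8859-1, so 'ü' is the single byte 0xFC.
sample = Path(tempfile.mkdtemp()) / "latin1.txt"
sample.write_bytes(b"M\xfcller\n")

# With the encoding stated explicitly, the regular expression sees one character for 'ü'.
text = read_text(sample, "iso-8859-1")
match = re.search(r"M.ller", text)

# Decoding the same bytes as UTF-8 fails, because a lone 0xFC byte is not valid UTF-8.
try:
    read_text(sample, "utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
```

Here `match` is found and `utf8_ok` ends up false, which mirrors the situation in this report: the input is fine as ISO-8859-1 but is not valid UTF-8.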

Matlab's internal encoding is 16 bits wide (maybe UCS-2). Maybe it reads the
non-UTF-8 bytes as is, and they "happen" to map to the Unicode code points (for
a Western-encoded file).
I am not sure whether we should do something similar and transcode from a
default 8-bit encoding if we detect that a source contains invalid UTF-8.
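
A rough sketch of that fallback idea, again in Python and not a proposal for how Octave should implement it. The function name `decode_with_fallback` and the choice of ISO-8859-1 as the default 8-bit encoding are assumptions for the example. It also checks the "happen to map" observation: every ISO-8859-1 byte value equals its Unicode code point, while CP1252 differs in the 0x80..0x9F range.

```python
def decode_with_fallback(raw, fallback="iso-8859-1"):
    """Decode bytes as UTF-8; if they are not valid UTF-8, use a fallback 8-bit encoding.

    Returns the decoded text together with the name of the encoding that was used.
    Hypothetical helper; the fallback encoding is an assumption of this sketch.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return raw.decode(fallback), fallback


# Every ISO-8859-1 byte value is numerically equal to its Unicode code point,
# so widening each byte "as is" yields the right characters for such input.
latin1_is_identity = all(
    bytes([b]).decode("iso-8859-1") == chr(b) for b in range(256)
)

# CP1252 is not an identity mapping: byte 0x80 is the euro sign, U+20AC.
cp1252_euro = b"\x80".decode("cp1252")

latin1_result = decode_with_fallback(b"caf\xe9")            # invalid UTF-8
utf8_result = decode_with_fallback("café".encode("utf-8"))  # valid UTF-8
```

One caveat with this kind of detection: some 8-bit byte sequences are also valid UTF-8 by coincidence, so the check can only tell that input is definitely not UTF-8, never that it definitely is.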

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?57107>

