[Octave-bug-tracker] [bug #57107] regexp functions fail on ISO-8859 input

From:	Markus Mützel
Subject:	[Octave-bug-tracker] [bug #57107] regexp functions fail on ISO-8859 input
Date:	Thu, 24 Oct 2019 10:17:45 -0400 (EDT)
User-agent:	Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:71.0) Gecko/20100101 Firefox/71.0

Follow-up Comment #10, bug #57107 (project octave):

After a little research, I don't think that we should sniff the encoding.
Instead we might want to select one of the fallback options for decoding
invalid UTF-8 byte sequences [1].

I'd personally vote for the option:
"The Unicode code points U+0080–U+00FF with the same value as the byte, thus
interpreting the bytes according to ISO-8859-1."

That also most closely matches what Matlab seems to be doing, and it would
also resolve the original report.
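
To illustrate the proposed fallback, here is a minimal sketch in Python (not
Octave's actual implementation, which this message does not describe). It
decodes input as UTF-8 and maps every byte that is not part of a valid UTF-8
sequence to the code point with the same value, i.e. it reads that byte as
ISO-8859-1. The names `latin1_fallback` and `decode_with_fallback` are
hypothetical and chosen only for this example.

```python
import codecs


def _latin1_fallback(err):
    """Error handler: reinterpret undecodable bytes as ISO-8859-1.

    Hypothetical helper for illustration only.
    """
    if not isinstance(err, UnicodeDecodeError):
        raise err
    # The offending bytes are all >= 0x80, so decoding them as Latin-1
    # yields the code points U+0080..U+00FF with the same numeric value.
    bad = err.object[err.start:err.end]
    return bad.decode("latin-1"), err.end


# Register the handler under an assumed name so bytes.decode can use it.
codecs.register_error("latin1_fallback", _latin1_fallback)


def decode_with_fallback(data: bytes) -> str:
    """Decode as UTF-8; fall back to ISO-8859-1 for invalid bytes only."""
    return data.decode("utf-8", errors="latin1_fallback")


if __name__ == "__main__":
    # ISO-8859-1 input: 0xFC is never valid in UTF-8.
    print(decode_with_fallback(b"M\xfctzel"))
    # Valid UTF-8 input is decoded unchanged.
    print(decode_with_fallback("M\u00fctzel".encode("utf-8")))
```

With this approach valid UTF-8 is left untouched, and no encoding detection
is needed: only the bytes that fail UTF-8 validation are reinterpreted.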

[1]: https://en.wikipedia.org/wiki/UTF-8#Invalid_byte_sequences

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?57107>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/
