[Octave-bug-tracker] [bug #35910] Incorrect regex matching of multi-byte UTF-8 characters
From: Markus Mützel
Subject: [Octave-bug-tracker] [bug #35910] Incorrect regex matching of multi-byte UTF-8 characters
Date: Thu, 17 Oct 2019 11:14:12 -0400 (EDT)
Follow-up Comment #23, bug #35910 (project octave):
This is a general difference between Octave and Matlab that affects more than
just this function:
Matlab seems to encode characters as UCS-2 (a subset of UTF-16). Octave uses
UTF-8 (in more and more places, and maybe consistently one day).
Maybe you could try the following?
regexprep (native2unicode (181, 'latin1'), 'µ', 'micro')
The Unicode code points 0-255 coincide with Latin-1, so a single Latin-1 byte
and the corresponding UCS-2 code unit have the same numeric value. That's why
it happens to work in Matlab.
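
To make the encoding difference concrete, here is a minimal sketch, assuming an
Octave version that provides native2unicode and unicode2native. The variable
names are only illustrative. It shows that the character with code point 181
occupies two bytes in Octave's UTF-8 char array, while it is a single byte in
Latin-1:

```octave
% Interpret the value 181 as a Latin-1 byte and convert it to the
% UTF-8 encoded char array that Octave uses internally.
s = native2unicode (uint8 (181), 'latin1');

% The char array holds the UTF-8 code units, so it has two elements.
bytes_utf8 = double (s);

% Converting back to Latin-1 yields the original single byte.
bytes_latin1 = double (unicode2native (s, 'latin1'));

printf ('UTF-8 bytes:   %s\n', mat2str (bytes_utf8));
printf ('Latin-1 bytes: %s\n', mat2str (bytes_latin1));
```

A pattern that is typed as a literal character in a UTF-8 encoded source is
also stored as those two bytes, which is presumably why converting the input
with native2unicode first lets the pattern and the subject string agree.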
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?35910>