From: Markus Mützel
Subject: [Octave-bug-tracker] [bug #35910] Incorrect regex matching of multi-byte UTF-8 characters
Date: Thu, 17 Oct 2019 11:14:12 -0400 (EDT)
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0

Follow-up Comment #23, bug #35910 (project octave):

This is a general difference between Octave and Matlab that is not limited to
this function:
Matlab seems to encode characters as UCS-2 (a subset of UTF-16), whereas Octave
uses UTF-8 (in more and more places, and maybe consistently one day).

Maybe you could try the following?

regexprep (native2unicode (181, 'latin1'), 'µ', 'micro')


The first 256 Unicode code points (0-255) coincide with Latin-1. That is why
it happens to work in Matlab.
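
The byte-level difference can be illustrated outside of Octave and Matlab. The
following is only a sketch in Python (chosen here for illustration, it is not
part of the original report); the variable names are made up. It shows how
code point 181 (the micro sign) looks in Latin-1, UTF-8 and UTF-16:

```python
# Illustration only: Python stands in for the encodings discussed above.
MICRO = chr(181)  # U+00B5 MICRO SIGN

# Latin-1 stores the code point as a single byte with the same value.
latin1_bytes = MICRO.encode("latin-1")

# UTF-8 (the encoding Octave uses) needs two bytes for this character.
utf8_bytes = MICRO.encode("utf-8")

# UTF-16 (of which UCS-2 is a subset) needs a single 16-bit unit here.
utf16_units = len(MICRO.encode("utf-16-le")) // 2

# Code points 0-255 map one-to-one onto Latin-1 byte values.
latin1_matches_unicode = all(
    chr(cp).encode("latin-1") == bytes([cp]) for cp in range(256)
)

print(list(latin1_bytes))       # [181]
print(list(utf8_bytes))         # [194, 181]
print(utf16_units)              # 1
print(latin1_matches_unicode)   # True
```

So a value of 181 on its own is a complete character in Latin-1 and in a
16-bit encoding, but only the second half of a two-byte sequence in UTF-8.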

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?35910>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/



