[Octave-bug-tracker] [bug #35910] Incorrect regex matching of multi-byte UTF-8 characters
From: Mike Miller
Subject: [Octave-bug-tracker] [bug #35910] Incorrect regex matching of multi-byte UTF-8 characters
Date: Sun, 28 Jul 2019 17:01:04 -0400 (EDT)
User-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36
Follow-up Comment #13, bug #35910 (project octave):
I'm seeing some strange regressions with this change in a minimal container
running Octave with some regular expressions containing UTF-8 characters (in
the doctest package).
Example regexerror.m:

    c = regexp ('lorem ipsum', '^\s*(⇒|=>|⊣|-\|)', 'lineanchors');

    $ octave regexerror.m
    error: regexp: unrecognized character after (? or (?- at position 13 of expression
    error: called from
        regexerror at line 1 column 3
I get different results depending on whether this script is run from the
command shell or in an interactive Octave session, and on whether LANG or any
of the LC_* variables in the environment names a UTF-8 locale. This suggests
that there is something I could configure or install in my environment to fix
this, but I have no idea what that is at the moment. Any ideas?
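
One way to narrow down the environment dependence described above is to run the same script under a few explicit locale settings and compare the output. The following is only a sketch under stated assumptions: `run_in_locales` is a hypothetical helper (not part of Octave or doctest), and the locale names `C` and `C.UTF-8` are assumptions that may need to be replaced with whatever `locale -a` lists on the system in question.

```shell
#!/bin/sh
# Hypothetical helper: run a command once per locale setting and label
# each result, so that locale-dependent behavior is easy to compare.
# Assumption: "C.UTF-8" is available; substitute e.g. en_US.UTF-8 if not.
run_in_locales () {
  for loc in "" C C.UTF-8; do
    (
      # Clear inherited settings so only the value under test applies.
      unset LANG LC_ALL LC_CTYPE
      if [ -n "$loc" ]; then
        LC_ALL=$loc
        export LC_ALL
      fi
      printf '== LC_ALL=%s ==\n' "${loc:-<unset>}"
      status=0
      "$@" 2>&1 || status=$?
      printf 'exit status: %s\n' "$status"
    )
  done
}

# Only attempt the reproduction if octave and the example script exist.
if command -v octave >/dev/null 2>&1 && [ -f regexerror.m ]; then
  run_in_locales octave regexerror.m
fi
```

This covers only the non-interactive case; the interactive session mentioned above would have to be checked separately, for example by starting Octave by hand under each setting.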
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?35910>
_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/