bug#34492: rx: ASCII-raw byte ranges comprise all of Unicode
From: Mattias Engdegård
Subject: bug#34492: rx: ASCII-raw byte ranges comprise all of Unicode
Date: Fri, 15 Feb 2019 19:23:56 +0100

`rx' incorrectly treats a character range that starts in ASCII and ends in the
raw bytes as covering every code in between, which includes all non-ASCII
Unicode characters.
This causes (any "\000-\377" ?Å) to be simplified to (any "\000-\377"), which
is not at all the same thing: [\000-\377] really means [\000-\177\200-\377],
a split that is normally made by the Emacs regexp engine. The two ranges
are not contiguous at the character code level.
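
The gap between the two ranges can be illustrated with a minimal sketch. It
assumes Emacs's internal character representation, in which the raw bytes
\200-\377 have the codes #x3FFF80-#x3FFFFF, and it is not the code used by
`rx' itself. The names `my-in-range-p' and `my-ascii-raw-demo' are
hypothetical and exist only for this illustration.

```elisp
;; Hypothetical helper, for illustration only: is CHAR's code inside
;; the plain code interval [FROM, TO]?
(defun my-in-range-p (char from to)
  (<= from char to))

;; Hypothetical demo: compare the naive single interval with the two
;; disjoint intervals (ASCII, then raw bytes) for the character Å.
(defun my-ascii-raw-demo ()
  (let ((last-ascii ?\177)                             ; code #x7F
        (first-raw (unibyte-char-to-multibyte #x80))   ; code #x3FFF80
        (last-raw (unibyte-char-to-multibyte #xff)))   ; code #x3FFFFF
    (list
     ;; Reading \000-\377 as one interval of codes swallows Å (#xC5).
     (my-in-range-p ?Å 0 last-raw)
     ;; The two intervals ASCII and raw bytes do not contain Å.
     (or (my-in-range-p ?Å 0 last-ascii)
         (my-in-range-p ?Å first-raw last-raw)))))
```

Under these assumptions, (my-ascii-raw-demo) evaluates to (t nil): the
single-interval reading contains Å, while the ASCII-plus-raw-bytes reading
does not.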
It's a sleeper bug that was awakened by my fixing bug#33205, so I'm to blame
for not checking this.