From: Ludovic Courtès
Subject: [Guile-commits] 01/02: Fix some invalid unicode handling issues with suspendable ports.
Date: Mon, 20 Mar 2023 18:25:33 -0400 (EDT)
civodul pushed a commit to branch main
in repository guile.
commit cba2e7e3fec3c781230570f5d1ef070625eeeda8
Author: Christopher Baines <mail@cbaines.net>
AuthorDate: Mon Mar 20 09:15:13 2023 +0000
Fix some invalid unicode handling issues with suspendable ports.
Fixes <https://bugs.gnu.org/62290>.
Based on the implementation in ports.c. I don't understand what this
code is really doing, but the suspendable ports implementation differs
from the similar C code for a couple of inequalities.
* module/ice-9/suspendable-ports.scm (decode-utf8, bad-utf8-len): Flip a
couple of inequalities.
* test-suite/tests/ports.test ("string ports"): Add additional invalid
UTF-8 test case.
* NEWS: Update.
Signed-off-by: Ludovic Courtès <ludo@gnu.org>
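
For readers unsure what the flipped inequalities are checking: they appear to match the second-byte restrictions for well-formed UTF-8 (E0 and F0 exclude overlong encodings, ED excludes surrogates, F4 excludes code points above U+10FFFF). The following is a minimal sketch of those restrictions as a standalone predicate. It is not the code from suspendable-ports.scm, and the name valid-second-byte? is hypothetical, introduced here only for illustration.

```scheme
;; Hypothetical helper, not part of (ice-9 suspendable-ports).
;; LEAD is the first byte of a multi-byte UTF-8 sequence and SECOND is
;; the byte that follows it; both are integers in the range 0..255.
(define (valid-second-byte? lead second)
  (and (= (logand second #xc0) #x80)    ; must be a continuation byte 80..BF
       (case lead
         ((#xe0) (>= second #xa0))      ; below A0 would be an overlong 3-byte form
         ((#xed) (<= second #x9f))      ; above 9F would encode a surrogate
         ((#xf0) (>= second #x90))      ; below 90 would be an overlong 4-byte form
         ((#xf4) (<= second #x8f))      ; above 8F would exceed U+10FFFF
         (else #t))))

;; The byte pair from the new test case: F4 followed by A4 is rejected.
(valid-second-byte? #xf4 #xa4)          ; => #f
```

Under this reading, the previous `>=` comparisons for ED and F4 accepted second bytes in the disallowed upper range and rejected those in the allowed lower range, which is consistent with the new F4 A4 BD A4 test case expecting decoding errors.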
---
NEWS | 3 +++
module/ice-9/suspendable-ports.scm | 8 ++++----
test-suite/tests/ports.test | 7 +++++++
3 files changed, 14 insertions(+), 4 deletions(-)
diff --git a/NEWS b/NEWS
index a55cb583b..167b0f2ad 100644
--- a/NEWS
+++ b/NEWS
@@ -23,6 +23,9 @@ the compiler reports it as "possibly unused".
* Bug fixes
+** (ice-9 suspendable-ports) incorrect UTF-8 decoding
+ (https://bugs.gnu.org/62290)
+
* Hashing of UTF-8 symbols with non-ASCII characters avoids corruption
This issue could cause `scm_from_utf8_symbol' and
diff --git a/module/ice-9/suspendable-ports.scm b/module/ice-9/suspendable-ports.scm
index a823f1d37..9fac1df62 100644
--- a/module/ice-9/suspendable-ports.scm
+++ b/module/ice-9/suspendable-ports.scm
@@ -419,7 +419,7 @@
(= (logand u8_2 #xc0) #x80)
(case u8_0
((#xe0) (>= u8_1 #xa0))
- ((#xed) (>= u8_1 #x9f))
+ ((#xed) (<= u8_1 #x9f))
(else #t)))
(kt (integer->char
(logior (ash (logand u8_0 #x0f) 12)
@@ -436,7 +436,7 @@
(= (logand u8_3 #xc0) #x80)
(case u8_0
((#xf0) (>= u8_1 #x90))
- ((#xf4) (>= u8_1 #x8f))
+ ((#xf4) (<= u8_1 #x8f))
(else #t)))
(kt (integer->char
(logior (ash (logand u8_0 #x07) 18)
@@ -462,7 +462,7 @@
((< buffering 2) 1)
((not (= (logand (ref 1) #xc0) #x80)) 1)
((and (eq? first-byte #xe0) (< (ref 1) #xa0)) 1)
- ((and (eq? first-byte #xed) (< (ref 1) #x9f)) 1)
+ ((and (eq? first-byte #xed) (> (ref 1) #x9f)) 1)
((< buffering 3) 2)
((not (= (logand (ref 2) #xc0) #x80)) 2)
(else 0)))
@@ -471,7 +471,7 @@
((< buffering 2) 1)
((not (= (logand (ref 1) #xc0) #x80)) 1)
((and (eq? first-byte #xf0) (< (ref 1) #x90)) 1)
- ((and (eq? first-byte #xf4) (< (ref 1) #x8f)) 1)
+ ((and (eq? first-byte #xf4) (> (ref 1) #x8f)) 1)
((< buffering 3) 2)
((not (= (logand (ref 2) #xc0) #x80)) 2)
((< buffering 4) 3)
diff --git a/test-suite/tests/ports.test b/test-suite/tests/ports.test
index 66e10e3dd..1b30e1a68 100644
--- a/test-suite/tests/ports.test
+++ b/test-suite/tests/ports.test
@@ -1059,6 +1059,13 @@
eof))
(test-decoding-error (#xf0 #x88 #x88 #x88) "UTF-8"
+ (error ;; 2nd byte should be in the 90..BF range
+ error ;; 88: not a valid starting byte
+ error ;; 88: not a valid starting byte
+ error ;; 88: not a valid starting byte
+ eof))
+
+ (test-decoding-error (#xf4 #xa4 #xbd #xa4) "UTF-8"
(error ;; 2nd byte should be in the 90..BF range
error ;; 88: not a valid starting byte
error ;; 88: not a valid starting byte