[Chicken-janitors] #1182: utf8 egg silently accepts invalid byte sequences
From: Chicken Trac
Subject: [Chicken-janitors] #1182: utf8 egg silently accepts invalid byte sequences
Date: Fri, 27 Mar 2015 23:08:44 -0000
#1182: utf8 egg silently accepts invalid byte sequences
------------------------+---------------------------------------------------
Reporter: syn | Owner: ashinn
Type: defect | Status: new
Priority: major | Milestone: someday
Component: extensions | Version: 4.9.x
Keywords: utf8 |
------------------------+---------------------------------------------------
I noticed that some procedures of the `utf8` egg silently accept invalid
byte sequences. This might have security implications. Consider the
following case (unprefixed procedures are the core versions; procedures
from the `utf8` egg are prefixed with `utf8-` in the code snippets below):
{{{
(define evil-quote
  (list->string (map integer->char '(#b11000000 #b10100111))))
}}}
This is an invalid (overlong) UTF-8 encoding of the `'` character. Now a
program could perform a check like this to make sure a user-supplied
string doesn't contain any quotes:
{{{
(unless (utf8-string-contains evil-quote "'") ...)
}}}
And then go ahead and write it character by character like this:
{{{
(utf8-string-for-each display evil-quote)
}}}
This would write the actual `'` character. The same is true for any
other procedure that produces characters from strings, e.g. `string-ref`,
`string->list`, etc.
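To illustrate why the overlong sequence ends up as `'`: a decoder that only masks and shifts the payload bits of a two-byte sequence, without checking that the result actually needs two bytes, arrives at code point 39. The helper `naive-decode-2` below is hypothetical and only shows the arithmetic; it is not a claim about how the `utf8` egg is implemented.

```scheme
;; Hypothetical helper, not part of the utf8 egg: decode a two-byte
;; sequence by extracting payload bits only, with no overlong check.
(define (naive-decode-2 b1 b2)
  (+ (* (modulo b1 32) 64)   ; low 5 bits of the lead byte, shifted left by 6
     (modulo b2 64)))        ; low 6 bits of the continuation byte

;; The two bytes of evil-quote decode to the code point of #\'
(naive-decode-2 #b11000000 #b10100111) ; => 39
(char->integer #\')                    ; => 39
```

A strict decoder would reject this sequence, because any code point below 128 must be encoded as a single byte.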
Any other invalid byte sequence (such as stray continuation bytes) is also
silently accepted.
I'm not entirely sure what would be the wisest way to handle this. We
could have these procedures signal an error or just mention this behavior
in the documentation so that people know to perform validation on
untrusted inputs.
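A validation step for untrusted input could look roughly like the following sketch. The names `valid-utf8-bytes?` and `valid-utf8?` are hypothetical and not procedures of the `utf8` egg. The byte ranges follow the well-formed sequences of RFC 3629, and the string wrapper assumes that core strings are byte strings, as in CHICKEN 4.

```scheme
;; Hypothetical validator, not part of the utf8 egg.
;; Checks a list of byte values against the well-formed ranges of RFC 3629.
(define (valid-utf8-bytes? bytes)
  ;; #t if lst starts with n trailing bytes: the first must lie in
  ;; [lo, hi], the remaining ones in the generic range [#x80, #xBF].
  (define (tail-ok? lst n lo hi)
    (cond ((= n 0) #t)
          ((null? lst) #f)
          ((<= lo (car lst) hi) (tail-ok? (cdr lst) (- n 1) #x80 #xBF))
          (else #f)))
  ;; Drop the first n elements of lst (only called after tail-ok?).
  (define (skip lst n)
    (if (= n 0) lst (skip (cdr lst) (- n 1))))
  (let loop ((bs bytes))
    (if (null? bs)
        #t
        (let ((b (car bs)) (rest (cdr bs)))
          (define (check n lo hi)
            (and (tail-ok? rest n lo hi) (loop (skip rest n))))
          (cond ((< b #x80) (loop rest))                  ; ASCII
                ((<= #xC2 b #xDF) (check 1 #x80 #xBF))    ; 2 bytes, no overlongs
                ((= b #xE0) (check 2 #xA0 #xBF))          ; 3 bytes, no overlongs
                ((= b #xED) (check 2 #x80 #x9F))          ; 3 bytes, no surrogates
                ((<= #xE1 b #xEF) (check 2 #x80 #xBF))    ; other 3-byte leads
                ((= b #xF0) (check 3 #x90 #xBF))          ; 4 bytes, no overlongs
                ((<= #xF1 b #xF3) (check 3 #x80 #xBF))
                ((= b #xF4) (check 3 #x80 #x8F))          ; up to U+10FFFF
                (else #f))))))   ; stray continuation, #xC0, #xC1, #xF5 and up

;; Assumes core (byte) strings, where each char is one byte.
(define (valid-utf8? str)
  (valid-utf8-bytes? (map char->integer (string->list str))))

(valid-utf8-bytes? '(#xC0 #xA7)) ; => #f (the overlong quote)
(valid-utf8-bytes? '(#xA7))      ; => #f (stray continuation byte)
(valid-utf8-bytes? '(#xC3 #xA9)) ; => #t
```

With such a check in place, a program would reject `evil-quote` before it ever reaches `utf8-string-contains` or `utf8-string-for-each`.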
--
Ticket URL: <http://bugs.call-cc.org/ticket/1182>
CHICKEN Scheme <http://www.call-with-current-continuation.org/>
CHICKEN Scheme is a compiler for the Scheme programming language.