From: Paolo Bonzini
Subject: Re: [PATCH v2] dfa: optimize UTF-8 period
Date: Tue, 20 Apr 2010 11:12:10 +0200
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.8) Gecko/20100301 Fedora/3.0.3-1.fc12 Lightning/1.0b2pre Thunderbird/3.0.3

On 04/20/2010 12:47 AM, Eric Blake wrote:
> On 04/19/2010 06:14 AM, Paolo Bonzini wrote:
>> +  /* A valid UTF-8 character is
>> +
>> +     ([0x00-0x7f]
>> +     |[0xc2-0xdf][0x80-0xbf]
>> +     |[0xe0-0xef[0x80-0xbf][0x80-0xbf]
>> +     |[0xf0-f7][0x80-0xbf][0x80-0xbf][0x80-0xbf])
>
> Yes, but in POSIX XBD 9.3.4,
> http://www.opengroup.org/onlinepubs/9699919799/toc.htm, the ANYCHAR
> does not match NUL. Do you need to adjust this patch to exclude 0x00?

Yes (following the syntax bits). Does this seem okay?

Paolo
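
For readers following the thread, the byte ranges quoted above, with the NUL exclusion under discussion, can be sketched as a standalone C check. This is only an illustration and is not the code from the attached patch: the names `utf8_period_len` and `match_nul` are hypothetical, and `match_nul` merely stands in for the syntax-bit test mentioned in the reply. The sketch mirrors the quoted ranges as written, so it is as permissive as they are (for example, it does not reject surrogates or overlong three-byte forms).

```c
#include <stddef.h>

/* Hypothetical helper, not taken from the patch.
   True if c is a UTF-8 continuation byte, i.e. in [0x80-0xbf].  */
static int
is_cont (unsigned char c)
{
  return c >= 0x80 && c <= 0xbf;
}

/* Return the length in bytes of the character at the start of S
   (N bytes available) that matches the ranges quoted in the thread,
   or 0 if nothing matches.  MATCH_NUL is an assumed flag standing in
   for the syntax bits: when it is zero, the byte 0x00 is rejected.  */
size_t
utf8_period_len (const unsigned char *s, size_t n, int match_nul)
{
  size_t len, i;

  if (n == 0)
    return 0;

  if (s[0] <= 0x7f)                       /* [0x00-0x7f] */
    return (s[0] != 0x00 || match_nul) ? 1 : 0;
  else if (s[0] >= 0xc2 && s[0] <= 0xdf)  /* lead byte of a 2-byte form */
    len = 2;
  else if (s[0] >= 0xe0 && s[0] <= 0xef)  /* lead byte of a 3-byte form */
    len = 3;
  else if (s[0] >= 0xf0 && s[0] <= 0xf7)  /* lead byte of a 4-byte form */
    len = 4;
  else
    return 0;                             /* 0x80-0xc1 and 0xf8-0xff */

  if (n < len)
    return 0;                             /* truncated sequence */

  /* Every remaining byte must be a continuation byte.  */
  for (i = 1; i < len; i++)
    if (!is_cont (s[i]))
      return 0;

  return len;
}
```

Called with `match_nul` set to 0, the function rejects a lone 0x00 byte while still accepting every other single byte up to 0x7f; with `match_nul` nonzero it accepts 0x00 as a one-byte character.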
ff.patch
Description: Text document