[Help-smalltalk] [bug] regex doesn't support i (ignorecase) flag
From: Stephen Compall
Subject: [Help-smalltalk] [bug] regex doesn't support i (ignorecase) flag
Date: Mon, 01 Oct 2007 18:05:39 -0700
Issue status update for
http://smalltalk.gnu.org/node/85
Post a follow up:
http://smalltalk.gnu.org/project/comments/add/85
Project: GNU Smalltalk
Version: <none>
Component: VM
Category: bug reports
Priority: normal
Assigned to: Unassigned
Reported by: S11001001
Updated by: S11001001
Status: patch
Attachment: http://smalltalk.gnu.org/files/issues/latin1-re-ignorecase.patch (2.58 KB)
Example:
st> ('a' =~ '(?i:A)') inspect!
An instance of Kernel.FailedMatchRegexResults
I found that this is because pre_set_casetable in lib-src/regex.c is
never called. This is fixed in
address@hidden/smalltalk--backstage--2.2--patch-62*,
"support (?i:...) in regexps". With the patch applied:
st> ('a' =~ '(?i:A)') inspect!
An instance of Kernel.MatchingRegexResults
There are multiple solution paths, because case folding is
charset-dependent. The patch implements #3:
 1. Always import I18N and use the locale database to determine the
    charset of Strings. I'm not sure what the exact semantics of this
    would be.
 2. Assume ASCII. regex.c already effectively assumes that strings
    are somewhat ASCII-compatible, and this wouldn't bias in favor of a
    particular ASCII superset.
 3. Assume Latin-1. This has the benefit of offering a clear
    behavior path to future support for matching full Unicode strings, so
    it's what the patch uses.
 4. Assume Latin-9. Technically this supersedes Latin-1, so is more
    up-to-date, but is not a codepoint-wise subset of Unicode.
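
To make the Latin-1 option concrete, here is a minimal sketch of a byte-wise case-folding table. It is not taken from the attached patch. The names build_latin1_fold_table and fold_equal are hypothetical, and the sketch only assumes that the matcher compares bytes through a 256-entry translate table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the patch: fill a 256-entry
   table that maps each byte to its lowercase form, treating bytes
   as ISO-8859-1 (Latin-1) code points. */
void
build_latin1_fold_table (unsigned char table[256])
{
  int c;

  assert (table != NULL);

  /* Start from the identity mapping. */
  for (c = 0; c < 256; c++)
    table[c] = (unsigned char) c;

  /* ASCII letters: 'A'..'Z' map to 'a'..'z'. Stopping here would
     give the "assume ASCII" behavior. */
  for (c = 'A'; c <= 'Z'; c++)
    table[c] = (unsigned char) (c + 0x20);

  /* Latin-1 letters: 0xC0..0xDE map to 0xE0..0xFE, skipping 0xD7
     (multiplication sign), which is not a letter. 0xDF and 0xFF
     have no uppercase counterpart inside Latin-1 and stay as-is. */
  for (c = 0xC0; c <= 0xDE; c++)
    if (c != 0xD7)
      table[c] = (unsigned char) (c + 0x20);
}

/* Compare two bytes case-insensitively through the table. */
int
fold_equal (const unsigned char table[256],
            unsigned char a, unsigned char b)
{
  return table[a] == table[b];
}
```

Because Latin-1 bytes coincide with the first 256 Unicode code points, a table like this stays valid if matching is later widened to full Unicode. A Latin-9 table would need different entries (for example, 0xA4 is the euro sign there rather than U+00A4).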