From: | Eric Blake |
Subject: | Re: character ranges in regular expressions |
Date: | Mon, 04 Oct 2010 14:51:00 -0600 |
User-agent: | Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.9) Gecko/20100921 Fedora/3.1.4-1.fc13 Mnenhy/0.8.3 Thunderbird/3.1.4 |
On 10/04/2010 02:43 PM, Aharon Robbins wrote:
>> Which is why my proposal is that glibc consider:
>>
>> [A-Z] => match C locale; 26 letters, regardless of locale
>>
>> [[.A.]-[.Z.]] => use collation rules, since we explicitly spelled things with collation symbols (26 letters in the POSIX locale, 51 or even more in other locales, since accented characters might be included in the collation range), so that we aren't completely losing CEO behavior (if someone seriously has a reason to use it)
>>
>> [[:upper:]] => per POSIX rules in all locales
>
> This would be great. In what must be close to (or more than) the 10 years since gawk started supporting locales, I have yet to meet anyone who thinks that [a-z] matching [A-Y] is a feature!
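
For readers who want to see where the "51" and the "[a-z] matching [A-Y]" figures come from, here is a minimal sketch. It is not glibc's implementation and does not call any locale or regex API. It assumes a hypothetical dictionary-style collation sequence `aAbBcC...zZ` (real locale data also carries accented characters and multi-level weights), and the names `COLLATION`, `range_by_codepoint`, and `range_by_collation` are made up for illustration.

```python
import string

# ASSUMPTION: a simplified dictionary-style collation sequence, aAbBcC...zZ.
# Real locales define richer orderings; this only models the interleaving.
COLLATION = "".join(
    lo + up for lo, up in zip(string.ascii_lowercase, string.ascii_uppercase)
)


def range_by_codepoint(start, end):
    """Characters covered when range endpoints are compared by code point."""
    return [chr(c) for c in range(ord(start), ord(end) + 1)]


def range_by_collation(start, end, order=COLLATION):
    """Characters covered when range endpoints are compared by collation position."""
    return list(order[order.index(start):order.index(end) + 1])


codepoint = range_by_codepoint("a", "z")
collated = range_by_collation("a", "z")

# Characters that the collation-based range picks up beyond the plain a..z set.
extra = sorted(set(collated) - set(codepoint))

print(len(codepoint))    # 26
print(len(collated))     # 51
print("".join(extra))    # ABCDEFGHIJKLMNOPQRSTUVWXY
```

Under this assumed ordering, the range from `a` to `z` sweeps up every uppercase letter that sorts between the endpoints, which is `A` through `Y`; `Z` sorts after `z` and is left out. That is 26 + 25 = 51 characters, before any accented letters are considered.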
Great idea or not, Uli rejected it :(

------- Additional Comments From drepper dot fsp at gmail dot com 2010-10-04 02:42 -------
This stays as it is. If individual locale maintainers think the current behavior is unintentionally as-is then they can change it. But in general this is the long-implemented behavior and won't be changed. Collating elements are just not really useful outside the POSIX locale or when the locale is guaranteed to stay the same.

--
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX

http://sourceware.org/bugzilla/show_bug.cgi?id=12051
--
Eric Blake   address@hidden    +1-801-349-2682
Libvirt virtualization library http://libvirt.org