bug-guile

Re: guile 1.8.5 test failure: srfi-14.test


From: Ludovic Courtès
Subject: Re: guile 1.8.5 test failure: srfi-14.test
Date: Sun, 18 May 2008 23:04:05 +0200
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.1 (gnu/linux)

Hi Bruno,

Bruno Haible <address@hidden> writes:

> On my system, (find-latin1-locale) returns "de_DE.iso88591". Now look:
> $ guile
> guile> (char-set-size char-set:letter)
> 52
> guile> (setlocale LC_CTYPE "de_DE.iso88591")
> "de_DE.iso88591"
> guile> (char-set-size char-set:letter)
> 124
> guile> char-set:letter
> #<charset {#\A #\B #\C #\D #\E #\F #\G #\H #\I #\J #\K #\L #\M #\N #\O #\P 
> #\Q #\R #\S #\T #\U #\V #\W #\X #\Y #\Z #\a #\b #\c #\d #\e #\f #\g #\h #\i 
> #\j #\k #\l #\m #\n #\o #\p #\q #\r #\s #\t #\u #\v #\w #\x #\y #\z #\246 
> #\250 #\252 #\264 #\265 #\270 #\272 #\274 #\275 #\276 #\300 #\301 #\302 #\303 
> #\304 #\305 #\306 #\307 #\310 #\311 #\312 #\313 #\314 #\315 #\316 #\317 #\320 
> #\321 #\322 #\323 #\324 #\325 #\326 #\330 #\331 #\332 #\333 #\334 #\335 #\336 
> #\337 #\340 #\341 #\342 #\343 #\344 #\345 #\346 #\347 #\350 #\351 #\352 #\353 
> #\354 #\355 #\356 #\357 #\360 #\361 #\362 #\363 #\364 #\365 #\366 #\370 #\371 
> #\372 #\373 #\374 #\375 #\376 #\377}>
>
> So the notion of "letters" in a Latin1 locale may depend on the libc.
> It might be safer to change the test code from
>
>     (= (char-set-size char-set:letter) 117)
>
> to
>
>     (>= (char-set-size char-set:letter) 100)

The cardinalities of these char sets were taken from SRFI-14:

  http://srfi.schemers.org/srfi-14/srfi-14.html#StandardCharsetDefs

This indicates that we should fix our SRFI-14 implementation, not the
test.  ;-)

The system I'm currently using also picks `de_DE.iso88591' but it uses
Glibc 2.7, which doesn't have this problem.  I'm pretty sure Glibc 2.5
didn't have this problem either, and FreeBSD 6.2's libc doesn't either.
I don't have any Glibc 2.3-based system at hand, so I can only try to
guess what's going on.

Glibc's `localedata/locales/i18n' appears to be what defines the
character classes.  According to the ChangeLog it was updated in
Feb. 2007 to match Unicode 5.0, and in Apr. 2002 (by you) to match
Unicode 3.2.  Glibc 2.3.6 was released sometime in 2005 (see
http://sourceware.org/ml/libc-announce/2005/msg00001.html), so it
included the latter.

The SRFI-14 locale-sensitive code in Guile and the corresponding tests
date back to Sept. 2006, so it seems unlikely that the Unicode 5.0
update changed anything.  Any idea what to look at?

(Of course, we should eventually use `UnicodeData.txt' directly but
that's not likely to happen anytime soon...)

Thanks,
Ludovic.




