unistr/u8-strcoll: make result more predictable
From: Bruno Haible
Subject: unistr/u8-strcoll: make result more predictable
Date: Mon, 24 May 2010 23:04:53 +0200
User-agent: KMail/1.9.9
On NetBSD 5.0, the u8-strcoll test failed because u8_strcoll converted its
arguments through iconv with transliteration, and the result of transliteration
depends on the iconv implementation in use: "•" maps to "o" with glibc or
libiconv, but to "?" with NetBSD iconv. The fix is to rely only on strict
(lossless) iconv conversion.
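
To illustrate what "strict" means here, the following is a minimal sketch
using plain POSIX iconv rather than the gnulib code in the patch below. The
helper name convert_strict and its signature are hypothetical, and the sketch
assumes a stateless target encoding (no final flush call is made).

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of gnulib: convert the NUL-terminated
   UTF-8 string SRC to the encoding TOCODE, without a "//TRANSLIT"
   suffix.  DSTSIZE must be at least 1.
   Return 0 if every character was converted losslessly, -1 otherwise.  */
int
convert_strict (const char *tocode, const char *src,
                char *dst, size_t dstsize)
{
  iconv_t cd = iconv_open (tocode, "UTF-8");
  if (cd == (iconv_t) -1)
    return -1;

  char *in = (char *) src;
  size_t inleft = strlen (src);
  char *out = dst;
  size_t outleft = dstsize - 1;   /* reserve room for the NUL */

  /* iconv returns (size_t) -1 on error, otherwise the number of
     non-reversible conversions.  Some implementations substitute a
     replacement character instead of failing, so a strict caller
     rejects any nonzero result.  */
  size_t res = iconv (cd, &in, &inleft, &out, &outleft);
  int saved_errno = errno;
  iconv_close (cd);

  if (res != 0)
    {
      errno = (res == (size_t) -1 ? saved_errno : EILSEQ);
      return -1;
    }
  *out = '\0';
  return 0;
}
```

With such a helper, converting "•" (U+2022) to an ASCII target is reported as
a failure instead of silently becoming "o" or "?", so the caller sees the same
outcome regardless of the iconv implementation.
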
2010-05-24  Bruno Haible  <address@hidden>

        Don't use conversion with transliteration in u{8,16,32}_strcoll.
        * lib/unistr/u-strcoll.h (FUNC): Use U_STRCONV_TO_ENCODING with
        iconveh_error argument.
        * lib/unistr/u8-strcoll.c: Define U_STRCONV_TO_ENCODING instead of
        U_STRCONV_TO_LOCALE.
        * lib/unistr/u16-strcoll.c: Likewise.
        * lib/unistr/u32-strcoll.c: Likewise.
        * modules/unistr/u8-strcoll (Depends-on): Add
        uniconv/u8-strconv-to-enc, localcharset. Remove
        uniconv/u8-strconv-to-locale.
        (configure.ac): Bump version number.
        * modules/unistr/u16-strcoll (Depends-on): Add
        uniconv/u16-strconv-to-enc, localcharset. Remove
        uniconv/u16-strconv-to-locale.
        (configure.ac): Bump version number.
        * modules/unistr/u32-strcoll (Depends-on): Add
        uniconv/u32-strconv-to-enc, localcharset. Remove
        uniconv/u32-strconv-to-locale.
        (configure.ac): Bump version number.

--- lib/unistr/u-strcoll.h.orig Mon May 24 22:55:35 2010
+++ lib/unistr/u-strcoll.h Mon May 24 22:43:30 2010
@@ -23,14 +23,19 @@
When it fails, it sets errno, but also returns a meaningful return value,
for the sake of callers which ignore errno. */
int final_errno = errno;
+ const char *encoding = locale_charset ();
char *sl1;
char *sl2;
int result;
- sl1 = U_STRCONV_TO_LOCALE (s1);
+ /* Pass iconveh_error here, not iconveh_question_mark. Otherwise the
+ conversion to locale encoding can do transliteration or map some
+ characters to question marks, leading to results that depend on the
+ iconv() implementation and are not obvious. */
+ sl1 = U_STRCONV_TO_ENCODING (s1, encoding, iconveh_error);
if (sl1 != NULL)
{
- sl2 = U_STRCONV_TO_LOCALE (s2);
+ sl2 = U_STRCONV_TO_ENCODING (s2, encoding, iconveh_error);
if (sl2 != NULL)
{
/* Compare sl1 and sl2. */
@@ -41,10 +46,10 @@
/* strcoll succeeded. */
free (sl1);
free (sl2);
- /* The conversion to locale encoding can do transliteration or
- map some characters to question marks. Therefore sl1 and sl2
- may be equal when s1 and s2 were in fact different. Return a
- nonzero result in this case. */
+ /* The conversion to locale encoding can drop Unicode TAG
+ characters. Therefore sl1 and sl2 may be equal when s1
+ and s2 were in fact different. Return a nonzero result
+ in this case. */
if (result == 0)
result = U_STRCMP (s1, s2);
}
@@ -68,7 +73,7 @@
else
{
final_errno = errno;
- sl2 = U_STRCONV_TO_LOCALE (s2);
+ sl2 = U_STRCONV_TO_ENCODING (s2, encoding, iconveh_error);
if (sl2 != NULL)
{
/* s2 could be converted to locale encoding, s1 not. */
--- lib/unistr/u8-strcoll.c.orig Mon May 24 22:55:35 2010
+++ lib/unistr/u8-strcoll.c Mon May 24 22:43:33 2010
@@ -29,5 +29,5 @@
#define FUNC u8_strcoll
#define UNIT uint8_t
#define U_STRCMP u8_strcmp
-#define U_STRCONV_TO_LOCALE u8_strconv_to_locale
+#define U_STRCONV_TO_ENCODING u8_strconv_to_encoding
#include "u-strcoll.h"
--- lib/unistr/u16-strcoll.c.orig Mon May 24 22:55:35 2010
+++ lib/unistr/u16-strcoll.c Mon May 24 22:43:35 2010
@@ -29,5 +29,5 @@
#define FUNC u16_strcoll
#define UNIT uint16_t
#define U_STRCMP u16_strcmp
-#define U_STRCONV_TO_LOCALE u16_strconv_to_locale
+#define U_STRCONV_TO_ENCODING u16_strconv_to_encoding
#include "u-strcoll.h"
--- lib/unistr/u32-strcoll.c.orig Mon May 24 22:55:35 2010
+++ lib/unistr/u32-strcoll.c Mon May 24 22:43:34 2010
@@ -29,5 +29,5 @@
#define FUNC u32_strcoll
#define UNIT uint32_t
#define U_STRCMP u32_strcmp
-#define U_STRCONV_TO_LOCALE u32_strconv_to_locale
+#define U_STRCONV_TO_ENCODING u32_strconv_to_encoding
#include "u-strcoll.h"
--- modules/unistr/u8-strcoll.orig Mon May 24 22:55:35 2010
+++ modules/unistr/u8-strcoll Mon May 24 22:46:40 2010
@@ -8,10 +8,11 @@
Depends-on:
unistr/base
unistr/u8-strcmp
-uniconv/u8-strconv-to-locale
+uniconv/u8-strconv-to-enc
+localcharset
configure.ac:
-gl_LIBUNISTRING_LIBSOURCE([0.9.3], [unistr/u8-strcoll.c])
+gl_LIBUNISTRING_LIBSOURCE([0.9.4], [unistr/u8-strcoll.c])
Makefile.am:
--- modules/unistr/u16-strcoll.orig Mon May 24 22:55:35 2010
+++ modules/unistr/u16-strcoll Mon May 24 22:46:34 2010
@@ -8,10 +8,11 @@
Depends-on:
unistr/base
unistr/u16-strcmp
-uniconv/u16-strconv-to-locale
+uniconv/u16-strconv-to-enc
+localcharset
configure.ac:
-gl_LIBUNISTRING_LIBSOURCE([0.9.3], [unistr/u16-strcoll.c])
+gl_LIBUNISTRING_LIBSOURCE([0.9.4], [unistr/u16-strcoll.c])
Makefile.am:
--- modules/unistr/u32-strcoll.orig Mon May 24 22:55:35 2010
+++ modules/unistr/u32-strcoll Mon May 24 22:46:29 2010
@@ -8,10 +8,11 @@
Depends-on:
unistr/base
unistr/u32-strcmp
-uniconv/u32-strconv-to-locale
+uniconv/u32-strconv-to-enc
+localcharset
configure.ac:
-gl_LIBUNISTRING_LIBSOURCE([0.9.3], [unistr/u32-strcoll.c])
+gl_LIBUNISTRING_LIBSOURCE([0.9.4], [unistr/u32-strcoll.c])
Makefile.am:
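
The tie-break that u-strcoll.h keeps after this patch (falling back to
U_STRCMP when strcoll reports equality) can be sketched in isolation as
follows. This is a simplified illustration, not the patched code: the function
name collate_with_tiebreak is hypothetical, it works on plain char strings, and
it omits the errno handling that the real FUNC performs.

```c
#include <string.h>

/* Hypothetical helper: CONV1 and CONV2 are the locale-encoded forms of
   the original strings ORIG1 and ORIG2.  Compare by collation first;
   if the converted strings collate as equal, fall back to comparing
   the originals, so that distinct inputs never compare equal.  */
int
collate_with_tiebreak (const char *conv1, const char *conv2,
                       const char *orig1, const char *orig2)
{
  int result = strcoll (conv1, conv2);
  if (result == 0)
    result = strcmp (orig1, orig2);
  return result;
}
```

For example, if the conversion dropped a U+E0001 LANGUAGE TAG from one input,
both converted strings could be "ab" while the originals still differ; the
fallback then yields a nonzero result.
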