bug-glibc

wchar_t not Unicode?


From: Simon Marlow
Subject: wchar_t not Unicode?
Date: Wed, 27 Aug 2003 11:04:02 +0100

Glibc defines __STDC_ISO_10646__, which indicates that values of type
wchar_t are ISO 10646 (UCS-4, i.e. Unicode) code points.  However, the
wide-character classification functions don't appear to treat them that
way consistently.  Take the following example:

#include <locale.h>
#include <stdio.h>
#include <wctype.h>

int main(void) {
    setlocale(LC_ALL, "");  // pick up the locale from the environment
    // iswupper()/iswlower() return nonzero (not necessarily 1) on a match
    printf("%d\n", iswupper(L'A'));
    printf("%d\n", iswupper(0x391)); // Greek capital alpha
    printf("%d\n", iswupper(0x3B1)); // Greek small alpha
    printf("%d\n", iswlower(0x391)); // Greek capital alpha
    printf("%d\n", iswlower(0x3B1)); // Greek small alpha
    return 0;
}

If I run this code with LANG set to C, then iswupper() and iswlower()
don't recognise the non-ASCII characters.  However, if I set the locale
to something other than C/POSIX, the Greek letters are classified as
expected.
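The locale dependence described above can be observed within a single process, without changing LANG between runs. The following is a minimal sketch, not part of the original report: upper_in_locale() is a hypothetical helper that temporarily selects a named LC_CTYPE locale, calls iswupper(), and restores the previous locale. Which locale names are installed varies by system, so a name such as "en_US.UTF-8" is an assumption.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#include <wctype.h>

/* Hypothetical helper: classify wc with iswupper() under the named
   LC_CTYPE locale.  Returns 1 or 0, or -1 if the locale cannot be
   selected.  The previously active locale is restored before returning. */
int upper_in_locale(const char *name, wint_t wc)
{
    const char *current = setlocale(LC_CTYPE, NULL);
    char *saved = NULL;
    int result = -1;

    /* Copy the current locale name; a later setlocale() call may
       overwrite the string that setlocale() returned. */
    if (current != NULL) {
        saved = malloc(strlen(current) + 1);
        if (saved != NULL)
            strcpy(saved, current);
    }

    if (setlocale(LC_CTYPE, name) != NULL)
        result = iswupper(wc) != 0;

    if (saved != NULL) {
        setlocale(LC_CTYPE, saved);
        free(saved);
    }
    return result;
}
```

Comparing upper_in_locale("C", 0x391) with upper_in_locale("en_US.UTF-8", 0x391) (the second name being an assumed, installed locale) should show whether the two locales classify Greek capital alpha differently on a given system.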

The C99 standard is a little bit vague on whether this behaviour is
allowed or not.  However, since wchar_t is intended to be consistently
UCS-4, it would seem to make sense for iswupper(), iswlower() and
friends to have locale-independent meanings.
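Until the functions behave that way, one possible workaround is to classify against a fixed locale object rather than the process-wide locale. The sketch below assumes the POSIX.1-2008 interfaces newlocale(), iswupper_l() and freelocale() are available (glibc provides them); upper_in_fixed_locale() is a hypothetical helper name, and the choice of locale name passed to it is up to the caller.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* request newlocale() and iswupper_l() */
#endif
#include <locale.h>
#include <wctype.h>

/* Hypothetical helper: classify wc against a locale object built from
   `name`, leaving the global locale untouched.  Returns 1 or 0, or -1
   if the locale object cannot be created. */
int upper_in_fixed_locale(const char *name, wint_t wc)
{
    locale_t loc = newlocale(LC_CTYPE_MASK, name, (locale_t)0);
    int result;

    if (loc == (locale_t)0)
        return -1;

    result = iswupper_l(wc, loc) != 0;
    freelocale(loc);
    return result;
}
```

Because the result depends only on the locale name passed in, not on LANG or on earlier setlocale() calls, a program can pin classification to one known locale. Creating a locale object per call is simple but not cheap; a real program would likely create it once and reuse it.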

Cheers,
        Simon




