Re: Bug reported regarding Unicode handling in email address

nmh-workers

From:	Tom Lane
Subject:	Re: Bug reported regarding Unicode handling in email address
Date:	Mon, 07 Jun 2021 09:54:51 -0400

Ralph Corderoy <ralph@inputplus.co.uk> writes:
> U+0081 as 0x81 is ‘is a character representable as an unsigned char’ for
> it's a character, U+0081, and unsigned char holds [0, 0x100) so it
> suffers no loss of representation as an unsigned char.

Sure, but then what you are feeding the function is *not* UTF-8.
UTF-8 would require two bytes (0xC2 0x81) to represent that code point.  What
you're describing is ISO 8859-1, which is a perfectly fine
single-byte encoding, as long as you don't need any characters
outside the common western-European languages.
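
To make the byte counts concrete, here is a minimal sketch of the UTF-8
encoding rule for code points below U+0800. The helper name
utf8_encode_small is hypothetical and is not taken from nmh or from any
library; it only illustrates the one- and two-byte cases.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from nmh): encode a code point below U+0800
 * as UTF-8 into out and return the number of bytes written. */
size_t utf8_encode_small(unsigned int cp, unsigned char out[2])
{
    assert(cp < 0x800);   /* only the 1- and 2-byte forms are handled */

    if (cp < 0x80) {
        /* ASCII range: the byte is the code point itself. */
        out[0] = (unsigned char)cp;
        return 1;
    }

    /* Two-byte form: 110xxxxx 10xxxxxx */
    out[0] = (unsigned char)(0xC0 | (cp >> 6));
    out[1] = (unsigned char)(0x80 | (cp & 0x3F));
    return 2;
}
```

Called with 0x81, this sketch writes the two bytes 0xC2 0x81, whereas
ISO 8859-1 stores the same character as the single byte 0x81.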

Or to put it another way: yes, you can claim that only code points
up to U+FF can be passed to these functions, but that hobbles things
to the point where you really shouldn't claim to be Unicode-aware
at all.

I think it's more sensible to consider that per spec, the <ctype.h>
functions can only deal with single-byte encodings; if you want
something more flexible, you have to go to <wctype.h>.
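
As a rough illustration of that split, the sketch below counts alphabetic
characters two ways: byte by byte with isalpha() from <ctype.h>, and over
already-decoded wide characters with iswalpha() from <wctype.h>. Both
helper names are hypothetical, and how the wide string gets decoded (for
example with mbrtowc() under a suitable locale) is assumed to happen
elsewhere.

```c
#include <ctype.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: count bytes that <ctype.h> classifies as
 * alphabetic.  Each byte is judged on its own, so a multi-byte UTF-8
 * sequence is never seen as a single character. */
size_t count_alpha_bytes(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        /* The cast keeps the argument in the range isalpha() accepts. */
        if (isalpha((unsigned char)*s))
            n++;
    }
    return n;
}

/* Hypothetical helper: the same count over decoded wide characters,
 * where one element stands for one character. */
size_t count_alpha_wide(const wchar_t *ws)
{
    size_t n = 0;
    for (; *ws != L'\0'; ws++) {
        if (iswalpha((wint_t)*ws))
            n++;
    }
    return n;
}
```

In the default "C" locale, count_alpha_bytes("caf\xC3\xA9") counts only the
three ASCII letters; the two bytes that encode the final accented letter
are classified separately and neither is alphabetic. What iswalpha()
reports for non-ASCII wide characters depends on the locale in effect.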

                        regards, tom lane
