Re: [Bug-wget] [PATCH] First
From: Tim Ruehsen
Subject: Re: [Bug-wget] [PATCH] First
Date: Thu, 20 Nov 2014 10:15:21 +0100
User-agent: KMail/4.14.2 (Linux/3.16.0-4-amd64; KDE/4.14.2; x86_64; ; )
On Thursday 20 November 2014 11:18:07 Darshit Shah wrote:
> >Excepting perhaps init.c, all .c files at src/ are in fact expecting a
> >C-locale str(n)casecmp
> >(they all deal with network protocols).
>
> But URLs can be in non-ASCII characters. And similarly, if I remember
> correctly, the HTTP headers can contain data, especially about redirects
> and cookie information which is non-ASCII.
This is too general. Try to be more specific and (if possible) give an
example.
Wget compares standardized ASCII values, e.g. 'Content-Disposition' or
'domain', mostly keys. Domains/hostnames should be normalized to ASCII
(idn_...) as soon as they come in (together with a locale). But I am not
sure whether Wget does this consistently.
> In such scenarios how does using the C Locale comparison work out? Unless
> that data is somehow first normalized to some ASCII compatible string.
>
> Honestly, localization isn't something I know much about. But I'd like to
> know how this works.
Well, str(n)casecmp() assumes that both input strings are in the current
locale (the one the program runs with). It compares the lowercase (or
uppercase) representations of these strings, and the case mapping is taken
from that locale.
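To make the contrast concrete, here is a minimal sketch of a comparison that
folds only the ASCII letters A-Z and therefore gives the same result in every
locale. This is not Wget's code; the names ascii_tolower, ascii_strcasecmp and
ascii_strncasecmp are made up for this illustration.

```c
#include <stddef.h>

/* Lowercase only the ASCII letters A-Z. Every other byte is returned
   unchanged, so the result never depends on the process locale.
   (Hypothetical helper, not part of Wget.) */
static int ascii_tolower(int c)
{
    return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Case-insensitive comparison with ASCII-only case folding.
   Returns <0, 0 or >0, like strcasecmp(). */
int ascii_strcasecmp(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;
    int ca, cb;

    do {
        ca = ascii_tolower(*p++);
        cb = ascii_tolower(*q++);
    } while (ca != '\0' && ca == cb);

    return ca - cb;
}

/* Same as above, but compares at most n bytes, like strncasecmp(). */
int ascii_strncasecmp(const char *a, const char *b, size_t n)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;

    while (n-- > 0) {
        int ca = ascii_tolower(*p++);
        int cb = ascii_tolower(*q++);

        if (ca != cb)
            return ca - cb;
        if (ca == '\0')
            break;
    }
    return 0;
}
```

With such a function, ascii_strcasecmp("Content-Disposition",
"content-disposition") is 0 no matter which locale the program runs with.
A locale-dependent comparison does not guarantee that; the commonly cited
problem case is a Turkish locale, where the case mapping of 'I' and 'i'
differs from ASCII.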
I gave an example and two explanatory links here:
http://lists.gnu.org/archive/html/bug-wget/2014-09/msg00024.html