From: Adam Sampson
Subject: [Bug-wget] Redirect containing %2B behaves differently depending on locale
Date: Fri, 13 Mar 2015 22:48:28 +0000
User-agent: Mutt/1.5.23+28 (79ea10b2d81c) (2014-03-12)

Hi,

I've just found a case where wget 1.16.3 responds to a 302 redirect
differently depending on whether it's in an ASCII or UTF-8 locale.

This works:
LC_ALL=en_GB.UTF-8 wget https://bitbucket.org/pypy/pypy/downloads/pypy-2.5.0-src.tar.bz2

This doesn't work:
LC_ALL=C wget https://bitbucket.org/pypy/pypy/downloads/pypy-2.5.0-src.tar.bz2

I've attached logs with -d showing what's actually going on. The
initial request gives a 302 response with a Location: that contains:
  ....tar.bz2?Signature=up6%2BtTpSF...

In the UTF-8 locale, wget correctly redirects to that location.

In the ASCII locale, wget -d prints a "converted: '...' -> '...'" line
(from iri.c's do_conversion), then redirects to:
  ....tar.bz2?Signature=up6+tTpSF...

(If you try it yourself you'll get a slightly different URL, but at
least for me it usually contains %2B somewhere.)
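
To illustrate why the two redirect targets are not interchangeable, here
is a minimal Python sketch. It is not wget's code; it only uses the
standard urllib.parse module, and it assumes the server decodes the query
string with form-style rules, where a bare "+" means a space. The sample
value is just the visible prefix of the truncated Signature above.

```python
from urllib.parse import parse_qs, unquote

# Sample query taken from the visible part of the redirect (truncated there).
original = "Signature=up6%2BtTpSF"

# Percent-decoding the whole string turns %2B into a literal '+'.
unescaped = unquote(original)

# What a form-style decoder would recover from each variant.
decoded_original = parse_qs(original)["Signature"][0]    # 'up6+tTpSF'
decoded_unescaped = parse_qs(unescaped)["Signature"][0]  # 'up6 tTpSF'

print(repr(unescaped), repr(decoded_original), repr(decoded_unescaped))
```

Under that assumption the server would see a different signature value
once the %2B has been unescaped, which would be consistent with the
second request failing.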

This appears to be because do_conversion calls url_unescape on the
input string it's given -- even though that input string is a _const_
char * in the code that calls it (main -> retrieve_url -> url_parse ->
remote_to_utf8 -> do_conversion). It's not immediately obvious to me
whether that's intentional or not; at the very least, it's a surprising
bit of behaviour.
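
The side effect I mean can be sketched in Python with a mutable buffer
standing in for the C string. This is only a model of the pattern, not
the code in iri.c: unescape_in_place, convert and convert_on_copy are
hypothetical names, and the real conversion step is omitted.

```python
from urllib.parse import unquote_to_bytes


def unescape_in_place(buf: bytearray) -> None:
    """Hypothetical stand-in for a decoder that overwrites its argument."""
    buf[:] = unquote_to_bytes(bytes(buf))


def convert(buf: bytearray) -> bytes:
    """Models a converter that unescapes the caller's buffer directly."""
    unescape_in_place(buf)  # the caller's data changes as a side effect
    return bytes(buf)


def convert_on_copy(buf: bytearray) -> bytes:
    """Same result, but the caller's buffer is left untouched."""
    work = bytearray(buf)  # private copy
    unescape_in_place(work)
    return bytes(work)


url = bytearray(b"Signature=up6%2BtTpSF")
safe_result = convert_on_copy(url)
url_after_copy = bytes(url)      # still contains %2B

leaky_result = convert(url)
url_after_in_place = bytes(url)  # now contains a literal '+'
```

If the caller goes on to use its own string after the call, as the
redirect code appears to, only the copying variant preserves the %2B.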

Thanks,

-- 
Adam Sampson <address@hidden>                         <http://offog.org/>

Attachment: out.C
Description: Text document

Attachment: out.en_GB.UTF-8
Description: Text document

