Re: [Bug-wget] URL encoding issues (Was: GNU wget 1.17.1 released)
From: Eli Zaretskii
Subject: Re: [Bug-wget] URL encoding issues (Was: GNU wget 1.17.1 released)
Date: Mon, 14 Dec 2015 18:33:38 +0200

> Date: Sun, 13 Dec 2015 20:04:31 +0100
> From: "Andries E. Brouwer" <address@hidden>
> Cc: "Andries E. Brouwer" <address@hidden>, address@hidden
>
> On Sun, Dec 13, 2015 at 08:01:27PM +0200, Eli Zaretskii wrote:
>
> > If no one is going to pick up the gauntlet, I will sit down and do it
> > myself, although I'm terribly busy with Emacs 25.1 release.
>
> Good!

While working on this, I bumped into two related issues:

1. The functions that call 'iconv' (in iri.c) don't flush the last
   portion of the converted URL after 'iconv' returns successfully,
   having converted the input string in its entirety. In my
   experience, you then need to call 'iconv' one last time with
   either the 2nd or the 3rd argument set to NULL; otherwise the
   last converted character sometimes doesn't get output. In my
   case, some URLs converted from CP1255 to UTF-8 lost their last
   character. It sounds like no one has actually used this
   conversion in iri.c, except for trivially converting UTF-8 to
   itself. Is that possible/reasonable?
2. Wget assumes that the URL given on its command line is encoded in
   the locale's encoding. This is a good assumption when the user
   herself types the URL at the shell prompt, but not when the URL is
   copy-pasted from a browser's address bar. In the latter case, the
   URL tends to be in UTF-8 (sometimes hex-encoded). At least that's
   what I get from Firefox. Wget doesn't seem to have any facility
   for specifying a separate (3rd) encoding for the URLs on the
   command line, does it?
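
For illustration, the flush step described in item 1 can be sketched
roughly as below. This is not the code in iri.c: 'convert_and_flush'
is a hypothetical helper, the output-buffer size is a guess, and any
failure (including a too-small buffer) is simply reported as NULL. The
final call passes NULL for the 2nd and 3rd arguments, which is the form
POSIX documents for returning the converter to its initial state and
writing out whatever is still pending.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not wget's code): convert INPUT from encoding
   FROM to encoding TO.  Returns a malloc'ed, NUL-terminated string,
   or NULL on any failure.  Only meant for targets such as UTF-8 whose
   output contains no embedded NUL bytes.  */
static char *
convert_and_flush (const char *from, const char *to, const char *input)
{
  assert (from != NULL && to != NULL && input != NULL);

  iconv_t cd = iconv_open (to, from);
  if (cd == (iconv_t) -1)
    return NULL;                      /* unsupported conversion */

  size_t inleft = strlen (input);
  size_t outsize = 4 * inleft + 16;   /* assumption: generous enough */
  char *out = malloc (outsize + 1);   /* +1 for the terminating NUL */
  if (out == NULL)
    {
      iconv_close (cd);
      return NULL;
    }

  char *inptr = (char *) input;       /* iconv takes a non-const pointer */
  char *outptr = out;
  size_t outleft = outsize;

  /* Main conversion of the whole input string.  */
  int ok = iconv (cd, &inptr, &inleft, &outptr, &outleft) != (size_t) -1;

  /* Flush: no more input, but the converter may still hold output.  */
  if (ok)
    ok = iconv (cd, NULL, NULL, &outptr, &outleft) != (size_t) -1;

  iconv_close (cd);
  if (!ok)
    {
      free (out);
      return NULL;
    }

  *outptr = '\0';
  return out;
}
```

A call such as convert_and_flush ("CP1255", "UTF-8", url) would
exercise the case above, provided the local iconv supports CP1255.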
Thanks.