Re: [Bug-wget] New wget (1.19.2): Unexpected download behaviour for gzip


From: Tim Rühsen
Subject: Re: [Bug-wget] New wget (1.19.2): Unexpected download behaviour for gzip-compressed tarballs (HTTP-header dependent)
Date: Fri, 3 Nov 2017 10:00:25 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0

On 11/03/2017 06:37 AM, James Cloos wrote:
>>>>>> "TR" == Tim Rühsen <address@hidden> writes:
> 
> TR> I downloaded/tested thousands of web pages and they behave as if
> TR> 'Content-Encoding: gzip' is a compression for the transport.
> TR> Uncompressing it 'on-the-fly' and saving that uncompressed data was
> TR> the correct behavior.
> 
> Lots of servers have that misconfiguration; it was recommended in the
> past and Apache defaulted to doing that when grabbing things like tar.gz.
> 
> The GUI browsers had to learn to work around that misconfiguration.  wget
> also has to.
> 
> In short, do not uncompress if the destination name has a compression
> suffix.
> 
> Or, in that case, test whether the uncompressed data starts with gzip
> magic and complete one decompression if so, none if not.
> 
> And the same for the other compression formats.
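
A rough sketch of the second heuristic quoted above, for illustration only (this is not wget code). It assumes the whole response body is already in memory and was sent with 'Content-Encoding: gzip'. The names choose_payload and COMPRESSED_SUFFIXES are hypothetical, and the suffix list is an assumption.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"                 # first two bytes of every gzip stream
COMPRESSED_SUFFIXES = (".gz", ".tgz")    # assumed list; extend for other formats


def choose_payload(body: bytes, dest_name: str) -> bytes:
    """Return the bytes to save for a body received with Content-Encoding: gzip.

    Hypothetical helper: decides whether to undo one layer of gzip.
    """
    # No compression suffix: treat gzip as transport compression and undo it.
    if not dest_name.lower().endswith(COMPRESSED_SUFFIXES):
        return gzip.decompress(body)

    # Compression suffix: look at what one decompression would produce.
    decoded = gzip.decompress(body)
    if decoded[:2] == GZIP_MAGIC:
        # The server really compressed an already-gzipped file for transport,
        # so one decompression restores the file as stored on the server.
        return decoded
    # Otherwise the header was a misconfiguration: keep the body untouched.
    return body
```

A real client would only need to inflate the first few bytes to make this decision; the full decompression here is just to keep the sketch short.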

Thanks for this insight!

Looking at the Mozilla/Gecko sources shows that the gzip Content-Encoding
is simply cleared for the Content-Types application/x-gzip, application/gzip
and application/x-gunzip. That makes it straightforward to go that way.
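
To make the rule concrete, here is a minimal sketch of that check as described above. It is not taken from the Gecko or wget sources; the function name effective_content_encoding and the header normalization (lower-casing, dropping parameters after ';') are assumptions for illustration.

```python
from typing import Optional

# Content-Types for which a gzip Content-Encoding is ignored (from the text above).
GZIP_CONTENT_TYPES = frozenset({
    "application/x-gzip",
    "application/gzip",
    "application/x-gunzip",
})


def effective_content_encoding(content_type: Optional[str],
                               content_encoding: Optional[str]) -> Optional[str]:
    """Return the Content-Encoding the client should act on, or None.

    Hypothetical helper: clears 'gzip' when the Content-Type already says
    the payload itself is a gzip file.
    """
    if content_encoding is None:
        return None
    # Assumed normalization: ignore parameters such as '; charset=...' and case.
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    encoding = content_encoding.strip().lower()
    if encoding == "gzip" and media_type in GZIP_CONTENT_TYPES:
        return None  # save the body as received, do not decompress
    return content_encoding
```

For example, a response with 'Content-Type: application/x-gzip' and 'Content-Encoding: gzip' yields None, so the tarball is written to disk unchanged, while 'text/html' with the same encoding is still decompressed.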

With Best Regards, Tim
