Re: [Bug-wget] New wget (1.19.2): Unexpected download behaviour for gzip-compressed tarballs (HTTP-header dependent)
Fri, 3 Nov 2017 10:00:25 +0100
Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0
On 11/03/2017 06:37 AM, James Cloos wrote:
>>>>>> "TR" == Tim Rühsen <address@hidden> writes:
> TR> I downloaded/tested thousands of web pages and they behave as if
> TR> 'Content-Encoding: gzip' is a compression for the transport. Uncompressing
> TR> it 'on-the-fly' and saving that uncompressed data was the correct behavior.
> Lots of servers have that misconfiguration; it was recommended in the
> past and Apache defaulted to doing that when grabbing things like tar.gz.
> The GUI browsers had to learn to work around that misconfig. wget also
> has to.
> In short, do not uncompress if the destination name has a compression
> suffix. Or, in that case, test whether the uncompressed data starts with gzip
> magic and complete one decompression if so, none if not.
> And the same for the other compression formats.
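
To make the suggestion above concrete, here is a rough Python sketch of both variants as I read them. It is only an illustration, not wget code: the function names, the suffix list and the idea of holding the whole body in memory are assumptions made for the example.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"               # first two bytes of any gzip stream
COMPRESSED_SUFFIXES = (".gz", ".tgz")  # assumed list, for illustration only


def body_to_save(dest_name: str, body: bytes) -> bytes:
    """Variant 1: `body` arrived with 'Content-Encoding: gzip'.

    Leave it untouched if the destination name already has a
    compression suffix, otherwise undo the transport compression.
    """
    if dest_name.lower().endswith(COMPRESSED_SUFFIXES):
        return body
    return gzip.decompress(body)


def body_to_save_checked(dest_name: str, body: bytes) -> bytes:
    """Variant 2: for suffixed names, peek at the decompressed data.

    If it still starts with the gzip magic, the file really was
    compressed twice, so exactly one layer is removed. Otherwise the
    header was a misconfiguration and the body is kept as received.
    """
    if not dest_name.lower().endswith(COMPRESSED_SUFFIXES):
        return gzip.decompress(body)
    inner = gzip.decompress(body)
    return inner if inner.startswith(GZIP_MAGIC) else body
```

A real implementation would work on the stream and only inspect the first decompressed bytes instead of decompressing the whole body, and would need a matching magic check per compression format.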
Thanks for this insight!
Looking at the Mozilla/Gecko sources shows that gzip Content-Encoding is
just cleared for Content-Types application/x-gzip, application/gzip and
application/x-gunzip. That makes it straightforward to go that way.
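
As a rough sketch of that rule (Python for brevity, not the actual Gecko or wget code; the function name and the lowercase-keyed header mapping are assumptions made for the example):

```python
from typing import Mapping, Optional

# Content-Types for which a gzip Content-Encoding is cleared, as listed above.
GZIP_CONTENT_TYPES = {
    "application/x-gzip",
    "application/gzip",
    "application/x-gunzip",
}


def effective_content_encoding(headers: Mapping[str, str]) -> Optional[str]:
    """Return the Content-Encoding to act on, or None if there is none.

    `headers` is assumed to map lowercase header names to their values.
    """
    encoding = headers.get("content-encoding", "").strip().lower()
    # Drop parameters such as '; charset=utf-8' before comparing.
    media_type = headers.get("content-type", "").split(";", 1)[0].strip().lower()
    if encoding == "gzip" and media_type in GZIP_CONTENT_TYPES:
        return None  # the payload itself is the gzip file, keep it as is
    return encoding or None
```

With this, a tarball served as 'Content-Type: application/x-gzip' plus 'Content-Encoding: gzip' would be saved unchanged, while a gzip-encoded text/html page would still be uncompressed on the fly.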
With Best Regards, Tim