Re: [Bug-wget] New wget (1.19.2): Unexpected download behaviour for gzip
From: Tim Rühsen
Subject: Re: [Bug-wget] New wget (1.19.2): Unexpected download behaviour for gzip-compressed tarballs (HTTP-header dependent)
Date: Thu, 02 Nov 2017 20:57:28 +0100
User-agent: KMail/5.2.3 (Linux/4.13.0-1-amd64; KDE/5.37.0; x86_64; ; )
On Wednesday, 1 November 2017 22:21:38 CET Daniel Stenberg wrote:
> On Wed, 1 Nov 2017, Tim Rühsen wrote:
> > Content-Encoding: gzip means that the data has been compressed for
> > transportation purposes only.
>
> That's actually not what it means. There's transfer-encoding for that
> purpose, but that's not generally supported by clients.
I didn't want to over-complicate things. What I had indeed forgotten is
that Transfer-Encoding allows 'gzip' (even in combination with chunked):
https://tools.ietf.org/html/rfc7230#section-3.3.1
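To illustrate the combination, here is a minimal Python sketch of how a receiver might work out the order in which to undo the codings of a header such as 'Transfer-Encoding: gzip, chunked'. The helper names are hypothetical, and the sketch ignores transfer-coding parameters; it is not taken from wget or curl.

```python
def transfer_codings(header_value):
    """Return the transfer codings in the order the sender applied them."""
    # Codings are a comma-separated list; names are case-insensitive.
    return [c.strip().lower() for c in header_value.split(",") if c.strip()]


def decoding_order(header_value):
    """Return the codings in the order a receiver has to undo them."""
    # The coding applied last (normally 'chunked') must be removed first.
    return list(reversed(transfer_codings(header_value)))


if __name__ == "__main__":
    print(decoding_order("gzip, chunked"))
```

For 'gzip, chunked' the receiver would first reassemble the chunks and only then decompress the result.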
> RFC7231 section 3.1.2.1 [*] says this:
>
> Content coding values indicate an encoding transformation that has
> been or can be applied to a representation.
>
> [*] = https://tools.ietf.org/html/rfc7231#section-3.1.2.1
"has been or can be" are two different things, which also leaves room for "is/was not".
How would you (or curl) handle
Content-Type: application/x-tar
Content-Encoding: gzip
when downloading 'x.tar.gz' or 'x.tgz'? Save the file compressed or
uncompressed? And what if the file is (correctly) named 'x.tar'?
I downloaded/tested thousands of web pages and they behave as if
'Content-Encoding: gzip' is a compression for the transport. Uncompressing it
'on-the-fly' and saving that uncompressed data was the correct behavior.
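As a rough illustration of what uncompressing 'on-the-fly' amounts to, here is a minimal Python sketch. The function name gunzip_stream is hypothetical and this is not wget's actual code (wget is written in C); it only assumes the body arrives as a sequence of byte chunks.

```python
import zlib


def gunzip_stream(chunks):
    """Yield decompressed data for a gzip-encoded body received in chunks."""
    # wbits = 16 + MAX_WBITS makes zlib expect a gzip header and trailer.
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
    for chunk in chunks:
        data = decoder.decompress(chunk)
        if data:
            yield data
    # Emit anything still buffered once the input is exhausted.
    tail = decoder.flush()
    if tail:
        yield tail
```

Each decoded piece can be written to the output file as it arrives, so the compressed body never needs to be held in memory or on disk as a whole.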
Regards, Tim