bug-wget

Re: [Bug-wget] New wget (1.19.2): Unexpected download behaviour for gzip


From: Tim Rühsen
Subject: Re: [Bug-wget] New wget (1.19.2): Unexpected download behaviour for gzip-compressed tarballs (HTTP-header dependent)
Date: Thu, 02 Nov 2017 20:57:28 +0100
User-agent: KMail/5.2.3 (Linux/4.13.0-1-amd64; KDE/5.37.0; x86_64; ; )

On Wednesday, 1 November 2017 22:21:38 CET Daniel Stenberg wrote:
> On Wed, 1 Nov 2017, Tim Rühsen wrote:
> > Content-Encoding: gzip means that the data has been compressed for
> > transportation purposes only.
> 
> That's actually not what it means. There's transfer-encoding for that
> purpose, but that's not generally supported by clients.

I didn't want to over-complicate things. What I had indeed forgotten is
that Transfer-Encoding also allows 'gzip' (even in combination with chunked):
https://tools.ietf.org/html/rfc7230#section-3.3.1

> RFC7231 section 3.1.2.1 [*] says this:
> 
>     Content coding values indicate an encoding transformation that has
>     been or can be applied to a representation.
> 
> [*] = https://tools.ietf.org/html/rfc7231#section-3.1.2.1

"has been or can be" are two different things, which also include "is/was not".
How would you (or curl) handle
  Content-Type: application/x-tar
  Content-Encoding: gzip
when downloading 'x.tar.gz' or 'x.tgz'? Save the file compressed or
uncompressed? And what if the file is (correctly) named 'x.tar'?

I downloaded and tested thousands of web pages, and they behave as if
'Content-Encoding: gzip' is a compression for the transport. Uncompressing
it on the fly and saving the uncompressed data was the correct behavior.
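
To illustrate what "uncompressing on the fly" means here, the following is
a minimal Python sketch, not wget's actual code. It assumes the response body
arrives as a sequence of byte chunks; the name decode_gzip_stream and the
sample payload are made up for this example.

```python
import gzip
import zlib


def decode_gzip_stream(chunks):
    """Yield decompressed bytes for an iterable of gzip-compressed chunks."""
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
    for chunk in chunks:
        data = decoder.decompress(chunk)
        if data:
            yield data
    # Emit anything still buffered once the input is exhausted.
    tail = decoder.flush()
    if tail:
        yield tail


# Simulate a gzip-encoded response body arriving in small pieces.
payload = b"tar archive bytes " * 100
body = gzip.compress(payload)
pieces = [body[i:i + 64] for i in range(0, len(body), 64)]
restored = b"".join(decode_gzip_stream(pieces))
```

A client that treats the encoding as transport-only would write 'restored'
to disk; one that treats it as a property of the file would write 'body'.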

Regards, Tim


