
[Bug-wget] Timestamp behaviour with modified local files


From: Ian Wienand
Subject: [Bug-wget] Timestamp behaviour with modified local files
Date: Tue, 28 Jul 2015 14:10:20 +1000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.0.1

Hi,

The manual says

  "If the local file does not exist, or the sizes of the files do not
   match, Wget will download the remote file no matter what the
   time-stamps say."

In two cases I'm not seeing this:

1) With If-Modified-Since, I don't believe the Content-Length is
   checked.

2) Without If-Modified-Since, if the remote end returns a 416, wget
   treats the file as fully retrieved instead of re-downloading it.

Here's a quick example:

$ ./wget http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-uec.tar.gz
$ truncate -s 10M cirros-0.3.4-x86_64-uec.tar.gz # modify the file size

First, with current git, wget sends the "If-Modified-Since" request,
but the server apparently ignores "Range": it just returns 304, even
though we are asking for bytes the remote file doesn't have.  wget
doesn't notice that the local file is a different size.

---
$ ./wget --debug --timestamping -c http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-uec.tar.gz
  Setting --timestamping (timestamping) to 1
  Setting --continue (continue) to 1
  DEBUG output created by Wget 1.16.3.90-4e56a on linux-gnu.

  URI encoding = ‘UTF-8’
  --2015-07-28 13:00:28--  http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-uec.tar.gz
  Resolving download.cirros-cloud.net (download.cirros-cloud.net)... 69.163.241.114
  Caching download.cirros-cloud.net => 69.163.241.114
  Connecting to download.cirros-cloud.net (download.cirros-cloud.net)|69.163.241.114|:80... connected.
  Created socket 4.
  Releasing 0x00000000014dc720 (new refcount 1).

  ---request begin---
  GET /0.3.4/cirros-0.3.4-x86_64-uec.tar.gz HTTP/1.1
  If-Modified-Since: Tue, 28 Jul 2015 03:00:24 GMT
  Range: bytes=10485760-
  User-Agent: Wget/1.16.3.90-4e56a (linux-gnu)
  Accept: */*
  Accept-Encoding: identity
  Host: download.cirros-cloud.net
  Connection: Keep-Alive

  ---request end---
  HTTP request sent, awaiting response... 
  ---response begin---
  HTTP/1.1 304 Not Modified
  Date: Tue, 28 Jul 2015 03:00:30 GMT
  Server: Apache
  Connection: Keep-Alive
  Keep-Alive: timeout=2, max=100
  ETag: "848176-51580ae5ed140"

  ---response end---
  304 Not Modified
  Registered socket 4 for persistent reuse.
  File ‘cirros-0.3.4-x86_64-uec.tar.gz’ not modified on server. Omitting download.
---

Using --no-if-modified-since, we see the server does notice the range
and returns a 416 (Range Not Satisfiable).

---
$ ./wget --debug --no-if-modified-since --timestamping -c http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-uec.tar.gz
  Setting --timestamping (timestamping) to 1
  Setting --continue (continue) to 1
  DEBUG output created by Wget 1.16.3.90-4e56a on linux-gnu.

  URI encoding = ‘UTF-8’
  --2015-07-28 13:00:41--  http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-uec.tar.gz
  Resolving download.cirros-cloud.net (download.cirros-cloud.net)... 69.163.241.114
  Caching download.cirros-cloud.net => 69.163.241.114
  Connecting to download.cirros-cloud.net (download.cirros-cloud.net)|69.163.241.114|:80... connected.
  Created socket 4.
  Releasing 0x0000000000fbc6c0 (new refcount 1).

  ---request begin---
  HEAD /0.3.4/cirros-0.3.4-x86_64-uec.tar.gz HTTP/1.1
  Range: bytes=10485760-
  User-Agent: Wget/1.16.3.90-4e56a (linux-gnu)
  Accept: */*
  Accept-Encoding: identity
  Host: download.cirros-cloud.net
  Connection: Keep-Alive

  ---request end---
  HTTP request sent, awaiting response... 
  ---response begin---
  HTTP/1.1 416 Requested Range Not Satisfiable
  Date: Tue, 28 Jul 2015 03:00:41 GMT
  Server: Apache
  Vary: Accept-Encoding
  Keep-Alive: timeout=2, max=100
  Connection: Keep-Alive
  Content-Type: text/html; charset=iso-8859-1

  ---response end---
  416 Requested Range Not Satisfiable
  Registered socket 4 for persistent reuse.
  URI content encoding = ‘iso-8859-1’

      The file is already fully retrieved; nothing to do.
---

I feel like wget should take this as an indication there is a
difference in file-size and trigger a re-download as documented.  I
think this happens because I made the local file *larger*, a path that
probably isn't often taken.
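
The behaviour I would expect from the manual's wording can be sketched
roughly as below.  This is only an illustration of the documented rule,
not wget's actual code: should_download is a hypothetical helper, and
the remote size and the time-stamps in the usage line are made-up
values (only the 10485760 local size comes from the truncate above).

```shell
#!/bin/sh
# Hypothetical helper (not part of wget): apply the rule quoted from
# the manual.  Arguments:
#   $1 local size in bytes (empty string if the local file is missing)
#   $2 remote size in bytes
#   $3 local mtime, $4 remote mtime (both as epoch seconds)
# Returns 0 if the file should be downloaded, non-zero otherwise.
should_download() {
    local_size=$1 remote_size=$2 local_mtime=$3 remote_mtime=$4

    # No local file: always download.
    [ -z "$local_size" ] && return 0

    # Sizes differ (local smaller OR larger): download no matter what
    # the time-stamps say.
    [ "$local_size" -ne "$remote_size" ] && return 0

    # Sizes match: download only if the remote copy is newer.
    [ "$remote_mtime" -gt "$local_mtime" ]
}

# Illustrative values: local file grown to 10M, remote assumed smaller,
# local time-stamp newer than the remote one.
should_download 10485760 8000000 200 100 && echo "re-download"
```

Under that reading, a local file that is larger than the remote one
still counts as a size mismatch, whatever the time-stamps are.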

Am I correct in thinking these are bugs, not features?

Thanks,

-i


