From: Evgeny Kapun
Subject: [Bug-wget] [bug #48232] Sometimes wget restarts download from the beginning, even if the server supports resumed downloads
Date: Wed, 15 Jun 2016 15:06:07 +0000 (UTC)
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0

URL:
  <http://savannah.gnu.org/bugs/?48232>

                 Summary: Sometimes wget restarts download from the beginning,
even if the server supports resumed downloads
                 Project: GNU Wget
            Submitted by: abacabadabacaba
            Submitted on: Wed 15 Jun 2016 06:06:05 PM MSK
                Category: Program Logic
                Severity: 3 - Normal
                Priority: 5 - Normal
                  Status: None
                 Privacy: Public
             Assigned to: None
         Originator Name: 
        Originator Email: 
             Open/Closed: Open
         Discussion Lock: Any
                 Release: 1.18
        Operating System: GNU/Linux
         Reproducibility: Every Time
           Fixed Release: None
         Planned Release: None
              Regression: None
           Work Required: None
          Patch Included: None

    _______________________________________________________

Details:

If the connection is interrupted during a download, wget normally tries to
continue from the point where it left off. This only works if the server
supports resumed downloads; otherwise, the download restarts from the
beginning. However, wget sometimes restarts the download even when the server
does support resumption. Testing shows that this happens when a network error
occurs on a retry before wget receives the HTTP response from the server, a
situation that is quite common on poor networks.

For testing, I created a web server that behaves as follows:
* On the first request, it sends a response with `Content-Length: 1000`
and 500 bytes of data, then waits.
* On all other requests, it just waits without sending any data.
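
For readers without access to the attachment, the behaviour described above can be approximated with the sketch below. It is not the attached test program; the names `PARTIAL_RESPONSE`, `serve` and `start_server` are made up for illustration, and the only details taken from this report are the `Content-Length: 1000` header, the 500 bytes of data, and the default address [::1]:8888.

```python
import socket
import threading

# Headers promise 1000 bytes, but only 500 bytes of body are ever sent.
PARTIAL_RESPONSE = b"HTTP/1.0 200 OK\r\nContent-Length: 1000\r\n\r\n" + b"x" * 500


def serve(listener: socket.socket, stop: threading.Event) -> None:
    """Accept connections until `stop` is set; only the first one gets data."""
    held = []       # keep sockets open so clients see a stall, not a close
    first = True
    listener.settimeout(0.2)    # wake up regularly to check `stop`
    while not stop.is_set():
        try:
            conn, _ = listener.accept()
        except socket.timeout:
            continue
        held.append(conn)
        if first:
            first = False
            conn.recv(65536)                # read and ignore the request
            conn.sendall(PARTIAL_RESPONSE)  # then go silent
        # Every later connection is accepted and left without any response.
    for conn in held:
        conn.close()
    listener.close()


def start_server(host: str = "::1", port: int = 8888):
    """Start the stalling server in a background thread (hypothetical helper)."""
    family = socket.AF_INET6 if ":" in host else socket.AF_INET
    listener = socket.socket(family, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen()
    stop = threading.Event()
    thread = threading.Thread(target=serve, args=(listener, stop), daemon=True)
    thread.start()
    return listener, stop, thread
```

To use it, call `start_server()` and then `join()` the returned thread to keep the process alive. Because only the first connection is ever answered, the process has to be restarted before every test run.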

Using wget to download from this server produces the following:

$ wget --debug --timeout 1 --tries 4 'http://[::1]:8888/test'
Setting --timeout (timeout) to 1
Setting --tries (tries) to 4
DEBUG output created by Wget 1.18 on linux-gnu.

Reading HSTS entries from $HOME/.wget-hsts
URI encoding = 'ANSI_X3.4-1968'
converted 'http://[::1]:8888/test' (ANSI_X3.4-1968) -> 'http://[::1]:8888/test' (UTF-8)
Converted file name 'test' (UTF-8) -> 'test' (ANSI_X3.4-1968)
--2016-06-15 17:42:36--  http://[::1]:8888/test
Connecting to [::1]:8888... connected.
Created socket 4.
Releasing 0x000055593fd586f0 (new refcount 0).
Deleting unused 0x000055593fd586f0.

---request begin---
GET /test HTTP/1.1
User-Agent: Wget/1.18 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: [::1]:8888
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response... 
---response begin---
HTTP/1.0 200 OK
Content-Length: 1000

---response end---
200 OK
Registered socket 4 for persistent reuse.
Length: 1000
Saving to: 'test'

test                 50%[=========>          ]     500  --.-KB/s    in 1.0s   


Disabling further reuse of socket 4.
Closed fd 4
2016-06-15 17:42:37 (500 B/s) - Read error at byte 500/1000 (Connection timed out). Retrying.

--2016-06-15 17:42:38--  (try: 2)  http://[::1]:8888/test
Connecting to [::1]:8888... connected.
Created socket 4.
Releasing 0x000055593fd586f0 (new refcount 0).
Deleting unused 0x000055593fd586f0.

---request begin---
GET /test HTTP/1.1
Range: bytes=500-
User-Agent: Wget/1.18 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: [::1]:8888
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response... Read error (Connection timed out) in headers.
Closed fd 4
Retrying.

--2016-06-15 17:42:41--  (try: 3)  http://[::1]:8888/test
Connecting to [::1]:8888... connected.
Created socket 4.
Releasing 0x000055593fd586f0 (new refcount 0).
Deleting unused 0x000055593fd586f0.

---request begin---
GET /test HTTP/1.1
User-Agent: Wget/1.18 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: [::1]:8888
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response... Read error (Connection timed out) in headers.
Closed fd 4
Retrying.

--2016-06-15 17:42:45--  (try: 4)  http://[::1]:8888/test
Connecting to [::1]:8888... connected.
Created socket 4.
Releasing 0x000055593fd586f0 (new refcount 0).
Deleting unused 0x000055593fd586f0.

---request begin---
GET /test HTTP/1.1
User-Agent: Wget/1.18 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: [::1]:8888
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response... Read error (Connection timed out) in headers.
Closed fd 4
Giving up.



As the log shows, only the second request includes a `Range` header. From the
third request onward, the `Range` header is omitted, so the download is not
resumed. In practice, this means that a large download can suddenly restart
from the beginning because the network was down for some time, which is
undesirable.
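
The behaviour I would expect can be sketched as follows. This is an illustration only, not wget's actual code; `request_headers` and `simulate` are hypothetical names. The assumption is that the `Range` header should depend only on how many bytes have already been saved, not on whether the previous attempt got as far as receiving headers.

```python
def request_headers(bytes_on_disk):
    """Build request headers; ask for a range whenever data is already saved."""
    headers = {"Accept": "*/*", "Accept-Encoding": "identity"}
    if bytes_on_disk > 0:
        headers["Range"] = f"bytes={bytes_on_disk}-"
    return headers


def simulate(body_bytes_per_try):
    """Return the Range value sent on each try.

    `body_bytes_per_try` lists how many body bytes each attempt received
    before failing; 0 stands for a failure before any headers arrived.
    """
    bytes_on_disk = 0
    sent = []
    for received in body_bytes_per_try:
        sent.append(request_headers(bytes_on_disk).get("Range"))
        bytes_on_disk += received
    return sent


# Same sequence as in the log: 500 bytes on try 1, then three header timeouts.
print(simulate([500, 0, 0, 0]))
# [None, 'bytes=500-', 'bytes=500-', 'bytes=500-']
```

With this logic, tries 3 and 4 would still send `Range: bytes=500-`, whereas in the log above they send no `Range` header at all.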

I attached a test program to reproduce the issue. It listens on [::1]:8888 and
acts as a web server. You need to restart it before every test.

Related bugs:
* #31653: a fix for that bug is probably what introduced this bug. Read the
discussion there.
* #48123: a bug similar to this one, but there it is not clear that the server
supports resumed downloads.



    _______________________________________________________

File Attachments:


-------------------------------------------------------
Date: Wed 15 Jun 2016 06:06:05 PM MSK  Name: wget-test  Size: 597B   By: abacabadabacaba

<http://savannah.gnu.org/bugs/download.php?file_id=37488>

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?48232>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/



