bug-wget



From: Emanuel Czirai
Subject: [bug #62869] if retry hits a 302 FOUND wget forgets to send the Range header thus appending the whole file to what's downloaded alrdy
Date: Sat, 6 Aug 2022 01:49:13 -0400 (EDT)

URL:
  <https://savannah.gnu.org/bugs/?62869>

                 Summary: if retry hits a 302 FOUND wget forgets to send the
Range header thus appending the whole file to what's downloaded alrdy
                 Project: GNU Wget
               Submitter: correabuscar
               Submitted: Sat 06 Aug 2022 05:49:12 AM UTC
                Category: Program Logic
                Severity: 3 - Normal
                Priority: 5 - Normal
                  Status: None
                 Privacy: Public
             Assigned to: None
         Originator Name: 
        Originator Email: 
             Open/Closed: Open
                 Release: trunk
         Discussion Lock: Any
        Operating System: GNU/Linux
         Reproducibility: Every Time
           Fixed Release: None
         Planned Release: None
              Regression: None
           Work Required: None
          Patch Included: Yes


    _______________________________________________________

Follow-up Comments:


-------------------------------------------------------
Date: Sat 06 Aug 2022 05:49:12 AM UTC By: Emanuel Czirai <correabuscar>
Hello.
I've encountered this append bug on Gentoo with wget-1.21.3-r1 while Portage
was downloading the file android-studio-2022.1.1.9-linux.tar.gz for Android
Studio Canary (a 1G file, which ended up as 1.6G on disk and was thus corrupt
due to this bug).

I have not yet attached the file *problem_on_real_url.log*, which shows wget's
output from the second time I reproduced the above; that run yielded a file
that was 24 MiB too large. I haven't redacted anything in it (such as my IP
address). I haven't attached it yet because only 4 files can be attached; if
you need to see it, let me know and I will attach it in the next comment.

I couldn't reproduce it every time because those Google servers don't always
reply with a 302 Found after a timeout, and they don't always time out either.

So I've come up with a test that always reproduces this issue (unfortunately,
I couldn't figure out how to make it a test case; the test suite doesn't seem
to have the needed functionality). A server pretends to time out in the middle
of the transfer. When wget retries, the server replies with a 302 Found
<https://www.rfc-editor.org/rfc/rfc7231.html#section-6.4.3> and redirects to
another server. This is when wget forgets to send the Range header, which
specifies where the server should resume sending the file. The server
therefore sends the full file from the beginning, while wget still acts as if
the data starts at the resume point, and so it appends the full file to
whatever it had already downloaded before the timeout (and the 302) occurred.
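
For readers without the attachments, the server behaviour described above can
be sketched roughly as follows. This is not the attached a.py, only a minimal
approximation under stated assumptions: the paths /file and /moved are
hypothetical, the redirect stays on the same server instead of going to
another one, and the connection is closed early rather than stalled until a
timeout.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

BODY = b"Hello World.\r\n"   # the file the client tries to fetch
CUT = len(b"Hello ")         # bytes delivered before the simulated failure


def respond(path, range_header, attempt):
    """Return (status, headers, body_to_send) for one GET request.

    `attempt` counts the requests seen for /file, starting at 1.
    """
    if path == "/file" and attempt == 1:
        # Promise the whole file but deliver only the first CUT bytes.
        return 200, {"Content-Length": str(len(BODY))}, BODY[:CUT]
    if path == "/file":
        # Every retry is redirected to a different location.
        return 302, {"Location": "/moved", "Content-Length": "0"}, b""
    # Any other path honours an open-ended "Range: bytes=N-" inside the file;
    # anything else is treated as if no Range header had been sent.
    if range_header and range_header.startswith("bytes=") and range_header.endswith("-"):
        digits = range_header[len("bytes="):-1]
        if digits.isdigit() and int(digits) < len(BODY):
            start = int(digits)
            part = BODY[start:]
            headers = {
                "Content-Range": f"bytes {start}-{len(BODY) - 1}/{len(BODY)}",
                "Content-Length": str(len(part)),
            }
            return 206, headers, part
    return 200, {"Content-Length": str(len(BODY))}, BODY


class Handler(BaseHTTPRequestHandler):
    attempts = 0  # shared across connections; fine for a single-client test

    def do_GET(self):
        if self.path == "/file":
            Handler.attempts += 1
        status, headers, body = respond(
            self.path, self.headers.get("Range"), Handler.attempts)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)
        # The handler speaks HTTP/1.0 by default, so the connection closes
        # after this response; on the first attempt that leaves the client
        # short of the promised Content-Length.


def make_server(port=0):
    """Build (but do not start) the server; port 0 picks a free port."""
    return ThreadingHTTPServer(("127.0.0.1", port), Handler)
```

Calling make_server(8080).serve_forever() and pointing wget at
http://127.0.0.1:8080/file should exercise the same sequence of a truncated
200, then a 302, then either 206 or 200 depending on whether a Range header
arrives.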

I've attached files:

a.py
go
tst
wget_no_append_on302_uponretry.patch


To run the test and check that the bug exists, first run *chmod a+x go tst*,
then run (always as a normal user):

./go

or
to see wget --debug output:

./go --debug

or

./go bug --debug



The last line should be in red: "Bug still present!"

To see how wget behaves when the server doesn't do a 302 redirect after a
timeout (i.e. it never hits this bug), run:

./go nobug --debug


The last line will always be: "Bug is fixed."

To test both:

./tst


For this test script, if the bug is not fixed you get a yellow/brown last
line:
"ok, bug test is fine ie. wget isn't fixed (but it should eventually be, hence
why this is yellow)"

but if the bug is fixed, you get:
"Failed to reveal the bug, was the wget bug fixed?! (assume this is green if
you know that wget got fixed)"



The test tries to wget a file with contents "Hello World.\r\n", but the
server induces a timeout after "Hello ", which causes wget to retry. The
server then replies with a 302, which wget follows, but wget no longer sends a
Range header, so the server replies with 200 OK instead of 206 Partial
Content. When the bug is present, the final file contents are therefore
"Hello Hello World.\r\n", which shows that the whole file ("Hello
World.\r\n") was appended to whatever had already been downloaded (the first
"Hello ").
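
The resulting file contents can be modelled in a few lines. This is only an
illustration of the symptom, not wget's code; the function name and the
check_status switch are made up for the example.

```python
FULL = b"Hello World.\r\n"
PARTIAL = b"Hello "          # what was on disk when the transfer stalled


def apply_retry_response(partial, status, body, check_status):
    """Combine already-downloaded bytes with the response to a retry."""
    if status == 206:
        # Partial Content: the server continued from the requested offset.
        return partial + body
    if status == 200 and check_status:
        # Full file received: discard the partial data and start over.
        return body
    # Blind append, which is the failure mode described above.
    return partial + body


# Retry answered with 200 OK (no Range sent) and appended blindly.
corrupt = apply_retry_response(PARTIAL, 200, FULL, check_status=False)
# Retry answered with 206 Partial Content starting at offset 6.
intact = apply_retry_response(PARTIAL, 206, FULL[len(PARTIAL):], check_status=True)
```

Here corrupt is b"Hello Hello World.\r\n" and intact equals the original
file.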

Apply the attached patch to wget to see a hacky proof-of-concept fix that
makes wget send a Range header after the 302 by pretending that wget was run
with --start-pos=X, where X is the file offset it should have continued from.
It's a hack, not the actual fix.
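
The behaviour the patch aims for, sending the same Range header to the
redirect target, can be sketched independently of wget. This is not the
attached patch and not wget's internals; `get` is a hypothetical transport
callable injected for illustration, and the example URLs are placeholders.

```python
REDIRECTS = (301, 302, 303, 307, 308)


def resume_download(get, url, offset, max_redirects=5):
    """Resume from `offset`, re-sending Range on every hop of a redirect chain.

    `get(url, headers)` is a hypothetical transport that returns
    (status, response_headers, body).
    """
    headers = {"Range": f"bytes={offset}-"} if offset > 0 else {}
    for _ in range(max_redirects + 1):
        status, resp_headers, body = get(url, headers)
        if status in REDIRECTS:
            # Follow the redirect but keep the same request headers,
            # so the new server also learns the resume offset.
            url = resp_headers["Location"]
            continue
        return status, body
    raise RuntimeError("too many redirects")


# Usage with a fake transport that records what it was asked for.
seen = []


def fake_get(url, headers):
    seen.append((url, dict(headers)))
    if url == "http://a.example/file":
        return 302, {"Location": "http://b.example/file"}, b""
    return 206, {}, b"World.\r\n"


status, body = resume_download(fake_get, "http://a.example/file", 6)
```

With the Range header carried across the redirect, the second server can
answer with 206 Partial Content and the client only appends the missing tail.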








    _______________________________________________________
File Attachments:


-------------------------------------------------------
Date: Sat 06 Aug 2022 05:49:12 AM UTC  Name:
wget_no_append_on302_uponretry.patch  Size: 1KiB   By: correabuscar
test for the bug presence and hacky poc patch
<http://savannah.gnu.org/bugs/download.php?file_id=53534>
-------------------------------------------------------
Date: Sat 06 Aug 2022 05:49:12 AM UTC  Name: go  Size: 993B   By: correabuscar
test for the bug presence and hacky poc patch
<http://savannah.gnu.org/bugs/download.php?file_id=53533>
-------------------------------------------------------
Date: Sat 06 Aug 2022 05:49:12 AM UTC  Name: tst  Size: 2KiB   By:
correabuscar
test for the bug presence and hacky poc patch
<http://savannah.gnu.org/bugs/download.php?file_id=53532>
-------------------------------------------------------
Date: Sat 06 Aug 2022 05:49:12 AM UTC  Name: a.py  Size: 8KiB   By:
correabuscar
test for the bug presence and hacky poc patch
<http://savannah.gnu.org/bugs/download.php?file_id=53531>

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?62869>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/



