wget-dev
Re: Minor usage issues/suggestions


From: Tim Rühsen
Subject: Re: Minor usage issues/suggestions
Date: Fri, 16 Jun 2023 15:13:33 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.12.0

Hey Mitch,

thank you very much for bringing up these issues.
Thank you also for your contributions on GitHub; I just had a (very quick) look.

Would it be a problem for you to sign the FSF copyright assignment for wget2? The background is that the copyright for all source code contributions to wget2 is assigned to the FSF. While I can accept "trivial" contributions without the paperwork, your work is beyond that :)
We can switch to private email to talk about the details.

Regarding the issues mentioned here, I'd say you hit the nail on the head, and we have to (or want to) settle them.

Personally, I am most concerned about the deadlock. Any help from you as a native Windows user is highly appreciated. There might be an issue in the gnulib thread function wrappers, as I don't get deadlocks on Linux.

My time to code on OSS has been pretty limited for about 2 years, so fine-grained issues (best on GitLab) would help to fix things in small iterations (in the sense of "better to open many concrete issues than one generic issue that covers everything").

Will come back to this email later and will also review your contributions on GH - have to run to an appointment now.

Tim (he/him)

On 6/14/23 13:24, Mitch Capper wrote:
I wasn't sure of a great place to put this, as there wasn't a discussions option
on GitHub/GitLab.


Some of these I can turn into bug reports (at least once some of the other
Windows issues that may make reproing them difficult are patched).  I also didn't
want to just create several bug reports for something more discussion-based.

**Progress Bar**
- The progress bar can be misleading when the file has an unknown size, as the
total slots size is used.
- If you have a redirect to a file, the redirect request bytes are counted
in.

**Chunked download items** (I know
https://gitlab.com/gnuwget/wget2/-/issues/195 highlights some of the
issues):
- Chunked downloads don't show the total size, only the size of what has
been downloaded plus the size of the current chunk.  If you have a 10meg file
with 1 meg chunks it shows the percent for 0-10MB, then goes to 0-20MB
(being half done), and so on.
- The chunked description says "--chunk-size=size Download large files in
multithreaded chunks.".  That may be technically true, but as the chunks
are downloaded serially I don't think that is what one would expect.
- Chunks count as multiple files for the progress bar, meaning it no longer
shows the filename and confusingly shows "X files" with whatever the count
is.  In addition, failing chunks seem to just increment the file counter.
- Despite successfully downloading the chunks, it seems to count them as errors:
```
wget2-debug --chunk-size=1M "http://localhost:1000/tester.bin?size=4M&speed=0.1M"
4 files              100% [=======================================================================>]     4.00M  105.04KB/s
                           [Files: 1  Bytes: 4.00M [103.29KB/s] Redirects: 0  Todo: 0  Errors: 4    ]
```
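
As a rough illustration of the accounting described above (not wget2's actual code), a chunked download could be tracked as one logical file whose percentage is computed against the full size rather than against the chunks seen so far. The class and field names below are hypothetical, and the sketch assumes the total size is known up front:

```python
from dataclasses import dataclass


@dataclass
class ChunkedProgress:
    """Tracks one logical file that is fetched as several ranged chunks."""
    filename: str
    total_size: int   # full file size in bytes (assumed known up front)
    chunk_size: int   # size of each ranged request in bytes
    received: int = 0  # bytes received across all chunks so far

    def add_bytes(self, n: int) -> None:
        # Every chunk feeds the same counter, so the bar never resets.
        self.received = min(self.received + n, self.total_size)

    @property
    def chunk_count(self) -> int:
        # Ceiling division: the last chunk may be shorter than chunk_size.
        return -(-self.total_size // self.chunk_size)

    @property
    def percent(self) -> int:
        if self.total_size == 0:
            return 100
        return self.received * 100 // self.total_size


MIB = 1024 * 1024
progress = ChunkedProgress("tester.bin", total_size=4 * MIB, chunk_size=1 * MIB)
progress.add_bytes(1 * MIB)    # first chunk finished
progress.add_bytes(MIB // 2)   # second chunk half done
print(f"{progress.filename} {progress.percent}% ({progress.chunk_count} chunks)")
```

With this kind of bookkeeping the bar would keep showing the filename and a single percentage, and the chunk count would stay separate from the file count.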

**Ranges** (https://gitlab.com/gnuwget/wget2/-/issues/626 covers some of
these):
- Does not check that the Accept-Ranges header is sent by the web server.
- Does not look for the Content-Range header in the response.
- Does not look for the 206 Partial Content status on responses.
- If the server sends the entire file in response to a chunk request, it is
ignored, even though wget now has everything.
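
The checks listed above could look roughly like the following sketch. It is not taken from wget2; the function names and return values are hypothetical, and it only handles single `bytes` ranges:

```python
import re
from typing import Mapping

# Matches e.g. "bytes 0-1048575/4194304" or "bytes 0-1048575/*".
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def _lower_keys(headers: Mapping[str, str]) -> dict:
    # HTTP header names are case-insensitive.
    return {k.lower(): v.strip() for k, v in headers.items()}


def supports_ranges(headers: Mapping[str, str]) -> bool:
    """True if the server advertised byte ranges via Accept-Ranges."""
    value = _lower_keys(headers).get("accept-ranges", "")
    return "bytes" in [token.strip().lower() for token in value.split(",")]


def classify_range_response(status: int, headers: Mapping[str, str],
                            want_start: int, want_end: int) -> str:
    """Return 'partial', 'full', or 'invalid' for a reply to a Range request."""
    if status == 200:
        # The server ignored the Range header and sent the whole file,
        # so the body can be kept instead of being thrown away.
        return "full"
    if status != 206:
        return "invalid"
    match = _CONTENT_RANGE.match(_lower_keys(headers).get("content-range", ""))
    if not match:
        return "invalid"
    start, end = int(match.group(1)), int(match.group(2))
    # Must begin where we asked and must not run past the requested end.
    if start != want_start or end > want_end or end < start:
        return "invalid"
    return "partial"


print(classify_range_response(
    206, {"Content-Range": "bytes 0-1048575/4194304"}, 0, 1048575))
```

Treating a 200 reply as "full" is the design choice that covers the last bullet: the remaining chunk requests can then be cancelled.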

**Stats/Logging/errors and exit codes**

First, turning on "full" debugging is a bit tricky. If you want to dump all
stats/info, it is something along the lines of:
`-v -d --stats-site=- --stats-tls=- --stats-ocsp=- --stats-server=- --stats-dns=-`
Oh, and don't forget that --progress=bar (which is auto-detected as on)
silently disables all info-level logging, so truly:
`-v -d --stats-site=- --stats-tls=- --stats-ocsp=- --stats-server=- --stats-dns=- --progress=none`
One of my commits does warn about the progress bar logging issue if debug
is turned on.
It looks like there was a --stats-all option previously, but I am guessing it
was removed due to logging thread collisions if the same filename was used.

I think one of the biggest issues is that the progress bar disables all
standard info-level logging.  Depending on the download issue, determining
that an error occurred can be quite difficult.  For example, with the bar off:

```
wget2-official.exe https://httpstat.us/401 --progress=none
[0] Downloading 'https://httpstat.us/401' ...
HTTP ERROR response 401 Unauthorized [https://httpstat.us/401]
```
with default (bar on):
```
wget2-official.exe  https://httpstat.us/401
0 files              100% [====================>]      32     --.-KB/s
                           [Files: 0  Bytes: 0  []
```

There is no error output; you have to look and see that 0 files were
downloaded.  If you have several files downloading at once it gets more
confusing, as now you need to know the total number of files you were
downloading and compare that to the count.  Good luck figuring out which failed.

I would assume this is a bug, but at least some failures to fetch actually
don't set the exit code correctly:
```
wget2-official.exe  https://httpstat.us/400 ; echo "Last Exit: $LASTEXITCODE"
0 files              100% [====================>]      15     --.-KB/s
                           [Files: 0  Bytes: 15  ]
Last Exit: 0
wget2-official.exe --progress=none  https://httpstat.us/400 ; echo "Last Exit: $LASTEXITCODE"
[0] Downloading 'https://httpstat.us/400' ...
HTTP ERROR response 400 Bad Request [https://httpstat.us/400]
Last Exit: 0
# vs. a working error:
wget2-official.exe  https://httpstat.us/401 ; echo "Last Exit: $LASTEXITCODE"
0 files              100% [====================>]      32     --.-KB/s
                           [Files: 0  Bytes: 0  []
Last Exit: 6
```
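
To make the expected behaviour concrete, here is a small sketch of the mapping one might expect. It assumes wget2 is meant to follow the exit statuses documented for GNU Wget 1.x (6 for an authentication failure, 8 for a server error response); that assumption and the function name are mine, not something confirmed for wget2:

```python
def expected_exit_status(http_status: int) -> int:
    """Exit status one would expect for a single failed or successful fetch.

    Assumption: the GNU Wget 1.x exit status table applies
    (6 = authentication failure, 8 = server issued an error response).
    """
    if http_status == 401:
        return 6
    if http_status >= 400:
        return 8
    return 0


for status in (200, 400, 401):
    print(status, "->", expected_exit_status(status))
```

Under that assumption the 401 case above already behaves as expected, while the 400 case would need a non-zero exit status.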

Part of the problem is the excessive truncation of the full status line
under the progress bar:
`[Files: 1  Bytes: 4.00M [103.29KB/s] Redirects: 0  Todo: 0  Errors: 4    ]`
It seems to require minimum padding on each side to look "centered", but that
means it truncates even in a console where it could show far more, for example:
```
4 files               88% [===================================>     ]   3.53M  102.74KB/s
                           [Files: 1  Bytes: 3.00M [103.46KB/s] Redir]
```
That is a 90-character-wide console, enough for the full bar, but it is truncated
to 43 characters.  At a minimum it may be worth reordering the fields:
errors are probably pretty high up on most people's priority list,
yet they appear last (especially given the lack of info otherwise).
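
A minimal sketch of the reordering idea, assuming fields are listed by priority and that whole fields are dropped when the line is too narrow instead of cutting one off mid-word. The helper name and the exact field order are illustrative only:

```python
def format_status(fields, width):
    """Join 'name: value' fields in priority order within `width` columns.

    Fields that do not fit are dropped whole, lowest priority first.
    """
    inner = width - 2  # room for the enclosing brackets
    parts = []
    used = 0
    for name, value in fields:
        piece = f"{name}: {value}"
        extra = len(piece) + (2 if parts else 0)  # two-space separator
        if used + extra > inner:
            break  # everything after this has lower priority anyway
        parts.append(piece)
        used += extra
    return "[" + "  ".join(parts) + "]"


# Errors first, since that is what the user most needs to see.
fields = [("Errors", 4), ("Files", 1), ("Bytes", "3.00M"),
          ("Todo", 0), ("Redirects", 0)]
print(format_status(fields, 43))
print(format_status(fields, 90))
```

Even at the 43-character width seen above, the error count would stay visible with this ordering.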


Finally, there seems to be a deadlock condition that can happen (at least
on Windows) with multiple downloads.  Sometimes it can occur while just two
downloads of similarly large files are happening (where both will basically
halt).  It seems far easier to reproduce, though, if you download some large
files and some smaller files at the same time, e.g.:
wget2-debug -O /dev/null "http://localhost:8123/tester.bin?size=1.1G" "http://localhost:8123/tester.bin?size=1.0G" "http://localhost:8123/tester.bin?size=1.2G" "http://localhost:8123/tester.bin?size=50M" "http://localhost:8123/tester.bin?size=1M" "http://localhost:8123/tester.bin?size=2M" --max-threads 15
This reproduces it nearly every time.  While it may be related
to the poll code, it happened prior to my poll changes (and it would seem
odd that a file ending would cause an issue).  The downloads do eventually
resume when it times out, but that can take a while.

~mitch (they, them)


