bug-wget

[bug #66468] wget --no-clobber sometimes overwrites existing files


From: anonymous
Subject: [bug #66468] wget --no-clobber sometimes overwrites existing files
Date: Thu, 28 Nov 2024 05:09:57 -0500 (EST)

Follow-up Comment #1, bug #66468 (group wget):

No matter what I tried, I was unable to prevent wget from erasing at least
some of my existing 20,000 files while --no-clobber was being used.  Wget
overwrites about 5% of them.

I ended up working around the bug by writing my own URL looping routine in
Bash.  It loads the URLs from a file into an array, loops over them, uses the
last part of each URL as the output filename, and skips any output file that
already exists with data in it.
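
For anyone who wants to try the same workaround, here is a minimal sketch of
such a loop.  It is not the original script: the names fetch, fetch_missing
and urls.txt are made up for illustration, and it assumes Bash 4 or later
(for mapfile) and URLs whose last path component is a usable filename.

```bash
#!/usr/bin/env bash
# Sketch of the per-URL loop described above. The function names and
# urls.txt are hypothetical; they are not taken from the original script.

# Download one URL ($1) into the named file ($2), one wget run per URL.
fetch() {
    wget --quiet --output-document="$2" "$1"
}

# Read URLs from the file given as $1 and fetch only the missing ones.
fetch_missing() {
    local url_file=$1 url out
    local -a urls
    # Load the URLs from the file into an array, one per line.
    mapfile -t urls < "$url_file"
    for url in "${urls[@]}"; do
        [[ -n $url ]] || continue   # ignore blank lines
        out=${url##*/}              # last part of the URL
        [[ -n $out ]] || continue   # URL ends in "/": no filename to use
        if [[ -s $out ]]; then      # file exists and has data in it
            echo "skip: $out"
            continue
        fi
        fetch "$url" "$out"
    done
}

# Usage: fetch_missing urls.txt
```

The -s test is true only for files that exist and are non-empty, so an empty
file left behind by a failed download is fetched again on the next run, while
any file that already holds data is never handed to wget at all.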

The downside to doing it this way is that wget is started 20,000 times and
connects to the remote server 20,000 times, because a Keep-Alive connection
cannot be reused across separate runs.  It's slower and puts more strain on
the remote server, but it never erases local files.

I think silently erasing possibly irreplaceable local data files when being
_explicitly told not to touch them_ is a really major bug that should be
addressed!

A simple example of where this can cause mass destruction is someone archiving
pages on a web site as soon as they appear, who never wants later changes to
those pages (including a page being removed) to reach the local copies.  As it
is, wget can randomly and silently erase these locally archived files despite
using --no-clobber.

Very dangerous.


    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?66468>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
