
From: markk
Subject: [Bug-wget] Feature request/suggestion: option to pre-allocate space for files
Date: Tue, 24 Jan 2012 15:22:56 -0000
User-agent: SquirrelMail/1.4.21

Hi,

This post is to suggest a new feature for wget: an option to pre-allocate
disk space for downloaded files. (Maybe have a --pre-allocate command-line
option?)

The ability to pre-allocate space for files would be useful for a couple
of reasons:

- Pre-allocating all the space before downloading avoids the risk of wget
exiting part-way through with a disk-full error. When downloading from a
server that doesn't support resuming downloads, an accidental disk-full
condition means you have to re-download the whole file after freeing up
some disk space. That wastes a lot of time and network bandwidth.

- Disk fragmentation can be reduced. Downloading large files can take many
hours. While wget is downloading, other programs (web browser cache, email
client etc.) can cause a lot of other disk activity. As a result, the wget
output file can end up unnecessarily fragmented. Likewise, files written
by other programs while wget is running end up more fragmented.

On Linux, fallocate() and posix_fallocate() can be used to pre-allocate
space. The advantage of fallocate(), which is Linux-specific, is that with
the FALLOC_FL_KEEP_SIZE flag, space is allocated but the apparent file
size is unchanged. That means resuming with --continue works as normal.
posix_fallocate(), on the other hand, extends the file to its full
length, meaning that --continue won't work unless there is some way to
specify the byte offset that wget should continue from.
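
The difference between the two calls could be sketched roughly as
follows. This is only an illustration for Linux with glibc, not code
from wget; the helper names reserve_keep_size, reserve_full_size and
apparent_size are made up for this example.

```c
#define _GNU_SOURCE   /* exposes fallocate() and FALLOC_FL_KEEP_SIZE in glibc */
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of wget: reserve `len` bytes for `fd`
   while leaving the apparent file size alone (Linux-specific).
   Returns 0 on success, -1 with errno set on failure, e.g. ENOSPC when
   the disk is too full or EOPNOTSUPP when the filesystem cannot do it. */
static int reserve_keep_size(int fd, off_t len)
{
    assert(len > 0);
    return fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, len);
}

/* Hypothetical helper: the POSIX variant. On success the file is at
   least `len` bytes long, so its size no longer shows how much data
   has actually been downloaded.
   Returns 0 on success or an error number (errno is not set). */
static int reserve_full_size(int fd, off_t len)
{
    assert(len > 0);
    return posix_fallocate(fd, 0, len);
}

/* Apparent size of the file, or -1 if fstat() fails. */
static off_t apparent_size(int fd)
{
    struct stat st;
    return fstat(fd, &st) == 0 ? st.st_size : (off_t)-1;
}
```

On a freshly created file, apparent_size() should still report 0 after
reserve_keep_size(), whereas after a successful reserve_full_size() it
reports the reserved length. A caller would probably want to treat
EOPNOTSUPP as non-fatal and simply download without a reservation.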

The fallocate program (see "man 1 fallocate") can be used to manually
pre-allocate space. For a single file that's a slight hassle but simple
enough. (Run wget to determine the file length, interrupt it, use
fallocate to allocate the space, then re-run wget.) But when using wget to
download many files in one session it's not really practical.

Of course, if the web server does not report the file size, it won't be
possible to pre-allocate space. Or would it...? Suppose the user is
downloading some CD ISO images from a server which does not report file
lengths. If the user could tell wget to pre-allocate 800 MB for each file,
and then have wget call ftruncate() to trim each file to its actual length
once it has finished downloading, that should achieve a result almost as
good as if the server did report file lengths.
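
A rough sketch of that "reserve a guess, then trim" idea, assuming Linux
with glibc. It is not wget code: write_all and save_with_guess are
hypothetical names, and the body is passed in as a buffer only to keep
the example short. The reservation here uses fallocate() with
FALLOC_FL_KEEP_SIZE; whether the final ftruncate() releases the unused
blocks depends on the filesystem.

```c
#define _GNU_SOURCE   /* exposes fallocate() and FALLOC_FL_KEEP_SIZE in glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write the whole buffer, retrying on short writes
   and on EINTR. Returns 0 on success, -1 with errno set on failure. */
static int write_all(int fd, const char *buf, size_t n)
{
    while (n > 0) {
        ssize_t w = write(fd, buf, n);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += w;
        n -= (size_t)w;
    }
    return 0;
}

/* Hypothetical helper: save a body whose length the server did not
   report, reserving `guess` bytes first (e.g. 800 MB for a CD image).
   Returns 0 on success, -1 with errno set on failure. */
static int save_with_guess(int fd, off_t guess,
                           const char *body, size_t body_len)
{
    assert(guess > 0);

    /* Best effort: if the reservation fails, download anyway. */
    (void)fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, guess);

    if (write_all(fd, body, body_len) != 0)
        return -1;

    /* Trim the file to the number of bytes actually received. */
    return ftruncate(fd, (off_t)body_len);
}
```

If the body turns out to be longer than the guess, the writes simply
extend the file in the usual way, so an underestimate costs nothing
beyond the lost benefit of the reservation.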


-- Mark




