
Re: [Bug-wget] [Bug-Wget] Issues with Metalink support


From: Ángel González
Subject: Re: [Bug-wget] [Bug-Wget] Issues with Metalink support
Date: Sun, 06 Apr 2014 21:41:54 +0200
User-agent: Thunderbird

On 06/04/14 01:09, L Walsh wrote:
> Sorry for the long delay answering this, but I thought
> I would mention a specific reason why this is done
> on Windows (it may apply to Linux to varying degrees,
> depending on the filesystem type used and filesystem activity).

> To answer the question, there is a reason, but
> its importance would be specific to each user's use case.

> It is consistent with how some files from the internet are
> downloaded, copied, or extracted on Windows.

> E.g. IE will download a file to a temp dir (usually
> under the user's home dir on Windows), then
> move it into place when it is done.  This prevents partly
> transferred files from appearing in the destination.

> Downloading this way can also *allow* for allocating
> sufficient contiguous space at the destination in one
> allocation, and then copying the file
> into place -- this allows for less fragmentation at the
> final destination.  This matters more with larger
> files and slower downloads that might stretch over several
> minutes.  Other activity on the disk is likely, and if
> writes occur, they might land in the
> middle of where the downloaded file _could_ have had
> contiguous space.

> So putting a file that is likely to be fragmented as it
> is downloaded (due to other processes running) into
> a temp location makes it possible to know the full size
> and allocate the full amount for the file, so it can
> be contiguous on disk.
If %TEMP% is on the same drive as the final folder, you still
get fragmentation.

> It can't allocate the full amount for the file at
> the destination until it has the whole thing locally, since
> if the download were interrupted, the destination would contain
> a file that looks to be the right size but holds
> an incomplete download.
It's possible (with some filesystems) with the Linux-specific fallocate() syscall,
but that's hardly portable :)
From Vista onwards, SetFileInformationByHandle() with FILE_ALLOCATION_INFO
<http://msdn.microsoft.com/en-us/library/windows/desktop/aa364214%28v=vs.85%29.aspx>
seems able to do that as well.
I would make it fail gracefully on the EOLed versions, but it seems perfectly
fine to use.
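
A minimal sketch of that Windows path, assuming we already have a Win32 HANDLE
for the output file and know the expected length (the helper name is
illustrative, not actual wget code); resolving the symbol at run time is what
lets it fail gracefully on the EOLed versions:

#ifdef _WIN32
#ifndef _WIN32_WINNT
# define _WIN32_WINNT 0x0600   /* pull in the Vista-era declarations */
#endif
#include <windows.h>

/* Best-effort preallocation for the file behind `file`.
   Returns 0 on success, -1 if the API is missing or the call fails. */
static int
win32_preallocate (HANDLE file, LONGLONG size)
{
  typedef BOOL (WINAPI *set_info_fn) (HANDLE, FILE_INFO_BY_HANDLE_CLASS,
                                      LPVOID, DWORD);

  /* Resolve the symbol at run time so the binary still starts on
     pre-Vista systems that don't export it. */
  set_info_fn set_info = (set_info_fn)
    GetProcAddress (GetModuleHandleA ("kernel32.dll"),
                    "SetFileInformationByHandle");
  if (!set_info)
    return -1;

  FILE_ALLOCATION_INFO info;
  info.AllocationSize.QuadPart = size;

  return set_info (file, FileAllocationInfo, &info, sizeof info) ? 0 : -1;
}
#endif /* _WIN32 */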

> Anyway -- the behavior of copying it to a temp location is a useful
> feature to have -- IF you have the space.  It would be
> a "nice" (not required) feature if there were an option for
> how to do this (i.e. store the file directly on download, or
> use a tmpdir and then move (or copy) the file into the
> final location).

> Always going direct is safest if the user is tight on disk space,
> but has the drawback of often causing more disk fragmentation.
Not if you do something like calling posix_fallocate(3)
(but it does change the file size).
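
For completeness, a minimal POSIX-side sketch (again an illustrative helper,
not actual wget code): the Linux-only fallocate() with FALLOC_FL_KEEP_SIZE
reserves the blocks without touching the visible size, while the portable
posix_fallocate() fallback does grow the file to the requested length:

#define _GNU_SOURCE              /* fallocate() and FALLOC_FL_KEEP_SIZE */
#include <sys/types.h>
#include <fcntl.h>

/* Reserve `length` bytes for the file behind `fd`, best effort.
   Returns 0 on success, non-zero otherwise. */
static int
posix_preallocate (int fd, off_t length)
{
#ifdef __linux__
  /* Linux-specific: reserves the blocks without changing st_size,
     so a partial download never looks like a complete file. */
  if (fallocate (fd, FALLOC_FL_KEEP_SIZE, 0, length) == 0)
    return 0;
  /* Fall through on filesystems that don't support it. */
#endif
  /* Portable fallback: also allocates the range, but the file size
     grows to `length`.  Returns an error number rather than -1/errno. */
  return posix_fallocate (fd, 0, length);
}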

> (FWIW, I don't really care one way or the other, but wanted
> to tell you why it might be useful)...

> Cheers!
> Linda

If you don't want to download with the final filename, I vote for
downloading in the same folder with another extension and
renaming.
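
A rough sketch of that, assuming a hypothetical ".tmp" suffix and a fetch
callback (neither is what wget actually does); because the temporary name
lives in the same directory, rename() is just a metadata operation and a
partial transfer never shows up under the final name:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Download to "<final_name>.tmp" in the same directory, then rename
   into place; on failure the partial file is removed instead. */
static int
download_via_tempname (const char *final_name,
                       int (*fetch) (const char *dest))
{
  size_t len = strlen (final_name);
  char *tmp_name = malloc (len + sizeof ".tmp");
  if (!tmp_name)
    return -1;
  memcpy (tmp_name, final_name, len);
  memcpy (tmp_name + len, ".tmp", sizeof ".tmp");   /* includes the NUL */

  int err = fetch (tmp_name);              /* write the data to the temp name */
  if (err == 0)
    err = rename (tmp_name, final_name);   /* same directory: no data copy */
  else
    remove (tmp_name);                     /* drop the partial file */

  free (tmp_name);
  return err;
}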

I don't think wget should care about fragmentation, though.

Looking a bit at the available options, and trying to get the best of
both sides, I think we should download the file in place, trying
to preallocate its blocks (fallocate, SetFileInformationByHandle)
when possible, but not worrying too much if we can't.

Cheers


