
Re: [Bug-wget] Multi segment download


From: Darshit Shah
Subject: Re: [Bug-wget] Multi segment download
Date: Wed, 9 Sep 2015 08:55:37 +0530

Thanking You,
Darshit Shah
Sent from mobile device. Please excuse my brevity
On 09-Sep-2015 8:50 am, "Hubert Tarasiuk" <address@hidden> wrote:
>
> On Sat, Aug 29, 2015 at 12:50 AM, Darshit Shah <address@hidden> wrote:
> > Thanking You,
> > Darshit Shah
> > Sent from mobile device. Please excuse my brevity
> > On 29-Aug-2015 1:13 pm, "Tim Rühsen" <address@hidden> wrote:
> >>
> >> Hi,
> >>
> >> Normally it makes much more sense when you have several download mirrors
> >> and checksums for each chunk. The perfect technique for this is called
> >> 'Metalink' (more on www.metalinker.org).
> >> Wget has it in the 'master' branch; it was a GSoC project of Hubert Tarasiuk.
> >>
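As an illustration of the above: with the Metalink support on master, a
download driven by a Metalink file would look roughly like the sketch below.
The option name --input-metalink and the file name are assumptions made for
the example, not something confirmed in this thread; the .meta4 file is what
lists the mirrors and per-chunk checksums.

  # Sketch only: fetch the file described by a Metalink document, letting
  # Wget choose among the listed mirrors and verify the chunk checksums.
  wget --input-metalink=example.meta4
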
> > Sometimes the evil ISPs enforce a per-connection bandwidth limit. In such
> > a case, multi-segment downloads from a single server do make sense.
> >
> > Since Metalink already has support for downloading a file over multiple
> > connections, it should not be too difficult to reuse that code outside of
> > Metalink.
> The current Metalink implementation in Wget will not download from multiple
> mirrors simultaneously, since Wget itself is single-threaded.
> Adding optional (POSIX) threads support to Wget (especially for Metalink)
> could perhaps be worth discussing.
> For now the solution might be to start multiple Wget instances using
> the --start-pos option and somehow limit the length of the download (I am
> not sure whether Wget currently has an option to do that).
>
There isn't, and a trivial implementation of that approach didn't work. When
I tried it, Wget kept retrying the download on a 206 response, because it
believed the server had misunderstood the request. It will take a little more
effort to implement this.
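
To make the idea concrete, a trivial version of the workaround would look
roughly like the sketch below (the byte ranges and URL are only examples); as
described above, Wget balks at the resulting 206 responses, so this does not
yet work as-is:

  # Sketch only: one Wget instance per segment, each requesting its own
  # byte range; the parts are concatenated afterwards.
  url=http://mirror.internode.on.net/pub/test/100meg.test
  wget --header="Range: bytes=0-50000000" -O part1 "$url" &
  wget --header="Range: bytes=50000001-"  -O part2 "$url" &
  wait
  cat part1 part2 > 100meg.test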
> >
> > I think it would be a good idea to do so. I'm not sure whether all the
> > possible variations of the Range header are parsed by Wget.
> >> Additionally, Wget2 is under development and already has the option
> >> --chunk-size (e.g. --chunk-size=1M) to start a multi-threaded download
> >> of a file.
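
A usage sketch of the Wget2 option mentioned above, using the test file that
appears later in this thread:

  # Split the download into 1M chunks fetched over parallel connections.
  wget2 --chunk-size=1M http://mirror.internode.on.net/pub/test/100meg.test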
> >>
> >> Regards, Tim
> >>
> >>
> >> On Friday, 28 August 2015 at 15:41:27, Random Coder wrote:
> >> > On Fri, Aug 28, 2015 at 3:06 PM, Ander Juaristi <address@hidden> wrote:
> >> > > Hi,
> >> > >
> >> > > Would you point us to some potential use cases? How would a Wget user
> >> > > benefit from such a feature? One of the best regarded features of
> >> > > download managers is the ability to resume paused downloads, and
> >> > > that's already supported by Wget. Apart from that, I can't come up
> >> > > with any other use case. But that's just me; maybe you have a broader
> >> > > overview.
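
For reference, the resume capability mentioned above is Wget's existing
-c/--continue option, e.g.:

  # Resume a previously interrupted download of the test file used below.
  wget -c http://mirror.internode.on.net/pub/test/100meg.test
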
> >> > One possible feature, described in flowery language from a product
> >> > description: "... splits files into several sections and downloads
> >> > them simultaneously, allowing you to use any type of connection at the
> >> > maximum available speed. With FDM download speed increases, or even
> >> > more!"
> >> >
> >> > And, just to show this can help, at least in some situations, here's
> >> > an example using curl (sorry, I don't know how to do a similar request
> >> > in wget).  First, a normal download of the file:
> >> >
> >> > curl -o all http://mirror.internode.on.net/pub/test/100meg.test
> >> >
> >> > This command takes an average of 48.9 seconds to run on my current
> >> > network connection.  Now, if I split up the download as the download
> >> > manager would, and run these four commands at the same instant:
> >> >
> >> > curl -o part1 -r0-25000000 http://mirror.internode.on.net/pub/test/100meg.test
> >> > curl -o part2 -r25000001-50000000 http://mirror.internode.on.net/pub/test/100meg.test
> >> > curl -o part3 -r50000001-75000000 http://mirror.internode.on.net/pub/test/100meg.test
> >> > curl -o part4 -r75000001- http://mirror.internode.on.net/pub/test/100meg.test
> >> >
> >> > The total time it takes for all four commands to finish ends up being
> >> > an average of 19.9 seconds over a few test runs on the same connection.
> >> > There's some penalty here because I need to spend time combining the
> >> > files afterwards, but if the command supported this logic internally,
> >> > no doubt much of that work could be done up front as the file is
> >> > downloaded.
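
For completeness, the combining step mentioned above is just a concatenation
of the parts in order, e.g.:

  # Stitch the four ranges back together into one file.
  cat part1 part2 part3 part4 > all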

