Re: [Bug-wget] limit download size -- 201901233


From: Yousong Zhou
Subject: Re: [Bug-wget] limit download size -- 201901233
Date: Thu, 24 Jan 2019 12:32:39 +0800

On Thu, 24 Jan 2019 at 12:11, <address@hidden> wrote:
>
> ----- Yousong Zhou <address@hidden> wrote :
> > On Thu, 24 Jan 2019 at 02:32, Tim Rühsen <address@hidden> wrote:
> > >
> > > On 23.01.19 03:47, address@hidden wrote:
> > > > Hi,
> > > > >   according to
> > > >     $wget --help
> > > >   i should send reports and suggestions to this address, so i hope i'm 
> > > > doing right here.
> > > >
> > > >    the version of my distribution, given by the above command, is "GNU 
> > > > Wget 1.18"
> > > >
> > > >    and i don't seem to see an option to limit the retrieval to a 
> > > > certain amount of data or a range.
> > > >    is it possible?
> > > >
> > > > thanks in advance and happy new year,
> > > >
> > > > Zui
> > > > 201901233
> > > >
> > >
> > > You could set the Range HTTP header - many servers support it.
> > >
> > > Like
> > >
> > > wget --header "Range: bytes=0-10000" https://www.example.com/filename
> > >
> > > Regards, Tim
> > >
> >
> > At least for wget 1.19.1, it will ignore 206 "Partial Content" unless
> > we make it think it's continuing a previous partial download.
> > Specifying the Range header is not a reliable option in this regard.
> >
> >     echo -n aaa >b
> >     wget -c -O b --header "Range: bytes=3-1000" URL
> >
> >                 yousong
> Thank you both for your input...
>   and, as yousong wrote, the Range header is not handled correctly by wget 
> (boring parts removed):
>     $ wget --header "Range: bytes=500-1000" https://free.fr
>       --2019-01-24 02:22:25--  https://server.dom/
>       Resolving server.dom (server.dom)... <addr>
>       Connecting to server.dom (server.dom)... <addr>  connected.
>       HTTP request sent, awaiting response... 206 Partial Content
>       Retrying.
>
>       --2019-01-24 02:22:26--  (try: 2)  https://server.dom/
>       Connecting to server.dom (server.dom)... <addr>  connected.
>       HTTP request sent, awaiting response... 206 Partial Content
>       Retrying.
>
>       <...loop of retries...>
>
>   but curl is not free of problems either (both cases fetch the whole thing):
>       $ curl   https://ddg.gg > a
>         % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                        Dload  Upload   Total   Spent    Left  Speed
>       100   178  100   178    0     0    310      0 --:--:-- --:--:-- --:--:--   370
>       $ curl --header "Range: bytes=10-40"  https://ddg.gg > a
>         % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                        Dload  Upload   Total   Spent    Left  Speed
>       100   178  100   178    0     0    314      0 --:--:-- --:--:-- --:--:--   376
>

curl has --range specifically for this.
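
A minimal sketch of the --range approach, assuming a curl build with
file:// support so the demo needs no network. fetch_range is a
hypothetical helper name, not part of curl, and for HTTP the server
still has to honour Range requests:

```shell
# fetch_range URL START END
# Print bytes START..END (inclusive) of URL to stdout.
# Hypothetical helper; it only wraps curl's --range option.
fetch_range() {
    curl --silent --show-error --range "$2-$3" "$1"
}

# Demo against a local file (assumption: file:// is enabled in this curl).
demo=$(mktemp)
printf '0123456789abcdefghij' > "$demo"
out=$(fetch_range "file://$demo" 5 9)
echo "$out"
rm -f "$demo"
```

Against a real server the call would look like
`fetch_range https://www.example.com/filename 0 10000`.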

>   as for using "| head -c (end-start)" as you apply in mget, doesn't it 
> actually generate more traffic
>   than the expected (end-start) number of bytes?
>   (i mean, since the download goes systematically till the end, if i am 
> correct)
>
> zui
> 201901244

When head quits, wget writing to stdout will receive SIGPIPE and is
expected to quit.  It's likely that buffering in wget may cause some
excess traffic to be transferred on the wire, but I think the amount
should be negligible.
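
A rough sketch of that behaviour, with a hypothetical producer function
standing in for `wget -q -O - URL` (assumption: any endless writer
behaves the same way for this purpose). Once head has its bytes it
exits, and the next write by the producer fails, so the pipeline ends:

```shell
# producer is a hypothetical stand-in for `wget -q -O - URL`: it writes
# forever until a write to stdout fails or SIGPIPE kills it.
producer() {
    while printf 'aaaaaaaaaaaaaaaa'; do :; done 2>/dev/null
}

limit=100
# head keeps the first $limit bytes and exits; the producer stops shortly after.
out=$(producer | head -c "$limit")
echo "${#out} bytes kept"
```

With wget the same shape would be `wget -q -O - URL | head -c 10001`.
How much extra data is read before wget notices the closed pipe depends
on its buffering and on what is already in flight.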

                yousong


