
Re: [Bug-wget] Enqueue logic problems


From: Darshit Shah
Subject: Re: [Bug-wget] Enqueue logic problems
Date: Thu, 2 May 2013 17:22:03 +0530

I should have been clearer. --span-hosts does enqueue the files on
www.gnu.org, but it also enqueues files from every other host. I want to
recursively download one website, not the other sites that it links to.

Of course I could add --accept-regex / --reject-regex options to prevent
wget from wandering onto other hosts. But shouldn't the default --recursive
option simply handle cases where a www is either added or removed? Or is
there any scenario that I am missing which would cause undesirable effects
here?
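
To make the proposal concrete, here is a minimal sketch of the comparison I have in mind. It is Python for illustration only and is not wget's actual code; the helper names strip_www and same_site are hypothetical, and it assumes that only a single leading "www." label should be ignored.

```python
from urllib.parse import urlsplit


def strip_www(host: str) -> str:
    """Drop one leading 'www.' label so gnu.org and www.gnu.org compare equal."""
    host = host.lower().rstrip(".")
    return host[4:] if host.startswith("www.") else host


def same_site(parent_url: str, target_url: str) -> bool:
    """Return True when both URLs point at the same host, ignoring 'www.'."""
    parent = urlsplit(parent_url).hostname or ""
    target = urlsplit(target_url).hostname or ""
    return strip_www(parent) == strip_www(target)
```

With this check a link from gnu.org to www.gnu.org would be enqueued, while links to any other host, including other subdomains, would still be skipped unless --span-hosts is given.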

On Thu, May 2, 2013 at 5:22 PM, Giuseppe Scrivano <address@hidden> wrote:

> Darshit Shah <address@hidden> writes:
>
> > When using the --recursive option with wget, there seems to be a small
> > issue with the logic that decides whether to enqueue a file to the
> > downloads list or not.
> >
> > By default wget downloads files only from the same host. However, this
> > causes a problem when the target hostname changes thus:
> > parent: gnu.org
> > target: www.gnu.org
> >
> > This issue causes wget to stop after just one download on a lot of sites.
> > I'm not sure whether it exists in older versions or the current release,
> > since I only have the development version installed.
>
> does --span-hosts fix this scenario for you?
>
> Cheers,
> Giuseppe
>



-- 
Thanking You,
Darshit Shah
Research Lead, Code Innovation
Kill Code Phobia.
B.E.(Hons.) Mechanical Engineering, '14. BITS-Pilani

