


From: Tim Ruehsen
Subject: Re: [Bug-wget] Help request: Limit recursion, but unconditionally include all media files
Date: Tue, 22 Oct 2013 09:26:32 +0200
User-agent: KMail/4.10.5 (Linux/3.10-3-amd64; KDE/4.10.5; x86_64; ; )

On Monday 21 October 2013 12:33:10 Alexander Tobias Heinrich wrote:
> For example, I tried:
> wget --tries=3 --retry-connrefused --no-clobber --load-cookies=cookies.txt
> --convert-links --page-requisites --adjust-extension --recursive
> --include-directories /strategy/live-poker,/download
> http://www.pokerstrategy.com/strategy/live-poker
> 
> This correctly downloads only the html documents I want and also gets the
> media files from the /download folder, but:
> - does not modify the html so that <img>-Tags point to the downloaded files
> (however, it does modify <a>-Tags that link to local html documents)
> - does not get media files from other domains.
> 
> If for example I add --span-hosts, it simply gets too much (all documents
> from different language versions of the website that I don't need).
> 
> Note: For the example URL I provided here you won't need to log in and thus
> the --load-cookies option can be omitted.

Hi Alexander,

Please have a look at the 'Recursive Accept/Reject Options' section of the docs.

You can restrict which domains are followed with --domains (it only has an
effect together with --span-hosts).
--include-directories and/or --exclude-directories might also help.
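
As a rough sketch of the --domains idea, assuming the media files live on a
single extra host: the host name media.example.com below is a placeholder,
not something taken from the site, and the directory filters may still need
adjusting for whatever paths that host uses. The script only prints the
command unless RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical media host; replace with the host the <img> tags actually use.
MEDIA_HOST="media.example.com"
START_URL="http://www.pokerstrategy.com/strategy/live-poker"

# Build the argument list in the positional parameters so quoting stays intact.
set -- \
  --recursive --page-requisites --convert-links --adjust-extension \
  --span-hosts \
  --domains="www.pokerstrategy.com,$MEDIA_HOST" \
  --include-directories=/strategy/live-poker,/download \
  "$START_URL"

CMD_LINE="wget $*"

# Dry run by default: print the command. Set RUN=1 to actually execute it.
if [ "${RUN:-0}" = "1" ]; then
  wget "$@"
else
  echo "$CMD_LINE"
fi
```

Listing the hosts explicitly is meant to keep --span-hosts from wandering
onto every linked site; whether it also keeps out the other language versions
depends on how the site names them.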

I am not sure you can achieve your goal with a single call to Wget.
Missing files or directories could be downloaded with separate calls to Wget;
--input-file combined with --force-html and/or --base might help there.
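
A possible second pass along those lines, as a sketch only: the local file
path and the output directory "media" are assumptions about how a first run
saved things, not known values. Wget would try every link it finds in that
file, not just media, so expect to filter or clean up afterwards. As written
the script only prints the command; set RUN=1 to run it.

```shell
#!/bin/sh
# Hypothetical path of an HTML page saved by an earlier wget run.
HTML_FILE="www.pokerstrategy.com/strategy/live-poker/index.html"
# Base URL used to resolve relative links found in that file.
BASE_URL="http://www.pokerstrategy.com/strategy/live-poker/"

# --force-html makes wget parse the input file as HTML instead of a URL list.
set -- \
  --force-html \
  --input-file="$HTML_FILE" \
  --base="$BASE_URL" \
  --no-clobber \
  --directory-prefix=media

CMD_LINE="wget $*"

# Dry run by default: print the command. Set RUN=1 to actually execute it.
if [ "${RUN:-0}" = "1" ]; then
  wget "$@"
else
  echo "$CMD_LINE"
fi
```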

Regards, Tim



