bug-wget

[Bug-wget] Recursive download: Page requirements when spanning hosts?


From: bakopper
Subject: [Bug-wget] Recursive download: Page requirements when spanning hosts?
Date: Fri, 23 Oct 2009 14:10:59 +0200 (CEST)
User-agent: SquirrelMail/1.4.15

I want to download a few sites, and have some
questions about the best way to do it...

I'll be doing a recursive download with unlimited depth
(-l inf), restricted to the current directory and below
(-np, no parent).  I'll also download the page
requirements (-p).

wget -r -l inf -np -p http://domain.name/index.html
(I'll also be adding --limit-rate and a short pause
between downloads.)
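
With the rate limit and pause added, the full command might look
like the sketch below.  The values 50k and 2 seconds are
placeholders chosen for illustration, not taken from the post,
and the script only prints the command so it can be reviewed
before it is run.

```shell
#!/bin/sh
# Illustrative values only; neither is given in the original post.
URL="http://domain.name/index.html"   # placeholder URL from the post
RATE="50k"                            # assumed bandwidth cap for --limit-rate
WAIT="2"                              # assumed pause in seconds for --wait

# -r      recursive download
# -l inf  unlimited recursion depth
# -np     never ascend to the parent directory
# -p      also fetch page requisites (images, stylesheets, ...)
CMD="wget -r -l inf -np -p --limit-rate=$RATE --wait=$WAIT $URL"

# Print the command for review instead of running it.
echo "$CMD"
```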

My problem is that I want all page requirements, even
if they span hosts or are located above or parallel to
the site (directory) I'll be working in (it's on
GeoCities, so there are many "parallel" sites).  As long
as a file is part of a page in the directory I'm working in,
it should be downloaded, but not otherwise.  An additional
problem is that there may be lists of links that point to
those directories/hosts; nothing there should be downloaded
unless it's part of a page.

Would this be possible (at least partially; I understand
if getting around the no-parent restriction is a problem)?
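
One possible workaround, sketched here under assumptions and not
tested against GeoCities, is a two-pass run: first crawl the pages
with -np, then fetch each saved page on its own with -p and -H
(span hosts) but without -r, so that requisites on other hosts are
retrieved while ordinary links are not followed.  The run helper,
the DRY_RUN switch, and the mapping from saved file path back to
URL are hypothetical; the mapping assumes wget's default
host-named directory layout and simple URLs with no query strings.

```shell
#!/bin/sh
# Two-pass sketch. DRY_RUN=1 (the default) only prints the commands.
START="http://domain.name/index.html"   # placeholder URL from the post
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Pass 1: crawl pages only, staying at or below the start directory.
run wget -r -l inf -np "$START"

# Pass 2: for each saved HTML page, fetch that single page plus its
# requisites. -H allows requisites on other hosts; without -r,
# ordinary links on the page are not followed.
# Assumption: files were saved under a directory named after the host,
# so "http://" plus the file path gives back the page URL.
find domain.name -type f -name '*.html' 2>/dev/null | while read -r f; do
    run wget -p -H "http://$f"
done
```

Set DRY_RUN=0 to actually run the downloads.  The second pass
requests every page again, so the rate limit and pause are worth
adding to both wget calls.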

-- 
Do not do today, what you can
get others to do tomorrow.




