
[Bug-wget] Difficulty downloading a site from archive.org


From: phil curb
Subject: [Bug-wget] Difficulty downloading a site from archive.org
Date: Fri, 12 Aug 2011 19:28:38 +0100 (BST)

I've been looking at downloading a site that's archived on archive.org.

I don't have the site in front of me now, but here are two example pages
showing the kind of structure I'm working with. Notice that archive.org
spreads the website across various directories.

http://web.archive.org/web/20090429823419/http://users.dickens.com/~goodrevs/help/INDEX.HTM

http://web.archive.org/web/20090421420227/http://users.dickens.com/~goodrevs/home.html

Of course I don't want to download the whole of the internet, so I wouldn't
want to mirror the whole archive.org domain!

All the URLs I want have the string http://users.dickens.com/~goodrevs/ in
them.

But notice that they're not all within the same directory higher up: one page
is in 20090429823419, another is in 20090421420227.

But they are all under http://users.dickens.com/~goodrevs/ within archive.org.
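
To make the pattern concrete, here is a rough sketch in Python (not a wget
command) of the test I'd want applied to each link. The name is_wanted is
made up for illustration, and the assumption that every snapshot URL looks
like /web/<digits>/<original URL> is only inferred from the two examples
above; real archive.org URLs may vary.

```python
import re

# The part of the original site that should be kept.
WANTED_PREFIX = "http://users.dickens.com/~goodrevs/"

# Assumption: a snapshot URL is /web/<timestamp digits>/<original URL>.
# The timestamp directory differs per page, so it is matched loosely.
SNAPSHOT_RE = re.compile(
    r"^http://web\.archive\.org/web/\d+/" + re.escape(WANTED_PREFIX)
)


def is_wanted(url):
    """Return True if url is an archived page under WANTED_PREFIX."""
    return SNAPSHOT_RE.match(url) is not None


if __name__ == "__main__":
    examples = [
        "http://web.archive.org/web/20090429823419/"
        "http://users.dickens.com/~goodrevs/help/INDEX.HTM",
        "http://web.archive.org/web/20090421420227/"
        "http://users.dickens.com/~goodrevs/home.html",
        "http://web.archive.org/web/20090421420227/http://example.com/",
    ]
    for url in examples:
        print(is_wanted(url), url)
```

The point is that the filter ignores which timestamp directory a page sits
in and only checks what follows it.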

How should I go about this? What are my options?


