[Bug-wget] download page-requisites with spanning hosts


From: Jake b
Subject: [Bug-wget] download page-requisites with spanning hosts
Date: Wed, 29 Apr 2009 18:50:11 -0500

I'm trying to download multiple pages from the sijun speedpaint thread
so I can use their images for my random desktop folder. I can download
each page by hand using Firefox, but this becomes unwieldy, especially
since the prev button has a bit of a delay. (So I want to automate it,
with delays and/or speed caps to be friendly to the server.)

The wget command I am using:
wget.exe -p -k -w 15
"http://forums.sijun.com/viewtopic.php?t=29807&postdays=0&postorder=asc&start=27330"

It has 2 problems:

1) Rename file:

Instead of creating something like "912.html" or "index.html", the
saved file is named: "address@hidden&postdays=0&postorder=asc&start=27330"

2) Images that span hosts are failing.

I have page-requisites on, but since some images are hosted on tinypic,
imageshack, etc., it is not downloading them. The page and its images
are laid out like this:

sijun/page912.php
                imageshack.com/1.png
                tinypic.com/2.png
                randomguyshost.com/3.png


Because of this, I cannot simply list all the domains to span. I don't
know all of them, since people use personal servers.

How do I make wget download all images on the page? I don't want to
recurse other hosts, or even sijun, just download this page, and all
images needed to display it.
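
A possible approach, offered as a sketch rather than a confirmed fix:
wget's manual pairs -p with -H (--span-hosts) so that a single page's
requisites are fetched even when they live on other hosts, and without
-r nothing is followed recursively. -E (--adjust-extension, spelled
--html-extension in older releases) appends .html to the saved page,
which may also help with problem 1. The Python helper below only
assembles and prints such a command. The function name
build_wget_command and the 100k rate limit are illustrative
assumptions, not part of the original command.

```python
import shlex

# URL taken from the command quoted above.
PAGE_URL = (
    "http://forums.sijun.com/viewtopic.php"
    "?t=29807&postdays=0&postorder=asc&start=27330"
)


def build_wget_command(url, wait_seconds=15, rate_limit="100k"):
    """Return a wget argument list for one page plus its requisites.

    The rate limit value is an assumption chosen for illustration.
    """
    return [
        "wget",
        "-p",                          # --page-requisites: images, CSS, etc.
        "-H",                          # --span-hosts: allow requisites on other hosts
        "-k",                          # --convert-links: point links at local copies
        "-E",                          # --adjust-extension: save HTML with a .html suffix
        "-w", str(wait_seconds),       # --wait: pause between retrievals
        f"--limit-rate={rate_limit}",  # cap bandwidth to be gentle on the server
        url,
    ]


# Print the command with POSIX-style quoting instead of running it.
print(shlex.join(build_wget_command(PAGE_URL)))
```

The returned list can be handed to subprocess.run as-is, which avoids
shell quoting problems with the "&" characters in the URL.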




[ This one is a lower priority, but someone might already know how to
solve this ]
3) After this is done, I want to loop to download multiple pages. It
would be cool if I downloaded pages 900 to 912 and each page's next
link correctly pointed to the local version.

I'm not sure if I can use wget's -k option, or if that won't work
because recursion on forums can be weird.
Either way, I have a simple script that converts 900 to 912 into
the correct URLs and pauses between requests.
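
A script of that kind might look like the sketch below. It assumes 30
posts per page, which is only inferred from page 912 matching
start=27330 in the URL above. The names page_url, page_urls and
fetch_all are hypothetical, and the actual download step is passed in
as a callable so the loop itself makes no network requests.

```python
import time

BASE_URL = "http://forums.sijun.com/viewtopic.php"
TOPIC_ID = 29807
POSTS_PER_PAGE = 30  # assumption: inferred from page 912 <-> start=27330


def page_url(page):
    """Build the forum URL for a 1-based page number."""
    start = (page - 1) * POSTS_PER_PAGE
    return f"{BASE_URL}?t={TOPIC_ID}&postdays=0&postorder=asc&start={start}"


def page_urls(first, last):
    """Return the URLs for pages first..last, inclusive."""
    return [page_url(page) for page in range(first, last + 1)]


def fetch_all(urls, fetch, delay_seconds=15, sleep=time.sleep):
    """Call fetch(url) for each URL, sleeping between consecutive requests."""
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # pause before every request except the first
        fetch(url)


# Show the URLs for pages 900 to 912 without downloading anything.
for url in page_urls(900, 912):
    print(url)
```

In real use, fetch would be a function that runs wget on the given URL.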

Maybe I will have to manually modify the links using regexes, unless
you know a shortcut?
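
If the regex route turns out to be necessary, a minimal sketch could
rewrite thread links to local file names after the download. As far as
I understand, -k only converts links to files fetched in the same wget
run, so pages downloaded one at a time would probably keep pointing at
the forum. The local naming scheme "pageN.html", the 30 posts per page
and the function names are all assumptions made for illustration.

```python
import re

POSTS_PER_PAGE = 30  # assumption: inferred from page 912 <-> start=27330

# Any double-quoted href that points at viewtopic.php with a query string.
LINK_RE = re.compile(r'href="([^"]*viewtopic\.php\?[^"]*)"')
# "start=N" preceded by "?", "&" or the ";" that ends "&amp;".
START_RE = re.compile(r"[?&;]start=(\d+)")


def local_name(page):
    """Hypothetical local file name for a page number."""
    return f"page{page}.html"


def rewrite_links(html, topic_id=29807):
    """Point links to pages of the given topic at local files."""
    topic_re = re.compile(rf"[?&;]t={topic_id}(?!\d)")

    def replace(match):
        target = match.group(1)
        start = START_RE.search(target)
        if not topic_re.search(target) or start is None:
            return match.group(0)  # leave unrelated links untouched
        page = int(start.group(1)) // POSTS_PER_PAGE + 1
        return f'href="{local_name(page)}"'

    return LINK_RE.sub(replace, html)


sample = (
    '<a href="viewtopic.php?t=29807&amp;postdays=0'
    '&amp;postorder=asc&amp;start=27300">Previous</a>'
)
print(rewrite_links(sample))
```

Links to other topics, or links without a start parameter, are left
unchanged.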



thanks!
--
Jake



