[Bug-wget] Question - saved links with --content-disposition

From:	Harling, Thomas
Subject:	[Bug-wget] Question - saved links with --content-disposition
Date:	Sun, 6 Jul 2014 17:37:14 +0000

Hi

I'm trying to download part of a site that uses CGI scripts to serve pages but 
also has downloads such as PDFs, and I want to get both with wget.

For example, a page that links to PDFs is mysite.com/blah.cgi?key=1234, and 
a PDF is served from mysite.com/blah.cgi?key=5678, but the actual PDF file is 
called awesome.pdf.

If I don't use --content-disposition, the main page mysite.com/blah.cgi?key=1234 
that wget downloads has a relative link /blah.cgi?key=5678 to the PDF, which 
wget also downloads, so when I'm browsing through the files on my computer I 
can click the link and the PDF opens in my browser.

However, the pdf is named blah.cgi?key=5678 which really isn't that 
descriptive, especially as the site has a few hundred pdfs which are otherwise 
very usefully named. Using --content-disposition works in that the pdf is now 
saved as awesome.pdf, but the hyperlink in the original blah.cgi?key=1234 page 
downloaded still points to /blah.cgi?key=5678 and not /awesome.pdf, so I can't 
browse my downloaded copy of the site on my computer as before.

Is there a way to fix this, so I get both the actual filename and the correct 
link in the downloaded html page? Given the links are re-written by wget in 
saved html pages anyway, it isn't hard to imagine an if-statement which picks 
the correct filename to point to when writing the new links.

Thanks
Tom
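
For illustration, the link fix-up described above could be approximated as a 
post-processing step over the saved pages. This is only a rough sketch, not 
something wget does: the LINK_TO_FILE mapping and the rewrite_links helper are 
hypothetical, and the mapping from each CGI link to its saved filename would 
have to be built by hand or by some other means.

```python
import re

# Hypothetical mapping from the original CGI link to the filename that
# --content-disposition saved; wget does not produce this mapping itself.
LINK_TO_FILE = {
    "/blah.cgi?key=5678": "awesome.pdf",
}

# Matches href="..." or href='...' and captures the prefix, quote and target.
HREF_RE = re.compile(r'''(href\s*=\s*)(["'])(.*?)\2''', re.IGNORECASE)


def rewrite_links(html, link_to_file):
    """Replace every href found in link_to_file with the saved filename."""
    def swap(match):
        prefix, quote, target = match.groups()
        # Attribute values may escape '&' as '&amp;'; undo that for lookup.
        key = target.replace("&amp;", "&")
        # Links without an entry in the mapping are left unchanged.
        new_target = link_to_file.get(key, target)
        return f"{prefix}{quote}{new_target}{quote}"
    return HREF_RE.sub(swap, html)


page = ('<a href="/blah.cgi?key=5678">Report</a> '
        '<a href="/blah.cgi?key=1234">Index</a>')
print(rewrite_links(page, LINK_TO_FILE))
```

Only links present in the mapping are changed, so pages that still point at 
other CGI URLs keep working as before. A regular expression is used here for 
brevity; a real HTML parser would be more robust against unusual markup.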
