[Bug-wget] Handling query-strings and over-long URI's
From: David A. Cobb
Subject: [Bug-wget] Handling query-strings and over-long URI's
Date: Mon, 19 Aug 2013 11:55:39 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:25.0) Gecko/20100101 Thunderbird/25.0a1
When I try "harvesting" many sites, I see wget trying URIs like:
[1]http://the.target.host/some/subdirectory/index.html?source=navbar,search=search%20term
Others contain complex query strings that make the URI far longer than
the maximum path length.
From Google, I see URIs where '@' is used rather than '?'.
In both cases, the content returned is dynamic and depends on the
query.
IMNSHO, wget should simply leave such links in their original page
source and not try to retrieve them at all. I think that should be the
default case, but it would be OK as an option.
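To make the proposal concrete, here is a minimal sketch of the kind of check it implies, written in Python rather than wget's own C. It is not wget's actual behavior or code: the function name should_skip and the 255-character limit are assumptions chosen for illustration (255 is a common per-component file name limit on Linux filesystems). It only recognizes '?' query strings, not the '@' form mentioned above.

```python
from urllib.parse import urlsplit

# Assumed limit for a single file name component; hypothetical default.
MAX_NAME_LEN = 255


def should_skip(url, max_name_len=MAX_NAME_LEN):
    """Return True if the link should be left as-is and not retrieved."""
    parts = urlsplit(url)
    # A non-empty query string suggests dynamically generated content.
    if parts.query:
        return True
    # Skip when any path component is too long to store as a local file name.
    return any(len(segment) > max_name_len for segment in parts.path.split("/"))


if __name__ == "__main__":
    example = ("http://the.target.host/some/subdirectory/index.html"
               "?source=navbar,search=search%20term")
    print(should_skip(example))
```

A crawler following this idea would call such a check on each extracted link before queuing it, and leave skipped links unchanged in the saved page.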
References
1. http://the.target.host/some/subdirectory/index.html?source=navbar,search=search%20term