[Bug-wget] Handling query-strings and over-long URI's


From: David A. Cobb
Subject: [Bug-wget] Handling query-strings and over-long URI's
Date: Mon, 19 Aug 2013 11:55:39 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:25.0) Gecko/20100101 Thunderbird/25.0a1

   When I try "harvesting" many sites, I see wget trying URIs like:
   [1]http://the.target.host/some/subdirectory/index.html?source=navbar,search=search%20term
   Some contain complex query strings, producing a URI that is well over
   the maximum path length.
   From Google, I see URIs where '@' is used rather than '?'.
   In both cases, the content returned is dynamic and depends on the
   query.
   IMNSHO, wget should simply leave such links as they appear in the
   original page source and not try to retrieve them at all.  I think that
   should be the default, but it would be OK as an option.
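
   To make the request concrete, here is a rough sketch of the test I have
   in mind, written in Python rather than wget's C. It is not wget code:
   the name should_skip and the 4096-byte limit are my own assumptions
   (4096 is a common PATH_MAX on Linux, other systems differ).

```python
from urllib.parse import urlsplit

# Assumed limit; 4096 is a common PATH_MAX on Linux, not a wget constant.
MAX_PATH_LEN = 4096


def should_skip(url, max_path_len=MAX_PATH_LEN):
    """Return True if a link should be left alone and not retrieved.

    Hypothetical helper illustrating the proposed rule:
      * the URI carries a query string ('?...'),
      * the path uses '@' where '?' would normally appear, or
      * host + path would exceed the local maximum path length.
    """
    parts = urlsplit(url)
    if parts.query:
        return True
    if "@" in parts.path:
        return True
    # Approximates the local file name a mirror would create.
    return len(parts.netloc + parts.path) > max_path_len
```

   With this rule, the example URI above would be skipped, while
   http://the.target.host/some/subdirectory/index.html would still be
   fetched.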

References

   1. http://the.target.host/some/subdirectory/index.html?source=navbar,search=search%20term

