From: Ángel González
Subject: Re: [Bug-wget] Does wget check if specified user agent is allowed in robots.txt?
Date: Sun, 29 Jun 2014 22:03:28 +0200
User-agent: Thunderbird
On 21/06/14 21:59, György Chityil wrote:
Thanks for your kind reply. My line of thinking was that a robot client should obey the robots.txt of a site regardless of whether it makes a single request or multiple requests. Even if it shouldn't, as you say, it would be great to have a separate flag for it instead of having it buried in the recursive option.
What is your rationale for having wget refuse to download the single URL you requested? I can't think of a case where you "want this url" and at the same time don't want it if it shouldn't be crawled (e.g. a calendar).
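For illustration only (this is not wget's implementation, which only consults robots.txt during recursive retrieval): a client that wanted to honor robots.txt even for a single URL could perform a check like the following sketch using Python's standard urllib.robotparser. The robots.txt content, user-agent string, and URLs here are made up.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that disallows a calendar path for all agents
robots_txt = """\
User-agent: *
Disallow: /calendar/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())  # parse from text instead of fetching

# A robots-aware client would test the target URL before downloading it
print(rp.can_fetch("Wget/1.21", "http://example.com/calendar/2014.html"))  # False
print(rp.can_fetch("Wget/1.21", "http://example.com/index.html"))          # True
```

Under this sketch, a hypothetical "obey robots.txt for single URLs" flag would simply skip the download when `can_fetch` returns False.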