Re: [Bug-wget] Does wget check if specified user agent is allowed in robots.txt?
Sat, 21 Jun 2014 21:59:43 +0200
Thanks for your kind reply.
My line of thinking was that a robot client should obey a site's robots.txt
whether it makes a single request or many. Even if it shouldn't, as you
say, it would be great to have a separate flag for it instead of having it
buried in the recursive option.
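
As a rough sketch of what such a per-request check could look like (this is
not wget code, only an illustration using Python's standard
urllib.robotparser module; the robots.txt content, the agent names, and the
example.com URLs are all made up):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs offline.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Check a single URL for two different (made-up) user agents.
    for agent in ("BadBot", "Wget"):
        print(agent, allowed(agent, "http://example.com/index.html"))
```

A client could run a check like this once before a single download, which is
roughly what a dedicated flag would switch on.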
On Sat, Jun 21, 2014 at 9:44 PM, Darshit Shah <address@hidden> wrote:
> On Sun, Jun 22, 2014 at 1:10 AM, György Chityil
> <address@hidden> wrote:
> > Thank you so much! This is a perfect reply. Couldn't have asked for more.
> > While not a bug, an additional idea came to my mind while reading your
> > reply. If this robots checking feature is fixed, it would be great to
> > be able to enable robots checking for simple, one-off requests as well.
> That does not make sense. The robots.txt file is expected to be
> checked and adhered to by robots / spiders trying to crawl the
> website, not by clients trying to access a single / few pages. Wget
> should NOT check the robots file for single downloads. It should be
> following the rules in robots.txt only when acting like a spider, i.e.
> when in recursive mode.
> Thanking You,
> Darshit Shah