
Re: [Bug-wget] robots.txt not working

From: Micah Cowan
Subject: Re: [Bug-wget] robots.txt not working
Date: Fri, 16 Mar 2012 23:38:37 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:11.0) Gecko/20120302 Thunderbird/11.0

I think you're misunderstanding what was supposed to happen.

The robots.txt file is only honored for links that wget follows
automatically. This means (a) wget has to be in recursive-descent mode
(-r or -m), and (b) it only applies to links that weren't explicitly
requested by the user. In other words, it applies only to links that
wget discovers and follows on its own, where it is actually acting as a
robot.
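To illustrate the rule (a minimal sketch using Python's stdlib
urllib.robotparser, not wget's actual code): a "Disallow: /" rule like
the one in your transcript blocks every link a crawler discovers, but a
URL the user requested explicitly is fetched without consulting
robots.txt at all.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the report: disallow everything for all agents.
rp = RobotFileParser()
rp.parse("User-agent: *\nDisallow: /".splitlines())

# A link found during recursive descent is checked against robots.txt,
# and here the check denies it:
print(rp.can_fetch("Wget", "http://localhost/page.html"))  # False

# A starting URL the user typed on the command line is simply fetched;
# no robots.txt check happens for it (which is why your plain
# "wget <url>" still downloaded index.html).
```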

Hope that helps.


On 03/16/2012 01:04 PM, phil curb wrote:
> I just tried creating a web server locally, putting robots.txt in
> there and using wget, and it didn't work.
> http://pastebin.com/raw.php?i=kt1mV2af
> C:\r>wget
> ....
> 2012-03-16 19:45:32 (20.0 KB/s) - `index.html' saved [3/3]
> C:\r>wget
> ....
> 2012-03-16 19:45:43 (175 KB/s) - `robots.txt' saved [26/26]
> C:\r>type robots.txt
> User-agent: *
> Disallow: /
> C:\r>
