Re: [Bug-wget] Bug in <meta name="robots" content="nofollow" />

From: Micah Cowan
Subject: Re: [Bug-wget] Bug in <meta name="robots" content="nofollow" />
Date: Thu, 04 Mar 2010 14:52:55 -0800
User-agent: Thunderbird (X11/20090817)

Augustin, Stefan wrote:
> Hello,
> I want to crawl a web site which uses
>  <meta name="robots" content="nofollow" />
> in the HTML HEAD,
> which should be XHTML instead of plain HTML.
> But wget seems to ignore this control information.
> Unfortunately, I can't change the code in the HTML pages of this web server.

If I understand you correctly, I think you meant that "wget seems to
obey this control information"; otherwise, what would be preventing you
from crawling the web site?
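
To make concrete what "obeying" that tag means for a crawler, here is a
minimal sketch in Python. It is not wget's implementation: the class
RobotsMetaParser and the function may_follow_links are hypothetical
names invented for this illustration, and it treats "none" as implying
"nofollow", as the robots meta convention does.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags.

    Hypothetical helper for illustration only; not part of wget.
    """

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser also routes XHTML-style <meta ... /> tags here,
        # via its default handle_startendtag().
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        for item in attr.get("content", "").split(","):
            item = item.strip().lower()
            if item:
                self.directives.add(item)


def may_follow_links(html):
    """Return False if the page asks robots not to follow its links."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    # "none" is shorthand for "noindex, nofollow".
    return not ({"nofollow", "none"} & parser.directives)


page = '<html><head><meta name="robots" content="nofollow" /></head></html>'
print(may_follow_links(page))
```

A crawler that honors the tag would stop extracting links from any page
for which may_follow_links() returns False. In wget itself this
behavior is governed by the "robots" setting, which can be turned off
with "-e robots=off".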

Have a look at
http://wget.addictivecode.org/FrequentlyAskedQuestions#robots for the

Micah J. Cowan
