Re: [Sks-devel] robots.txt, grub-client
From: Yaron Minsky
Subject: Re: [Sks-devel] robots.txt, grub-client
Date: Thu, 23 Dec 2004 22:14:16 -0500
Jason, do you have any suggestions as to how SKS could be extended to
block inappropriate requests?
y
On Sat, 18 Dec 2004 15:52:32 -0500, Jason Harris <address@hidden> wrote:
>
> Is anyone (else) serving robots.txt from pks and SKS and watching the
> User-Agent: headers on incoming requests? I've noticed a lot (30 and
> counting, since yesterday afternoon) of requests from grub-client-2.3
> to my pks server, which is wrong because I've been serving robots.txt
> containing:
>
> User-agent: *
> Disallow: /
>
> for quite some time now. grub[.org] seems to be the newest search engine
> that doesn't respect robots.txt, but it is also hard to block because it
> is a distributed system. Still, 64.241.242.18 (sv-fw.looksmart.com) is the
> main offender and can be blocked by IP.
>
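
To make the complaint concrete, here is a minimal sketch of what the quoted robots.txt means for a well-behaved crawler. It uses Python's standard `urllib.robotparser`; the helper name `may_fetch` is hypothetical, and the URL is just the root of the host named in the message.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt body quoted in the message above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def may_fetch(user_agent: str, url: str) -> bool:
    """Return True if robots.txt allows `user_agent` to fetch `url`.

    `may_fetch` is a hypothetical helper, not part of pks or SKS.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A compliant crawler would run a check like this and stay away.
    print(may_fetch("grub-client-2.3", "http://skylane.kjsl.com:11371/"))
```

Because the only record is `User-agent: *` with `Disallow: /`, the check fails for every agent and every path, so any request from a crawler at all indicates that robots.txt is being ignored.
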
> Of course, M$ in 65.52.0.0/14 and 207.68.128.0 - 207.68.207.255 and
> Yahoo/Inktomi in 66.196.64.0/18 are also blocked by IP due to over-
> zealous web crawlers and/or not respecting robots.txt.
>
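
One possible shape for a request filter, sketched in Python rather than in SKS's own OCaml and not taken from any existing pks or SKS code. It assumes the server can see the client address and the User-Agent header before answering; `BLOCKED_NETWORKS`, `BLOCKED_AGENT_PREFIXES`, and `is_blocked` are hypothetical names. The addresses are the ones listed in the message above.

```python
import ipaddress

# Single host and CIDR blocks named in the message.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("64.241.242.18/32"),  # sv-fw.looksmart.com
    ipaddress.ip_network("65.52.0.0/14"),
    ipaddress.ip_network("66.196.64.0/18"),
]

# 207.68.128.0 - 207.68.207.255 is not a single CIDR block, so let the
# standard library split it into the smallest covering set of networks.
BLOCKED_NETWORKS += ipaddress.summarize_address_range(
    ipaddress.ip_address("207.68.128.0"),
    ipaddress.ip_address("207.68.207.255"),
)

# Assumption: matching on a User-Agent prefix is enough to catch the
# distributed grub clients that cannot be blocked by address.
BLOCKED_AGENT_PREFIXES = ("grub-client",)


def is_blocked(client_ip: str, user_agent: str) -> bool:
    """Return True if the request should be refused."""
    addr = ipaddress.ip_address(client_ip)
    if any(addr in net for net in BLOCKED_NETWORKS):
        return True
    return user_agent.lower().startswith(BLOCKED_AGENT_PREFIXES)


if __name__ == "__main__":
    for net in BLOCKED_NETWORKS:
        print(net)
```

The User-Agent check is trivially evaded by a crawler that lies about its identity, so it complements the address list rather than replacing it.
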
> Most of the grub requests have been for "Host: skylane.kjsl.com:11371"
> as well. The few for "Host: wwwkeys.pgp.net:11371" are understandable
> because it is a DNS RR, of course, but I imagine the remaining servers
> in wwwkeys.pgp.net (and other DNS RRs) that don't block these crawlers
> will see their bot-induced load eventually rise to unacceptable levels.
>
> --
> Jason Harris | NIC: JH329, PGP: This _is_ PGP-signed, isn't it?
> address@hidden _|_ web: http://keyserver.kjsl.com/~jharris/
> Got photons? (TM), (C) 2004