#852 - stop google from indexing old docs
From: Mark Polesky
Subject: #852 - stop google from indexing old docs
Date: Wed, 16 Dec 2009 00:48:03 -0800 (PST)
This should be trivial to fix. I would do it but I can't
figure out from the sources how `robots.txt' is generated.
IIUC (correct me if I'm wrong), only the following rules
need to be followed:
1) The only valid locations for blank lines are *above* a
"User-agent" line and below the last "Disallow" line in a
single "User-agent" record. Remove all other blank
lines.
2) Individually disallow *all* directories that are
immediately below the /doc/ directory EXCEPT the one for
the current stable release. Ideally this would be
automated by a script.
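Rule 1 could be enforced mechanically. The following is only a minimal sketch, assuming the file contains nothing but "User-agent" and "Disallow" style lines; the function name `normalize_blank_lines` is made up for illustration and is not part of the existing sources.

```python
def normalize_blank_lines(text):
    """Keep blank lines only directly above the start of a User-agent record."""
    out = []
    for line in text.splitlines():
        if not line.strip():
            continue  # drop every blank line; the allowed ones are re-added below
        starts_record = line.lower().startswith("user-agent")
        prev_is_agent = bool(out) and out[-1].lower().startswith("user-agent")
        # Consecutive User-agent lines belong to one record, so no blank between them.
        if starts_record and out and not prev_is_agent:
            out.append("")  # one blank line above each new record
        out.append(line)
    return "\n".join(out) + "\n"
```

Blank lines inside a record are removed, and a single blank line is left between one record's last "Disallow" line and the next "User-agent" line.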
Since the current stable version is 2.12 (and there
doesn't seem to be a 1.7 version), the file would look like
this:
User-agent: *
Disallow: /doc/v1.6/
Disallow: /doc/v1.8/
Disallow: /doc/v1.9/
Disallow: /doc/v2.0/
Disallow: /doc/v2.1/
Disallow: /doc/v2.2/
Disallow: /doc/v2.3/
Disallow: /doc/v2.4/
Disallow: /doc/v2.5/
Disallow: /doc/v2.6/
Disallow: /doc/v2.7/
Disallow: /doc/v2.8/
Disallow: /doc/v2.9/
Disallow: /doc/v2.10/
Disallow: /doc/v2.11/
Disallow: /doc/v2.13/
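A listing like the one above could be generated rather than maintained by hand. This is a rough sketch, not how `robots.txt' is currently produced (I don't know that); it assumes every directory under /doc/ is named like "v2.12", and the function name `build_robots_txt` is hypothetical.

```python
def build_robots_txt(doc_dirs, stable):
    """Return robots.txt text that disallows every /doc/ subdirectory except `stable`."""
    def version_key(name):
        # Assumes names like "v2.10"; gives (2, 10) so v2.10 sorts after v2.9.
        return tuple(int(part) for part in name.lstrip("v").split("."))

    lines = ["User-agent: *"]
    for name in sorted(doc_dirs, key=version_key):
        if name != stable:
            lines.append(f"Disallow: /doc/{name}/")
    # No blank lines inside the record, per rule 1.
    return "\n".join(lines) + "\n"
```

Calling it with the directory names found under /doc/ and "v2.12" as the stable name would leave only the v2.12 docs crawlable. A directory name that does not match the assumed pattern would raise a ValueError, which is arguably what you want here.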
*********************
There is an alternative, which may be easier to maintain
(and thus safer). Maybe there are reasons that this would
be a bad idea (I don't know), but we could move the current
stable docs into a new subdirectory of /doc/ (like
/doc/current/) and move everything else to another
subdirectory (like /doc/other/). Then the robots.txt file
would only need to be:
User-agent: *
Disallow: /doc/other/
Perhaps a script could be written to make sure only the
current docs live in /doc/current/ and everything else goes
to /doc/other/.
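Such a script might look roughly like this. It is only a sketch under assumptions: that each version keeps its own directory name inside the new location (so the stable docs end up at /doc/current/v2.12/), and that the doc tree is reachable as a local path. The names `plan_moves` and `apply_moves` are invented for illustration.

```python
import shutil
from pathlib import Path

def plan_moves(names, stable):
    """Map each version directory name to its new path relative to /doc/."""
    plan = {}
    for name in names:
        if name in ("current", "other"):
            continue  # target directories that already exist are left alone
        target = "current" if name == stable else "other"
        plan[name] = f"{target}/{name}"
    return plan

def apply_moves(doc_root, stable):
    """Carry out the plan on disk; doc_root is a hypothetical local path to /doc/."""
    root = Path(doc_root)
    names = [p.name for p in root.iterdir() if p.is_dir()]
    for name, dest in plan_moves(names, stable).items():
        (root / dest).parent.mkdir(exist_ok=True)  # create current/ or other/ if missing
        shutil.move(str(root / name), str(root / dest))
```

Keeping the planning step separate from the moving step makes it possible to print and review the plan before anything on disk is touched.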
- Mark