Re: LYNX-DEV Since Lynx won't, what will?


From: Foteos Macrides
Subject: Re: LYNX-DEV Since Lynx won't, what will?
Date: Sat, 09 Aug 1997 11:34:29 -0500 (EST)

David Woolley <address@hidden> wrote:
>>      My hope is that a number of webmasters and sysadmins will use
>>      Lynx as their preferred "web automation" tool and will
>>      consequently put further pressure on their designers to have
>
>Please don't even think of marketing anything like this to this market 
>unless it fully supports robots.txt (even if there are non-default
>options to defeat it).  Your proposed market is not going to bother
>with enforcing such restrictions manually.

        At last!  Thank you David.  The robots.txt issue has been
stressed in all previous discussions of the traversal feature, and
the lynx-dev regulars should stress it again whenever new folks raise
the subject.

        That feature has two purposes:

        (1) With -crawl it allows you to "traverse" and make temporary
hardcopies of dynamic documents returned by one of your *own* site's
scripts and/or SSI procedures.  With a followup script you can then
index the resultant lnk###########.dat files and use their THE_URL and
THE_TITLE entries to create a search database for the dynamic documents
(i.e., those that do not exist until a CGI script creates them or an
SSI procedure creates and/or combines the bits and pieces).

        (2) To help track down bad links in your *own* site's
HTTP-served documents.
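
        As a rough illustration of the "followup script" in (1), the
sketch below collects THE_URL and THE_TITLE entries from the crawl
output into a simple URL-to-title table.  It assumes each
lnk###########.dat file starts with one `THE_URL:` line and one
`THE_TITLE:` line; that layout, and the function names, are assumptions
made for this example and should be checked against the files your
Lynx build actually writes.

```python
import glob
import os
from itertools import islice


def read_crawl_header(path, max_lines=5):
    """Return (url, title) read from the first few lines of one crawl file.

    Assumes header lines of the form 'THE_URL:<url>' and 'THE_TITLE:<title>'.
    """
    url = title = None
    with open(path, encoding="latin-1", errors="replace") as fh:
        for line in islice(fh, max_lines):
            line = line.rstrip("\r\n")
            if line.startswith("THE_URL:"):
                url = line[len("THE_URL:"):].strip()
            elif line.startswith("THE_TITLE:"):
                title = line[len("THE_TITLE:"):].strip()
    return url, title


def build_index(directory):
    """Map each crawled URL to its title for every lnk*.dat file found."""
    index = {}
    for path in sorted(glob.glob(os.path.join(directory, "lnk*.dat"))):
        url, title = read_crawl_header(path)
        if url:  # skip files without a recognizable header
            index[url] = title or ""
    return index


if __name__ == "__main__":
    for url, title in build_index(".").items():
        print(f"{title}\t{url}")
```

        A real search database would also index the body text of each
file; the table above is only the starting point.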


        In both cases, you are likely to be traversing realms which
are blocked for general web crawlers via your robots.txt file, and
that is why Lynx uses its own TRAVERSE_REJECT_FILE (reject.dat by
default) rather than your robots.txt file.


        You should never, ever, use Lynx to traverse a site other
than your own unless you have fetched its robots.txt file and
created an equivalent TRAVERSE_REJECT_FILE.
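
        One possible way to build such an equivalent file is sketched
below: it turns the Disallow rules of an already-fetched robots.txt
into reject entries.  The sketch assumes the reject file holds one URL
per line and that a trailing `*` makes an entry match every URL with
that prefix; verify this against your Lynx version before relying on
it.  The function name and the "lynx" agent token are hypothetical.
To stay on the safe side, it takes the union of the rules addressed to
`*` and to the named agent.

```python
from urllib.parse import urljoin


def robots_to_reject(robots_text, base_url, agent="lynx"):
    """Convert Disallow rules that apply to `agent` or '*' into reject entries.

    Each entry is an absolute URL prefix followed by '*' (assumed format).
    """
    entries = []
    applies = False         # does the current record apply to us?
    prev_was_agent = False  # consecutive User-agent lines share one record
    for raw in robots_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not prev_was_agent:
                applies = False  # a new record starts here
            if value == "*" or agent in value.lower():
                applies = True
            prev_was_agent = True
            continue
        prev_was_agent = False
        # An empty Disallow value means "nothing is disallowed".
        if field == "disallow" and applies and value:
            entries.append(urljoin(base_url, value) + "*")
    return entries


if __name__ == "__main__":
    sample = "User-agent: *\nDisallow: /cgi-bin/\nDisallow: /tmp/\n"
    # Print entries suitable for saving as the reject file.
    print("\n".join(robots_to_reject(sample, "http://www.example.com/")))
```

        The printed lines can be saved as reject.dat (or whatever
TRAVERSE_REJECT_FILE names) before starting the traversal.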

                                Fote

=========================================================================
 Foteos Macrides            Worcester Foundation for Biomedical Research
 address@hidden         222 Maple Avenue, Shrewsbury, MA 01545
=========================================================================
