lynx-dev



From: mattack
Subject: lynx-dev Now I got the error from imdb trying to get StudioBrief from cron job.. here's the message
Date: Tue, 20 Jul 1999 16:06:29 -0700 (PDT)

Here's my crontab entry, which has been working fine until now:
0 16 * * 1-5 /usr/local/bin/lynx -useragent='Mozilla/4.04 [en]' -nolist -dump http://www.imdb.com/StudioBrief | mail address@hidden



Note that at the bottom of the forwarded message it reports the User-Agent
as blank, even though I set it with -useragent in the crontab entry above.

And nothing's changed on my end since yesterday..

/home/mattack % uname -a
SunOS vax 5.6 sun4u sparc

/home/mattack % /usr/local/bin/lynx -version


Lynx Version 2.8.2rel.1 (01 Jun 1999)
Built on solaris2.6 Jul 10 1999 04:03:41


Got any ideas?
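
One way to cross-check whether the blank User-Agent comes from lynx or from
somewhere else would be to send the same request from a second client with
the header set explicitly. Below is a minimal sketch in Python, assuming the
same URL and User-Agent string as the crontab entry; the names build_request
and fetch are hypothetical helpers made up for this illustration, not part of
lynx.

```python
import urllib.request

# Values copied from the crontab entry above.
USER_AGENT = "Mozilla/4.04 [en]"
URL = "http://www.imdb.com/StudioBrief"


def build_request(url: str = URL, user_agent: str = USER_AGENT) -> urllib.request.Request:
    """Build a GET request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url: str = URL, user_agent: str = USER_AGENT, timeout: float = 30.0) -> str:
    """Send the request and return the response body as text.

    urlopen raises urllib.error.HTTPError on a 403, so a refusal by the
    server shows up as an exception rather than as page text.
    """
    with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
        return resp.read().decode("latin-1")


if __name__ == "__main__":
    # Inspect the header locally without touching the network.
    # urllib stores header names capitalized as "User-agent".
    print(build_request().get_header("User-agent"))
```

If fetch() succeeds from the same machine while the lynx cron job still gets
the 403 page, that would point at how the header is being passed to lynx
rather than at an IP-based block.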


---------- Forwarded message ----------
Date: 20 Jul 1999 23:00:07 -0000
From: address@hidden
Cc: recipient list not shown:  ;

                               Access denied
                                 http: 403
                                      
  Sorry, this imdb.com (Internet Movie Database) URL is not accessible from
  your address or browser.
  
   The three main reasons for seeing this message are:
    1. you've typed or followed some broken URL.
       
     Returning to the [1]home page should get you started.
    2. your address or browser or proxy server has been banned for
       [2]misuse of our service, e.g. overloading our servers with
       automated requests.
       
     pfizer.com addresses are blocked because of a dumb robot hit us for
     1.3 million requests in under 11 hours and brought down one of our
     servers.
     All requests from NetSonic, NetJet, NetCarta, Autonomy, WebWhacker,
     FlashSite, Java102, Teleport-Pro, MemoWeb, Microsoft's
     MS-Catapult/0.9, Netscape's Catalog-Robot, Microsoft's "Site
     Analyst" are rejected because of persistent attempts to download
     huge numbers of URLs as fast as the networks permit.
     Personal web crawlers (e.g. the built-in crawler modes from MSIE
     4.0 and Netscape 4.0) are not welcome on this site. MSIE 4.0 is
     more responsible in that it sometimes identifies itself differently
     when in crawler mode, so we can block crawling with it, Netscape
     4.0 doesn't identify itself as being in crawler mode so we can't
     detect it - instead we have to use more extreme measures to block
     abusers of our service - IP addresses of these abusers can be
     blocked completely once they misbehave.
     So called 'web-accelerators' can be a nuisance and we don't want
     them here. We can't block these programs but what we can do is send
     this page instead of whatever the software asked for every time.
     Please disable your URL prefetching 'web accelerators' to avoid
     loss of service.
     Filtering IMDb isn't allowed, so WebWasher filters etc aren't
     allowed. If everyone used a banner ad filter the vast majority of
     sites would run out of money to provide you with 'free' services...
     it's either banner ads or subscription fees.
    3. your browser or proxy server isn't identifying itself and is
       blocked because we don't trust it to behave.
       
     If your browser/proxy/agent does not send a HTTP User-Agent header,
     this server will reject the request. Most anonymous software turns
     out to be automated junk that hits us too fast and too hard. This
     deprives our online users of scarce resources. Such denials of
     service (intentional or not) will result in loss of access to our
     site.
     _________________________________________________________________
   
  Robot/crawler guidelines:
  
   Please see
   [3]http://info.webcrawler.com/mak/projects/robots/robots.html for
   background information and hints on writing well behaved
   browsers/robots/crawlers.
     _________________________________________________________________
   
  Reporting problems:
  
   If you cannot work out why you've seen this page, please report
   problems to 'address@hidden', include the items listed below in any
   email.
   Browser/Proxy identification (User-Agent) = '' ([4]blank - CLICK HERE)
   IP address=165.90.20.127
   URL=/StudioBrief
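
The rule described under reason 3 of the page above (reject any request whose
User-Agent header is missing) can be sketched as a small check. This is only
an illustration of that rule, not the site's actual code: the function name
user_agent_status is hypothetical, and treating a blank value the same as a
missing header is an assumption based on the "blank" report at the end of the
page.

```python
from typing import Mapping


def user_agent_status(headers: Mapping[str, str]) -> int:
    """Return 200 if a non-blank User-Agent header is present, else 403.

    Header names are compared case-insensitively, as HTTP requires.
    A header that is present but empty or whitespace-only is treated
    as missing (an assumption, see the note above).
    """
    for name, value in headers.items():
        if name.lower() == "user-agent" and value.strip():
            return 200
    return 403


if __name__ == "__main__":
    print(user_agent_status({"User-Agent": "Mozilla/4.04 [en]"}))
    print(user_agent_status({"User-Agent": ""}))
```

Under this sketch, a request that arrives with an empty User-Agent value is
refused exactly like one that sends no header at all, which would match the
'' shown in the report lines.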


