lynx-dev

Re: [Lynx-dev] Lynx fails on http://politiken.dk


From: David Woolley
Subject: Re: [Lynx-dev] Lynx fails on http://politiken.dk
Date: Sat, 30 Oct 2004 09:57:01 +0100 (BST)

> >>  > User-Agent: Links (2.1pre15; MirBSD 7 i386; 113x20)

This one is broken; the version number should be outside the comment
and separated by a / from the agent name.
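
To make the rule concrete, here is a minimal sketch of a checker, assuming
the RFC 2616 grammar (a User-Agent value is a sequence of products and
comments, where a product is a token optionally followed by "/" and a
version).  The function names are made up for illustration, the comment
pattern ignores nested parentheses, and the "corrected" Links string and
the Lynx string in the usage note are only my examples of the expected
form, not anything either browser is known to send.  Note that the grammar
itself makes the version optional, so the check reflects the convention
described above rather than a hard syntax error.

```python
import re

# RFC 2616 "token": characters that are neither controls nor separators.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
PRODUCT = re.compile(rf"({TOKEN})(?:/({TOKEN}))?")
# Simplification (assumption): comments are matched without nesting.
COMMENT = re.compile(r"\((?:[^()\\]|\\.)*\)")


def parse_user_agent(value):
    """Split a User-Agent value into products and comments.

    Returns a list of ("product", name, version_or_None) and
    ("comment", text) tuples, or None if the value does not fit
    this simplified grammar.
    """
    items = []
    pos = 0
    value = value.strip()
    while pos < len(value):
        if value[pos] in " \t":
            pos += 1
            continue
        m = COMMENT.match(value, pos)
        if m:
            items.append(("comment", m.group(0)[1:-1]))
            pos = m.end()
            continue
        m = PRODUCT.match(value, pos)
        if m:
            items.append(("product", m.group(1), m.group(2)))
            pos = m.end()
            continue
        return None
    return items


def has_versioned_product(value):
    """True if the value starts with a product of the form name/version."""
    items = parse_user_agent(value)
    return bool(items) and items[0][0] == "product" and items[0][2] is not None
```

With this sketch, has_versioned_product() returns False for the quoted
Links string, because the version sits inside the comment, and True for
"Links/2.1pre15 (MirBSD 7 i386; 113x20)" or for a string shaped like
"Lynx/2.8.5rel.1 libwww-FM/2.14".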
 
> >> (2) where did the libwww in his UA-string come *from*?
> 
> Stock Lynx.

In particular, Lynx is probably the only desktop browser that uses
User-Agent properly, although some WAP browsers seem to do so.

The problem of discrimination against libwww came up recently on the
address@hidden mailing list.  My guess is that this is a problem with
some browser-capabilities package that is positively reporting that some
bells-and-whistles feature is missing, and the application-level code
is rejecting the request because of the lack of that feature.

The other likely possibility is that it is being perceived as an automated
download tool.  Whilst Lynx is sometimes used for this, it is often against
explicit rules in the permitted use for commercial sites (ones with
entertainment, or real information, paid for by advertising, typically).
Some such sites will take active measures to block such tools.  They
will probably tolerate some loss of advertising to manual text-only
accesses, but not any attempt to extract large amounts of information
without a human being presented with the adverts.

Another possible factor is security.  If you are accessing a secure site,
they, or their insurance company, may insist that there is someone to
attempt to sue if the security is compromised because of a flaw in the
SSL implementation.  (For Lynx, though, that should trigger on the SSL
string in the User-Agent; older Lynx SSL implementations failed to
authenticate the site.)

Yet another possibility is that some organisations feed keyword-stuffed
pages to search engines, and might have mistaken the unusual User-Agent
string for a search engine crawler.  Search engine operators consider
that practice unethical, of course.





