lynx-dev

Re: lynx-dev HTDoRead() HTTCP.c possible bug - retry limit set too high?


From: Vlad Harchev
Subject: Re: lynx-dev HTDoRead() HTTCP.c possible bug - retry limit set too high?
Date: Fri, 7 Jul 2000 23:03:59 +0500 (SAMST)

On Fri, 7 Jul 2000, Klaus Weide wrote:

> On Fri, 7 Jul 2000, Vlad Harchev wrote:
> > On Thu, 6 Jul 2000, jtnews wrote:
> > 
> > > I use lynx to dump URL text pages using lynx -dump and
> > > occasionally, lynx hangs in the following section of code...
> > > 
> > > HTTCP.c:
> > >     while (!ready) {
> > >         /*
> > >         **  Protect against an infinite loop.
> > >         */
> > > =>      if (tries++ >= 180000) {
> > >             HTAlert(gettext("Socket read failed for 180,000 tries."));
> > > 
> > > I'm using lynx 2.8.4dev4.
> > > I also ran into a similar situation using 2.8.3.
> > > 
> > > With the timeout in select() set to 100000 microseconds, the call will
> > > only time out after 180000 * 100000 / 1000000 = 18000 seconds, or 5 hours.
> 
> What's wrong with that?  It is not a bug that lynx tries for a long time.
> 
> Ok, I can guess that you would like a shorter timeout.  But it's not
> lynx that is broken here - it's the network connection (or maybe the
> server).
> 
> >   Thanks for reporting this.
> > 
> >   It seems we should add a new lynx.cfg setting, READ_TIMEOUT, to control
> > this (CONNECT_TIMEOUT already exists). Does anybody object to it?
> 
> As long as you keep the current behavior by default...

  I assume that by "current behaviour" you mean the current value of the
timeout.
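
  To make the proposal concrete, here is a rough sketch of how a READ_TIMEOUT
given in seconds could bound the wait. It is not the actual HTDoRead() code:
read_with_timeout(), tries_for_timeout(), POLL_USEC and DEFAULT_READ_TIMEOUT
are names made up for illustration, and only the 100000-microsecond select()
interval and the 180000-try limit come from the report quoted above. A default
of 18000 seconds would keep today's limit.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

#define POLL_USEC 100000L          /* select() interval quoted in this thread */
#define DEFAULT_READ_TIMEOUT 18000 /* seconds: 180000 tries * 0.1 s (assumed default) */

/* Number of select() polls of POLL_USEC that fit into timeout_secs. */
static long tries_for_timeout(long timeout_secs)
{
    return timeout_secs * (1000000L / POLL_USEC);
}

/*
 * Hypothetical helper, not the real HTDoRead(): poll fd until it becomes
 * readable or roughly timeout_secs have elapsed, then read from it.
 * Returns the number of bytes read, 0 on EOF, -1 on error, -2 on timeout.
 */
static long read_with_timeout(int fd, void *buf, size_t len, long timeout_secs)
{
    long max_tries = tries_for_timeout(timeout_secs);
    long tries;

    for (tries = 0; tries < max_tries; tries++) {
        fd_set readfds;
        struct timeval tv;
        int ret;

        /* select() may modify both arguments, so reset them on every pass. */
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        tv.tv_sec = 0;
        tv.tv_usec = POLL_USEC;

        ret = select(fd + 1, &readfds, NULL, NULL, &tv);
        if (ret > 0)
            return (long) read(fd, buf, len);
        if (ret < 0 && errno != EINTR)
            return -1;
        /* ret == 0 (interval expired) or EINTR: count the try and poll again. */
    }
    return -2;
}
```

  With something like this, a crawler could pass a small value (say 60) and
give up on one stalled document while the rest of the session continues.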
 
> But this isn't really the best solution for the problem, if the problem
> is really: 'Non-interactive lynx processes hang around for too long
> under some conditions'.  The best solution, because it should always
> work, is: kill the process from the outside if it runs for too long.
> Shortening the read timeout, together with a short connect timeout,
> will still not apply to all situations.  I am thinking about some
> situations in the FTP protocol (if we are blocked in listen(), neither
> of the timeouts applies), or reading a "local" file that is actually
> on an NFS-mounted filesystem that is unavailable, or some other special
> local files.

  As for NFS mounts: it would be very counterintuitive for CONNECT_TIMEOUT and
READ_TIMEOUT to apply when reading "plain files" (which is what they look like
to libc).

 As for the first part ("better use the script below"): we've discussed this
before. It won't work for crawling, since lynx can't resume crawling from
the place where it was interrupted. Also, with the script we can only limit
the total time of the crawling session, not the time spent on each individual
document.
 Also, the script won't work on MSDOS, and it will require 'kill' and 'sh' on
other brain-damaged OSes or environments like Win*, OS/2 or Mac.
 
> Better learn how to kill a process so that it *never* can run longer
> than a max time.  Take the shell script below as a starting point.
> (It should be improved for real use, see for example documentation
> of {bash,ksh} builtin 'wait' command.)
> 
>    Klaus
> 
> ----------------   tolynx   ---------------------
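> #! /bin/sh
> # tolynx - run lynx in the background and kill it if it is still
> # running after TIMEOUT seconds.
> LYNX="./src/lynx"  # change this, e.g. to "lynx"
> if [ $# -lt 2 ]; then
>   echo "tolynx - invoke lynx with timeout." >&2
>   echo "Usage: $0 TIMEOUT LYNXOPTIONS URL" >&2
>   echo "       LYNXOPTIONS should include -dump." >&2
>   exit 1
> fi
> TIMEOUT=$1
> shift
> "$LYNX" "$@" &
> CHILDPID=$!
> # Note: this always sleeps for the full TIMEOUT, even if lynx exits early.
> sleep "$TIMEOUT"
> # kill -0 only tests whether the child still exists; kill it if so.
> kill -0 "$CHILDPID" 2>/dev/null && kill "$CHILDPID"
> exit 0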
> 
> 
> 
> ; To UNSUBSCRIBE: Send "unsubscribe lynx-dev" to address@hidden
> 

 Best regards,
  -Vlad

