lynx-dev Traversal Probelm part2

From:	Jeff Crane
Subject:	lynx-dev Traversal Probelm part2
Date:	Wed, 16 Feb 2000 14:34:34 -0800 (PST)

My real goal is to traverse ALL of
http://dir.yahoo.com

I don't need the actual pages, just the organizational tree.
Today I tried:

lynx -error_file=lynx.err -anonymous -cookies \
     -traverse -cache=999999 -localhost \
     http://dir.yahoo.com/Arts/ > /dev/null &

and a number of variations. Inevitably the recursion
runs into a link that it does not recognize as already
visited, so it visits that link endlessly, appending it
to traverse.dat and traverse2.dat over and over again.
As you can imagine, the process then runs out of
control, eating more and more RAM and CPU (first the
cache fills up RAM, then CPU usage climbs).
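
One possible cause of a link never matching the visited list (a guess, not something confirmed for lynx) is that the same page shows up under several spellings of its URL, so a plain string comparison fails. A minimal Python sketch of canonicalizing a URL before comparing it; the `normalize` helper is hypothetical and is not part of lynx:

```python
from urllib.parse import urlsplit, urlunsplit

def normalize(url):
    """Return a canonical form of url for visited-set comparison.

    Assumption: user:password@ prefixes are not needed and are dropped.
    """
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Keep the port only when it is not the default for the scheme.
    default_ports = {"http": 80, "https": 443}
    port = parts.port
    if port is not None and port != default_ports.get(scheme):
        host = f"{host}:{port}"
    # An empty path and "/" name the same resource.
    path = parts.path or "/"
    # Drop the fragment; it never changes which document is fetched.
    return urlunsplit((scheme, host, path, parts.query, ""))
```

With this, `HTTP://Dir.Yahoo.com/Arts/#top` and `http://dir.yahoo.com/Arts/` compare equal after normalization.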

I would like to know if there is a way to store this
kind of information (the directory structure of
dir.yahoo.com as individual paths in a text file)
using wget or if there is a lynx flag I'm forgetting;
perhaps it's a lynx bug?
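
For illustration, the kind of bounded traversal I am after could be sketched in Python as below. This is only a sketch under assumptions: `traverse` and `get_links` are hypothetical names, `get_links(url)` stands for whatever code fetches a page and returns its href values (not shown), and `max_pages` is an arbitrary safety cap so a link loop cannot run forever.

```python
from collections import deque
from urllib.parse import urldefrag, urljoin, urlsplit

def traverse(start, get_links, max_pages=1000):
    """Breadth-first walk that visits each URL under start at most once.

    get_links(url) is a caller-supplied (hypothetical) function returning
    the href values found on that page.
    """
    start = urldefrag(start).url
    host = urlsplit(start).netloc
    prefix = urlsplit(start).path
    seen = {start}
    queue = deque([start])
    paths = []
    while queue and len(paths) < max_pages:
        url = queue.popleft()
        paths.append(urlsplit(url).path)
        for href in get_links(url):
            # Resolve relative links and drop fragments before comparing.
            link = urldefrag(urljoin(url, href)).url
            parts = urlsplit(link)
            if parts.netloc != host or not parts.path.startswith(prefix):
                continue  # stay inside the starting subtree
            if link not in seen:
                seen.add(link)  # mark when queued, so cycles cannot re-add it
                queue.append(link)
    return paths
```

The returned list holds one path per visited page, so writing `"\n".join(paths)` to a text file would give the directory structure as individual paths.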
