lynx-dev

lynx-dev Traversal Problem part2


From: Jeff Crane
Subject: lynx-dev Traversal Problem part2
Date: Wed, 16 Feb 2000 14:34:34 -0800 (PST)

My real goal is to traverse ALL of
http://dir.yahoo.com

I don't need the actual pages, just the organizational tree.
Today I tried:

lynx -error_file=lynx.err -anonymous -cookies
-traverse -cache=999999 -localhost
http://dir.yahoo.com/Arts/ > /dev/null &

and a number of variations. Inevitably the recursion
runs into a link that it does not recognize as visited,
so it visits that link indefinitely, appending it to
traverse.dat and traverse2.dat over and over again. As
you can imagine, the process then runs out of control,
eating more and more CPU and RAM (first it fills up RAM
with the cache, then it starts consuming CPU).
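
For illustration only: a loop like this is what one would expect when a
visited check compares raw URL strings, so that one page reached under
two spellings (a fragment, a differently-cased host, a relative "../")
never matches its earlier entry. The sketch below is not Lynx's actual
logic. The names normalize, crawl and get_links are hypothetical, and
the "Design/" subcategory in the sample data is invented.

```python
from collections import deque
from urllib.parse import urldefrag, urljoin, urlsplit, urlunsplit


def normalize(url):
    """Reduce a URL to one canonical spelling for the visited check."""
    url, _fragment = urldefrag(url)  # drop any "#anchor" part
    parts = urlsplit(url)
    path = parts.path or "/"         # treat an empty path as "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def crawl(start, get_links, prefix):
    """Breadth-first traversal that visits each normalized URL once.

    get_links(url) is a caller-supplied function returning the hrefs
    found on that page; it stands in for the HTTP fetch and HTML parse.
    Only URLs beginning with `prefix` are followed.
    """
    start = normalize(start)
    visited = {start}
    queue = deque([start])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for href in get_links(url):
            target = normalize(urljoin(url, href))
            if target.startswith(prefix) and target not in visited:
                visited.add(target)  # mark before queueing, so no repeats
                queue.append(target)
    return order


# Invented link graph: every page links back to itself or its parent
# under a different spelling.
SITE = {
    "http://dir.yahoo.com/Arts/": [
        "Design/", "/Arts/#top", "http://DIR.yahoo.com/Arts/",
    ],
    "http://dir.yahoo.com/Arts/Design/": ["../", "/Arts/Design/#a"],
}

root = "http://dir.yahoo.com/Arts/"
for page in crawl(root, lambda u: SITE.get(u, []), root):
    print(page)
```

Because a URL is added to the visited set at the moment it is queued,
the traversal ends once every distinct page has been seen, however many
links point back to it.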

I would like to know if there is a way to store this
kind of information (the directory structure of
dir.yahoo.com as individual paths in a text file)
using wget, or if there is a lynx flag I'm forgetting.
Or is this perhaps a lynx bug?
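
As a rough sketch of the desired output, assuming a list of category
URLs has already been collected by some crawler, the paths could be
written one per line as follows. The function names url_to_path and
write_paths are hypothetical, and the sample URLs are invented apart
from the /Arts/ starting point.

```python
import io
from urllib.parse import unquote, urlsplit


def url_to_path(url):
    """Return the category path of a URL, without a trailing slash."""
    path = unquote(urlsplit(url).path)
    return path.rstrip("/") or "/"


def write_paths(urls, out):
    """Write each distinct path once, sorted, one per line, to `out`."""
    paths = sorted({url_to_path(u) for u in urls})
    for p in paths:
        out.write(p + "\n")
    return paths


# Invented sample input; the duplicate spelling of /Arts collapses.
sample = [
    "http://dir.yahoo.com/Arts/",
    "http://dir.yahoo.com/Arts/Design/",
    "http://dir.yahoo.com/Arts",
]
buf = io.StringIO()  # an open text file would work the same way
write_paths(sample, buf)
print(buf.getvalue(), end="")
```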
