lynx-dev

[Lynx-dev] Lynx as a HTML to text converter.


From: Jari Tuominen
Subject: [Lynx-dev] Lynx as a HTML to text converter.
Date: Tue, 2 Aug 2005 04:28:03 +0300

Hi

I am programming a Web crawler and an indexer.

I am using Lynx to convert HTML documents into text files, with the command "lynx -dump".

The problem is that it rewrites relative URLs as local file URLs of the form FILE:///db/www/...

I also use Lynx to extract the links from the HTML files, so I have to do a lot of work to convert those local URLs back to relative ones, which I can then combine with the host name to create an absolute WWW URL.
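
To make the conversion concrete, here is a minimal Python sketch of the mapping I mean. It is only an illustration under stated assumptions: the function name local_to_absolute, the directory /db/www/ as the place where the saved page lives, and the www.example.com page URL are all hypothetical. It assumes that a dumped link whose path starts with the local directory was a relative link, and that any other file: link was a root-relative one.

```python
from urllib.parse import urljoin, urlsplit


def local_to_absolute(dumped_url, local_dir, page_url):
    """Map a file: URL printed by "lynx -dump" back to an absolute web URL.

    dumped_url -- a link as it appears in the dump output
    local_dir  -- directory of the saved HTML file, with a trailing slash
                  (assumption: "/db/www/" in the examples below)
    page_url   -- URL the page was originally fetched from (assumption)
    """
    parts = urlsplit(dumped_url)  # urlsplit lowercases the scheme

    # Links that were already absolute (http, mailto, ...) pass through.
    if parts.scheme != "file":
        return dumped_url

    if parts.path.startswith(local_dir):
        # The link was relative to the page: strip the local directory.
        ref = parts.path[len(local_dir):]
    else:
        # The link was root-relative ("/images/x.png"): keep the full path.
        ref = parts.path

    # Preserve any query string and fragment from the original link.
    if parts.query:
        ref += "?" + parts.query
    if parts.fragment:
        ref += "#" + parts.fragment

    # Resolve against the original page URL to get the absolute web URL.
    return urljoin(page_url, ref)


if __name__ == "__main__":
    page = "http://www.example.com/docs/index.html"  # hypothetical page URL
    for link in ("FILE:///db/www/sub/a.html",
                 "file://localhost/images/x.png",
                 "http://other.example.org/"):
        print(local_to_absolute(link, "/db/www/", page))
```

One limitation of this sketch: a root-relative link whose path happens to begin with the local directory name would be mistaken for a relative link.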

If you know of any program other than Lynx that does similar tasks with the same performance, I would be interested to hear about it. Thanks.

Jari Tuominen
http://www.vunet.org





