Re: [Bug-wget] downloading all webpages, recursively, that have a given


From: Keith Roberts
Subject: Re: [Bug-wget] downloading all webpages, recursively, that have a given pattern
Date: Sat, 21 Mar 2009 12:33:51 +0000 (GMT)
User-agent: Alpine 2.00 (LFD 1167 2008-08-23)

On Fri, 20 Mar 2009, Micah Cowan wrote:

To: Edward Peschko <address@hidden>
From: Micah Cowan <address@hidden>
Subject: Re: [Bug-wget] downloading all webpages, recursively,
    that have a given pattern

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Edward Peschko wrote:
all,

I was wondering whether it is possible to use wget to download all files
on a web page that match a given pattern. I see the 'allowed directories'
tag and the 'excluded directories' tag; these seem to be a subset of the
feature that I'm looking for.

Anyway, I'm not on the mailing list, so if you could CC me in any
response, I'd appreciate it. I'm using 1.11.4.

You may perhaps have missed the --accept and --reject options (which
work on the filename portions).

There are some annoying issues with those as well, though: they will
never prevent the download of links ending in .htm or .html (though they
may dictate that such files be deleted afterwards). Also, they are not
matched against the query string portion of a URL (anything from the
question mark (?) on) when deciding what to download, but they are
matched against the downloaded file's name, query string included (which
can also cause deletions after download).

Read the info manual (also available on the web), not the man page, for
detailed information on Wget conditional downloads. See the section on
"Following Links".
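
To make the --accept behaviour described above easier to picture, here is a rough Python model of it. This is only a sketch and not Wget's actual code: the function names (url_filename, matches_accept, is_fetched, is_kept) are made up for illustration, and it assumes, as the Wget manual describes, that an accept entry without wildcard characters is treated as a suffix while one containing *, ?, [ or ] is treated as a shell-style pattern. Directory URLs, --reject and other details are ignored.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Characters that make an accept entry a pattern rather than a plain suffix
# (assumption taken from the Wget manual's description of -A/-R).
WILDCARDS = set("*?[]")


def url_filename(url):
    """Return the last path component of a URL, ignoring any query string."""
    return urlsplit(url).path.rsplit("/", 1)[-1]


def matches_accept(name, accept):
    """True if name matches any entry of the accept list."""
    for entry in accept:
        if WILDCARDS & set(entry):
            # Entry contains a wildcard: shell-style pattern match.
            if fnmatchcase(name, entry):
                return True
        elif name.endswith(entry):
            # No wildcard: the entry acts as a file name suffix.
            return True
    return False


def is_fetched(url, accept):
    """Model of the check made before download (query string not considered)."""
    name = url_filename(url)
    if name.endswith((".htm", ".html")):
        # HTML links are always fetched so that their links can be followed.
        return True
    return matches_accept(name, accept)


def is_kept(local_name, accept):
    """Model of the check made after download, against the saved file name."""
    return matches_accept(local_name, accept)
```

With accept = ["report*.pdf"], is_fetched("http://example.com/docs/report2009.pdf?lang=en", accept) is True because the query string is ignored, but is_kept("report2009.pdf?lang=en", accept) is False because the saved name no longer ends in .pdf, which corresponds to the deletion after download mentioned above. Likewise index.html is fetched but not kept. The example.com URLs are placeholders.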

Re reading info pages:

There's a nice lynx-type info browser called pinfo that I'd recommend for browsing info pages in a logical order. It works very much like the lynx text-based web browser:

Pinfo - A lynx-style info and man reader

Pinfo is an info file viewer. It was created when the author, Przemek Borys, was very depressed trying to read gtk info entries using the standard tools.

Pinfo is similar in use to lynx. It has similar key movements, and gives similar intuition. You just move across info nodes, and select links, follow them... Well, you know how it is when you view html with lynx. :) It supports as many colors as it can.

Pinfo also supports viewing of manual pages -- they're colorised like in the midnight commander's viewer, and additionally they are hypertextualized (i.e. when pinfo encounters a reference of the form manualname (n), you can press enter there, and voila -- you're on the page for `manualname').

Keyboard and colors are fully configurable. Pinfo supports URLs embedded in info documents and man pages.

http://pinfo.sourceforge.net/

HTH

Keith Roberts

-----------------------------------------------------------------
Websites:
http://www.php-debuggers.net
http://www.karsites.net
http://www.raised-from-the-dead.org.uk

All email addresses are challenge-response protected with
TMDA [http://tmda.net]
-----------------------------------------------------------------



