bug-wget


From: Micah Cowan
Subject: Re: [Bug-wget] downloading all webpages, recursively, that have a given pattern
Date: Fri, 20 Mar 2009 23:01:36 -0700
User-agent: Thunderbird 2.0.0.19 (X11/20090105)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Edward Peschko wrote:
> all,
> 
> I was wondering if it was possible using wget to download all files on
> a web page with a given pattern.. I see the 'allowed directories' tag,
> and the 'excluded directories' tag; these seem to be a subset of the
> feature that I'm looking for.
> 
> Anyways, I'm not on the mailing list, so if you could CC me in any
> response, I'd appreciate it. I'm using 1.11.4..

You may have missed the --accept and --reject options (-A and -R),
which match against the filename portion of each URL.

Those have some annoying issues of their own, though: they never
prevent the download of links ending in .htm or .html (though they may
cause those files to be deleted afterwards). Also, they are not matched
against the query string portion of a URL (everything from the question
mark (?) on), but they are matched against the downloaded file's name,
which does include it (this too can cause deletions after download).
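
To make the two-stage behaviour above concrete, here is a rough Python
model of it. It is only a sketch and not Wget's actual code: the
function names (matches, will_fetch, will_keep) and the example.com URL
are made up for illustration, only an accept list is modelled, and
directory links and other special cases are ignored. It assumes the
documented rule that a list element containing a wildcard character is
treated as a pattern and anything else as a suffix.

```python
from fnmatch import fnmatchcase
from posixpath import basename
from urllib.parse import urlsplit

WILDCARDS = set("*?[]")


def matches(name, rules):
    """Return True if name matches any rule in the accept list.

    A rule containing a wildcard character is a glob pattern;
    any other rule is compared as a plain suffix.
    """
    for rule in rules:
        if WILDCARDS & set(rule):
            if fnmatchcase(name, rule):
                return True
        elif name.endswith(rule):
            return True
    return False


def will_fetch(url, accept):
    """Hypothetical model of the check made before following a link."""
    # Only the path is used here, so the query string is ignored.
    name = basename(urlsplit(url).path)
    if name.endswith((".htm", ".html")):
        # HTML is fetched regardless, since it may contain more links.
        return True
    return matches(name, accept)


def will_keep(local_name, accept):
    """Hypothetical model of the check made on the saved file's name."""
    return matches(local_name, accept)


if __name__ == "__main__":
    url = "http://example.com/docs/report.pdf?rev=2"
    print(will_fetch(url, ["*.pdf"]))               # URL check
    print(will_keep("report.pdf?rev=2", ["*.pdf"]))  # saved-name check
```

In this model the URL passes the first check because only "report.pdf"
is compared, while the saved name "report.pdf?rev=2" fails the second,
which is the download-then-delete case described above.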

Read the info manual (also available on the web), not the man page, for
detailed information on Wget conditional downloads. See the section on
"Following Links".

- --
HTH,
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer.
Maintainer of GNU Wget and GNU Teseq
http://micah.cowan.name/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAknEgsAACgkQ7M8hyUobTrGj1ACdFdMq+GbEOb+QRxf3ASfAtWdv
ceMAnjtMzE/lktLl0KjQAllu/r9cOgz8
=Uphg
-----END PGP SIGNATURE-----



