Re: [Bug-wget] What ought to be a simple use of wget


From: Matthew White
Subject: Re: [Bug-wget] What ought to be a simple use of wget
Date: Wed, 3 Aug 2016 19:37:44 +0200

On Wed, 03 Aug 2016 11:46:22 -0400
address@hidden (Dale R. Worley) wrote:

> Matthew White <address@hidden> writes:
> > wget --recursive                               \
> >      --page-requisites                         \
> >      --convert-links                           \
> >      --domains="www.iana.org"                  \
> >      --reject "robots.txt","reports","contact" \
> >      --exclude-directories="/go,/assignments,/_img,/_js,/_css,/domains,/performance,/about,/protocols,/procedures,/dnssec,/reports,/help,/abuse,/numbers,/reviews,/time-zones,/2000,/2001" \
> > http://www.iana.org/assignments/index.html
> 
> True, using --exclude-directories I can isolate what I want, but as you
> note, that requires actually knowing all of the children of the root in
> advance.  Whereas it seems to me that there should be a straightforward
> way of instructing wget to exclude "everything but X".
> 
> > wget --recursive              \
> >      --no-clobber             \
> >      --page-requisites        \
> >      --adjust-extension       \
> >      --convert-links          \
> >      --span-hosts             \
> >      --domains="www.iana.org" \
> >      http://www.iana.org/assignments/index.html
> 
> As you said, that command returned lots of things that aren't in
> http://www.iana.org/assignments.
> 
> Dale

Hi Dale!

Quick update.

I've been trying the first command you quoted in "reverse", using
combinations of -A, -R, --accept-regex, --reject-regex, -I, and -X.

Still no good results for "exclude everything, then include this and that".

[Building a working exclude/include list takes some experimentation.]
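
One way to experiment without hitting the server each time is to try a
candidate pattern against a handful of sample URLs first. The sketch below
assumes the pattern would eventually be passed to --accept-regex, which by
default takes a POSIX regex matched against the complete URL. The pattern and
most of the sample URLs are illustrative, not taken from a real crawl, and
grep -E only approximates how wget applies the regex.

```shell
#!/bin/sh
# Candidate pattern (illustrative): accept only URLs under /assignments
# on www.iana.org.
pattern='^https?://www\.iana\.org/assignments(/|$)'

# Sample URLs; apart from the start page these are hypothetical.
urls='http://www.iana.org/assignments/index.html
http://www.iana.org/assignments/
http://www.iana.org/domains/root
http://www.iana.org/_css/screen.css
http://www.iana.org/about'

# Keep only the URLs that the pattern would accept, then print them.
accepted=$(printf '%s\n' "$urls" | grep -E "$pattern")
printf '%s\n' "$accepted"

# If the list looks right, the same pattern could then be tried with, e.g.:
#   wget --recursive --accept-regex="$pattern" \
#        http://www.iana.org/assignments/index.html
```

Whether the resulting wget run actually produces the "everything but X"
behaviour on the real site has not been verified here; this only checks the
pattern itself.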

Later,
Matthew

-- 
Matthew White <address@hidden>
