Re: [Bug-wget] Mirror a website but no sites with special chars like "?"
From: Paul Wratt
Subject: Re: [Bug-wget] Mirror a website but no sites with special chars like "?"
Date: Mon, 19 Mar 2012 20:45:05 +1300
Because you are doing a recursive download, you can manipulate robots.txt.
The easiest way to get it going is:
wget -r url/image.png <= or gif/etc.
This builds the folder structure and should fetch the server's robots.txt file.
Edit that file to exclude the unwanted URLs,
then do your actual recursive mirror.
Just google "example robots.txt" to see how to write the rules you need.
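As I understand the steps above, the edited robots.txt ends up excluding the unwanted URLs on the next recursive run (provided you drop -e robots=off so wget honours it). A minimal sketch of such a file, assuming a MediaWiki-style layout under /mywiki/ where the edit/upload pages go through index.php with a "?" query string (the paths are hypothetical, adapt them to the actual wiki):

```
# Example robots.txt -- hypothetical paths, adjust to your wiki.
# wget uses prefix matching on Disallow rules, so this blocks
# every URL starting with /mywiki/index.php (i.e. the "?..." pages)
# while leaving the plain article URLs alone:
User-agent: *
Disallow: /mywiki/index.php
```

Note that wget's classic robots handling does prefix matching only (no wildcards), so pick a prefix that covers the query-string pages without covering the pages you want mirrored.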
Paul
On Mon, Mar 19, 2012 at 3:25 AM, Tobias Krais <address@hidden> wrote:
> Hi together,
>
> I want to mirror a wiki. For this I use the command
> wget -e robots=off -r -k -p -E -N -l inf intranet/mywiki/
>
> The request takes a long time, because each page of the wiki has an
> edit, upload, ... function. All these "unwanted" pages have one thing
> in common: the URL contains a "?".
>
> My question: Is it possible to exclude such pages from the download? If
> yes, how can I do it?
>
> Your help is highly appreciated!
>
> Greetings,
>
> Tobias
>