bug-wget

[Bug-wget] wget not stop when using -e robots=off option


From: Sethi Badhan
Subject: [Bug-wget] wget not stop when using -e robots=off option
Date: Sun, 27 Nov 2016 17:40:09 -0800

Hello

When I run plain wget in a for loop it works fine, but when I add
-e robots=off it does not stop: it keeps downloading pages recursively.
I have set a limit in the for loop, but it does not stop after that
limit. Here is my code:

#!/bin/bash

# Collect the en.wikipedia.org links from the article, skipping media and PHP URLs.
lynx --dump https://en.wikipedia.org/wiki/Cloud_computing |
    awk '/http/{print $2}' | grep https://en. |
    grep -v '.svg\|.png\|.jpg\|.pdf\|.JPG\|.php' > Pages.txt
# Drop one unwanted URL.
grep -vwE "(http://www.enterprisecioforum.com/en/blogs/gabriellowy/value-data-platform-service-dpaas)" Pages.txt > newpage.txt
rm Pages.txt
# Remove lines containing '#' and empty lines, then de-duplicate.
egrep -v "#|$^" newpage.txt > try.txt
awk '!a[$0]++' try.txt > new.txt
rm newpage.txt
rm try.txt
mkdir -p htmlpagesnew
cd htmlpagesnew
# Run wget for at most 10 of the collected URLs.
j=0
for i in $( cat ../new.txt );
do
if [ $j -lt 10 ];
then
    let j=j+1;
    echo $j
    wget  -N -nd -r "$i" -e robots=off --wait=.25 ;
fi
done
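
For comparison, here is a minimal sketch of the same loop without -r, assuming that only the listed pages themselves are wanted. The loop counter limits how many times wget is started, but with -r each of those runs follows links from its page (to wget's default depth of five levels), and robots.txt is only consulted during recursive retrieval. The function names first_n_urls and fetch_pages are hypothetical helpers, not part of the script above.

```shell
#!/bin/bash

# Hypothetical helper: print the first N unique, non-empty lines of a URL list.
first_n_urls() {
    local file=$1 limit=$2
    awk 'NF && !seen[$0]++' "$file" | head -n "$limit"
}

# Hypothetical helper: fetch each selected URL once.
# No -r here, so wget does not follow links found in the pages;
# "-i -" makes wget read the URL list from standard input.
fetch_pages() {
    local file=$1 limit=$2
    first_n_urls "$file" "$limit" | wget -N -nd --wait=.25 -i -
}
```

It could be called as `fetch_pages ../new.txt 10` from inside htmlpagesnew. If one level of linked pages is actually wanted, keeping -r and adding -l 1 would be another option.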
I hope you will reply soon.
Thanks
Regards
Sethi

