
[Bug-wget] a note about "--reject" command line switch does not affect h

From: Dmitry Bolshakov
Subject: [Bug-wget] a note about "--reject" command line switch does not affect html files
Date: Fri, 03 Dec 2010 15:32:12 +0300

Hi all,

This is just one example of the issue.

When I tried to mirror a WordPress-based site (http://media-mera.ru), I
noticed that the process was taking far too long.
The cause turned out to be the "reply to" links that are generated for every
user comment:
these addresses all point to the same page,
http://media-mera.ru/articles/socially_useful, so I added -R to the
command line:

-R '*replytocom=*'

Nothing changed, and after some googling I found this note in the wget
documentation:

Note that these two options do not affect the downloading of html files (as 
determined by a ‘.htm’ or ‘.html’ filename
prefix). This behavior may not be desirable for all users, and may be changed 
for future versions of Wget.

Yes, I absolutely agree that it should be changed. Judging by the wget output,
the total downloaded traffic is about 10 times the size of the resulting
saved mirror!
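
To make the intent of the -R pattern concrete, here is a minimal Python sketch of the glob match I expected. It only illustrates the shell-style pattern itself; it is not how wget applies -R internally (the manual note above says HTML files are fetched regardless). The helper name is_rejected and the comment id 123 are made up for the illustration.

```python
from fnmatch import fnmatchcase

# The pattern passed to -R in the message above.
REJECT_PATTERN = "*replytocom=*"


def is_rejected(url, pattern=REJECT_PATTERN):
    """Return True if the URL matches the shell-style reject pattern.

    Hypothetical helper: it only demonstrates the glob, not wget's logic.
    """
    return fnmatchcase(url, pattern)


page = "http://media-mera.ru/articles/socially_useful"
# Assumed shape of a "reply to" link; the id 123 is invented.
reply_link = page + "?replytocom=123"

print(is_rejected(page))        # the article itself should be kept
print(is_rejected(reply_link))  # the reply link should be skipped
```

With a filter like this applied before fetching, each "reply to" link would be skipped instead of downloading the same article again.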

wget takes 30 minutes on this site, httrack only 1.5.

While writing this, I even found a WordPress plugin that is specifically
intended to reduce the traffic caused by "replytocom" links - 

with best regards
Dmitry Bolshakov
