

From: Yiwei Yang
Subject: [Bug-wget] unexpected behaviour of wget on some long links
Date: Wed, 12 Jun 2013 17:24:38 -0500

Hi,
    I wrote a C program that reads a list of URLs and feeds them to wget
one at a time with the following command:

   wget -p -np -nc -nd --delete-after -t 1 -T 20 -P somefolder <url>

However, with some long links, like:
http://www.linkedin.com/nhome/nus-redirect?url=http%3A%2F%2Fwww%2Elinkedin%2Ecom%2Fprofile%2Fview%3Fid%3D86239627%26snapshotID%3D%26authType%3Dname%26authToken%3DRWgi%26ref%3DNUS%26goback%3D%252Enmp_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1%26trk%3DNUS-body-member-name&urlhash=-U2e&trkToken=action%3DviewMember%26pageKey%3Dmember-home%26contextId%3Dbf3a735f-6394-4304-a98b-3ee6fa4b6515%26distanceFromViewer%3D1%26aggregationType%3Dnone%26isPublic%3Dfalse%26verbType%3Dlinkedin%3Aconnect%26activityId%3Dactivity%3A5730101464181772288%26isDigested%3Dfalse%26isFolloweeOfPoster%3Dfalse%26actorType%3Dlinkedin%3Amember%26feedPosition%3D15%26actorId%3Dmember%3A86239627%26objectId%3Dmember%3A129413241%26rowPosition%3D1%26objectType%3Dlinkedin%3Amember

wget reports that the fetch has finished, but then it just blocks until I
hit Enter; after that the whole program exits without proceeding to the
next link.

Another situation is that I sometimes get an HTTP 404 error, for example from:

https://www.google.com/url?url=https://plus.google.com/118428821259931683184/about%3Fhl%3Den%26socfid%3Dweb:lu:result:writeareviewplusurl%26socpid%3D1&rct=j&sa=X&ei=u11sUee0B9Pa2wWe-YGQAQ&ved=0CHAQ4gkwBw&q=usps&usg=AFQjCNFEjQ3SZNRXD6VNDQAjvOS2gXBYbw

or from
https://maps.google.com/maps?client=ubuntu&channel=fs&oe=utf-8&ie=UTF-8&q=usps&fb=1&gl=us&hq=usps&hnear=0x880cd7968484428f:0xf48dcbad390c6541,Urbana,+IL&ei=2_a1UaDNOsnDqQHD8oHwCg&ved=0CMABELYD

Also, -p makes wget fetch resources from other links that I don't control,
and sometimes I get HTTP 400 or HTTP 500 errors (more often if I add -H to
the command).

So my questions are:
Are there any restrictions on what kinds of links I can use wget on? And
since -p makes it fetch other links that I have no control over, is there
a way to avoid fetching links that will return HTTP errors, so that my
program won't crash?

Thank you very much!

Lucy

