[Bug-wget] unexpected behaviour of wget on some long links
From: Yiwei Yang
Subject: [Bug-wget] unexpected behaviour of wget on some long links
Date: Wed, 12 Jun 2013 17:24:38 -0500
Hi,
I wrote a C program that reads a list of URLs and feeds them to wget one by
one with the following command:
wget -p -np -nc -nd --delete-after -t 1 -T 20 -P somefolder <url>
However, with some long links, like:
http://www.linkedin.com/nhome/nus-redirect?url=http%3A%2F%2Fwww%2Elinkedin%2Ecom%2Fprofile%2Fview%3Fid%3D86239627%26snapshotID%3D%26authType%3Dname%26authToken%3DRWgi%26ref%3DNUS%26goback%3D%252Enmp_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1%26trk%3DNUS-body-member-name&urlhash=-U2e&trkToken=action%3DviewMember%26pageKey%3Dmember-home%26contextId%3Dbf3a735f-6394-4304-a98b-3ee6fa4b6515%26distanceFromViewer%3D1%26aggregationType%3Dnone%26isPublic%3Dfalse%26verbType%3Dlinkedin%3Aconnect%26activityId%3Dactivity%3A5730101464181772288%26isDigested%3Dfalse%26isFolloweeOfPoster%3Dfalse%26actorType%3Dlinkedin%3Amember%26feedPosition%3D15%26actorId%3Dmember%3A86239627%26objectId%3Dmember%3A129413241%26rowPosition%3D1%26objectType%3Dlinkedin%3Amember
wget reports that it has finished fetching, but then it blocks until I
press Enter, after which the whole program exits without proceeding to the
next link.
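One possible explanation, offered as an assumption since the post does not show how the C program launches wget: if the command line is built as a string and passed to system() or popen(), the shell treats each unquoted & in the URL as "run in the background", so wget receives only the part of the URL before the first & and the rest is parsed as separate shell commands. A minimal sketch of one way to avoid the shell entirely, passing the URL as a single argument through fork() and execvp(), is below. The names run_argv and fetch_url are hypothetical and not part of wget or of the original program.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run a program directly (no shell), so characters such as & ? * in an
 * argument are passed through untouched.
 * Returns the child's exit status, 127 if the program could not be
 * executed, or -1 on fork/wait failure or death by signal. */
static int run_argv(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        execvp(argv[0], argv);   /* only returns on failure */
        _exit(127);
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)      /* retry only if interrupted by a signal */
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Hypothetical wrapper using the same options as the command in this post.
 * The URL is one argv entry, so its length and its & characters do not matter. */
static int fetch_url(const char *url, const char *folder)
{
    char *const argv[] = {
        "wget", "-p", "-np", "-nc", "-nd", "--delete-after",
        "-t", "1", "-T", "20", "-P", (char *)folder, (char *)url, NULL
    };
    return run_argv(argv);
}
```

A caller could loop over the URL list, call fetch_url(url, "somefolder") for each entry, and continue with the next URL whatever status comes back. If the shell has to be kept, wrapping the URL in single quotes inside the command string should have a similar effect, provided the URL itself contains no single quote.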
In other cases I get an HTTP 404 error, for example from:
https://www.google.com/url?url=https://plus.google.com/118428821259931683184/about%3Fhl%3Den%26socfid%3Dweb:lu:result:writeareviewplusurl%26socpid%3D1&rct=j&sa=X&ei=u11sUee0B9Pa2wWe-YGQAQ&ved=0CHAQ4gkwBw&q=usps&usg=AFQjCNFEjQ3SZNRXD6VNDQAjvOS2gXBYbw
or from
https://maps.google.com/maps?client=ubuntu&channel=fs&oe=utf-8&ie=UTF-8&q=usps&fb=1&gl=us&hq=usps&hnear=0x880cd7968484428f:0xf48dcbad390c6541,Urbana,+IL&ei=2_a1UaDNOsnDqQHD8oHwCg&ved=0CMABELYD
Also, -p makes wget fetch other linked resources, and sometimes I get HTTP 400
or HTTP 500 errors (this happens more often if I add -H to the command).
So my question is:
Are there any restrictions on the kinds of links I can use wget on? Also, with
-p, wget tries to fetch other links that I have no control over, so is
there a way to avoid fetching links that will return HTTP errors, so that my
program won't crash?
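On the second question, wget cannot know in advance which links will return an error. What the calling program can do is inspect wget's exit status and decide whether to carry on. The sketch below lists the exit statuses as documented in the wget manual (they appear in wget 1.12 and later, as far as I know) and adds a hypothetical policy function, should_stop_batch, that stops the batch only for problems unrelated to the URL being fetched. Both function names and the policy itself are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Exit statuses as listed in the "Exit Status" section of the wget manual. */
static const char *const WGET_STATUS_TEXT[] = {
    "no problems occurred",              /* 0 */
    "generic error",                     /* 1 */
    "parse error (options or .wgetrc)",  /* 2 */
    "file I/O error",                    /* 3 */
    "network failure",                   /* 4 */
    "SSL verification failure",          /* 5 */
    "authentication failure",            /* 6 */
    "protocol error",                    /* 7 */
    "server issued an error response"    /* 8, e.g. HTTP 400, 404 or 500 */
};

/* Compile-time guard: one table entry per documented status (needs C11). */
static_assert(sizeof WGET_STATUS_TEXT / sizeof WGET_STATUS_TEXT[0] == 9,
              "one entry per documented wget exit status");

/* Human-readable text for a wget exit status. */
static const char *wget_status_text(int code)
{
    if (code < 0 || code > 8)
        return "unknown exit status";
    return WGET_STATUS_TEXT[code];
}

/* Hypothetical policy: a bad command line (2) or a local file problem (3)
 * will affect every URL, so stop; anything else concerns only the current
 * URL, so log it and move on to the next one. */
static int should_stop_batch(int code)
{
    return code == 2 || code == 3;
}
```

With -p, a single run covers many requests. The manual states that, apart from 0 and 1, lower-numbered statuses take precedence when several kinds of error occur, so a status of 8 means at least one request received an error response from a server. It does not mean that the main page failed.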
Thank you very much!
Lucy