
Re: [Bug-wget] unexpected behaviour of wget on some long links


From: Yiwei Yang
Subject: Re: [Bug-wget] unexpected behaviour of wget on some long links
Date: Thu, 13 Jun 2013 11:39:42 -0500

Thanks a lot. The blocking problem was because I didn't quote the URL
argument. But now I just get HTTP 403 errors, and some links give me:

HTTP request sent, awaiting response... 999 Request denied
2013-06-13 11:32:28 ERROR 999: Request denied.

For the 403 I understand that I don't have the authentication, but what
about this 999? Does anyone have any idea about it?
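To illustrate what was going wrong before I added the quotes, the effect can be reproduced with echo standing in for wget (the URL here is a made-up example):

```shell
# Unquoted: the shell splits the command at the ampersand.
# "echo http://example.com/a?b=1" runs in the background, and the
# shell then evaluates "c=2" as a separate, silent variable assignment.
echo http://example.com/a?b=1&c=2

# Quoted: the whole URL reaches the command as a single argument.
echo "http://example.com/a?b=1&c=2"   # prints http://example.com/a?b=1&c=2
```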

I get this when I use it on
http://www.linkedin.com/nhome/nus-redirect?url=http%3A%2F%2Fwww%2Elinkedin%2Ecom%2Fprofile%2Fview%3Fid%3D129413241%26authType%3Dname%26authToken%3DEn-p%26ref%3DNUS%26goback%3D%252Enmp_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1%26trk%3Dmember-name&urlhash=9w81&trkToken=action%3DviewMember%26pageKey%3Dmember-home%26contextId%3Dbf3a735f-6394-4304-a98b-3ee6fa4b6515%26distanceFromViewer%3D1%26aggregationType%3Dnone%26isPublic%3Dfalse%26verbType%3Dlinkedin%3Aconnect%26activityId%3Dactivity%3A5730101464181772288%26isDigested%3Dfalse%26isFolloweeOfPoster%3Dfalse%26actorType%3Dlinkedin%3Amember%26feedPosition%3D15%26actorId%3Dmember%3A86239627%26objectId%3Dmember%3A129413241%26rowPosition%3D1%26objectType%3Dlinkedin%3Amemberember3Dlinkedin%3Amembermembere-snapshot%3Aprofile-snapshot30kK3EtWy1X-vq1DkfY797Q3W9Y6791uHdPrsrmDryeuQMyWNKh47XlVVBB9lBCA_hpPgX8mEpqkcGMTeMLvM10cqtWmLA5YKT1xPQG7P1lDn8MZkPLf0FTEgm7xW9RjceO2JmZyO2zqTex1u7NpQiwnKAp4XUep4t59IWc5lcXYp_MV6B6SEr4WvOnYdalvu9gvllJS0LBlB_4RUOMuTVMFJW0i0c-4EOfIAdL9qOrczVpMq4yvGSGgBrzj6orP3KYJVneCkMEZOEXPG_s9HTqJtrsVRE0ZVWbg3PDwaHRryNXeHTHzqdzYM7gzPLmXc1lfFBlL_qYafpvFTp910nBBNrGf3fEHHJbt2818nfhakKUnHqhd77uSITXJOFmOLql&f=1&ui=6006132918378-id_51719e14b69317817122009&en=fad_pageclick&ed=430769983649769&a=0&mac=AQLw1fIgW7ipGNay&sig=11892568d38150022669&en=1&a=0&sig=118242tyl86tWHO9CfXsitFKritu8ojwEckwztCO8U6hOplMZNfHz76jiFnA1ZVokg&f=1&ui=6007244462984-id_51719e14b68a35899364649&en=1&a=0&sig=962419
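By the way, since wget exits with a non-zero status when the server returns an error, the list-driven run can log and skip failing links instead of stopping. A minimal sketch, assuming a file of URLs one per line (the function and file names here are my own, not anything wget provides):

```shell
# fetch_all <url-file> <dest-dir>: fetch each URL, skipping failures.
fetch_all() {
    while IFS= read -r url; do
        # Quoting "$url" keeps & and ? away from the shell.
        if ! wget -p -np -nc -nd --delete-after -t 1 -T 20 -P "$2" "$url"; then
            # wget exits non-zero on errors such as 403, 404 or 999;
            # record the URL and move on instead of aborting the run.
            printf '%s\n' "$url" >> failed.log
        fi
    done < "$1"
}
```

Called as `fetch_all urls.txt somefolder`, so the C program could invoke one script rather than spawning wget per URL.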

Thank you!
Yiwei



On Thu, Jun 13, 2013 at 10:32 AM, Darshit Shah <address@hidden> wrote:

> Bykov's suggestion is spot on.
>
> The issue you are facing is that the ampersand (&) is a special character
> in the Bash shell: it tells the shell to run the command in the background
> and return control of the shell to the user.
> The shell is reading the & character in your URL and sending the command to
> the background, which is the expected behaviour. You should wrap your URL
> in either single (') or double (") quotes to prevent the shell
> from processing that character.
>
> On Thu, Jun 13, 2013 at 8:52 PM, Bykov Aleksey <address@hidden> wrote:
>
> > Greetings, Yiwei Yang.
> > Sorry for the stupid question, but did you try using quotes to escape the URL?
> >
> > wget -p -np -nc -nd --delete-after -t 1 -T 20 -P somefolder "<url>"
> > or
> > wget -p -np -nc -nd --delete-after -t 1 -T 20 -P somefolder '<url>'
> > The shell can interpret the ampersand as a command separator...
> >
> > --
> > Best regards, Alex
> >
> >
> >  Hi,
> >>     I wrote a C program that reads a list of URLs and feeds them into
> >> wget one by one with the following command:
> >>
> >>    wget -p -np -nc -nd --delete-after -t 1 -T 20 -P somefolder <url>
> >>
> >> However, with some long links, like:
> >>
> >> http://www.linkedin.com/nhome/nus-redirect?url=http%3A%2F%2Fwww%2Elinkedin%2Ecom%2Fprofile%2Fview%3Fid%3D86239627%26snapshotID%3D%26authType%3Dname%26authToken%3DRWgi%26ref%3DNUS%26goback%3D%252Enmp_*1_*1_*1_*1_*1_*1_*1_*1_*1_*1%26trk%3DNUS-body-member-name&urlhash=-U2e&trkToken=action%3DviewMember%26pageKey%3Dmember-home%26contextId%3Dbf3a735f-6394-4304-a98b-3ee6fa4b6515%26distanceFromViewer%3D1%26aggregationType%3Dnone%26isPublic%3Dfalse%26verbType%3Dlinkedin%3Aconnect%26activityId%3Dactivity%3A5730101464181772288%26isDigested%3Dfalse%26isFolloweeOfPoster%3Dfalse%26actorType%3Dlinkedin%3Amember%26feedPosition%3D15%26actorId%3Dmember%3A86239627%26objectId%3Dmember%3A129413241%26rowPosition%3D1%26objectType%3Dlinkedin%3Amember
> >>
> >> wget reports that the fetch finished, but then it just blocks until I hit
> >> Enter, and after that the whole program exits without proceeding to the
> >> next link.
> >>
> >> Another situation is that I get an HTTP 404 error, for example from:
> >>
> >> https://www.google.com/url?url=https://plus.google.com/118428821259931683184/about%3Fhl%3Den%26socfid%3Dweb:lu:result:writeareviewplusurl%26socpid%3D1&rct=j&sa=X&ei=u11sUee0B9Pa2wWe-YGQAQ&ved=0CHAQ4gkwBw&q=usps&usg=AFQjCNFEjQ3SZNRXD6VNDQAjvOS2gXBYbw
> >>
> >> or from
> >> https://maps.google.com/maps?client=ubuntu&channel=fs&oe=utf-8&ie=UTF-8&q=usps&fb=1&gl=us&hq=usps&hnear=0x880cd7968484428f:0xf48dcbad390c6541,Urbana,+IL&ei=2_a1UaDNOsnDqQHD8oHwCg&ved=0CMABELYD
> >>
> >> Also, -p will fetch from some other links, and sometimes I get HTTP 400
> >> or HTTP 500 errors (this happens more often if I add -H to the command).
> >>
> >> So my question is:
> >> Are there any restrictions on what kinds of links I can use wget with?
> >> And since -p makes wget fetch other links that I don't control, is there
> >> a way to skip links that return HTTP errors, so that my program won't
> >> crash?
> >>
> >> Thank you very much!
> >>
> >> Lucy
> >>
> >
> >
>
>
> --
> Thanking You,
> Darshit Shah
>

