Re: [Bug-wget] CNET download links not working with WGET


From: Micah Cowan
Subject: Re: [Bug-wget] CNET download links not working with WGET
Date: Thu, 26 May 2011 15:52:13 -0700
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.14) Gecko/20110223 Thunderbird/3.1.8

So... looks like it works, then. Your command shell isn't complaining
about weird command names, wget is clearly requesting the full and
correct URL, it follows redirections, and saves using the final
redirection URL (the latest sources wouldn't follow that last step -
it'd save using the request URI by default).
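
For contrast with the earlier, unquoted attempt: cmd.exe treats an
unquoted "&" as a command separator, so wget only ever received the URL
up to the first "&". Below is a rough Python sketch of that effect. The
split is a simplified model of cmd.exe's parsing (an assumption, not its
real tokenizer), the function name is hypothetical, and the URL is a
shortened form of the one in this thread.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened form of the URL from the thread, for illustration only.
url = ("http://dw.com.com/redir?edId=3&siteId=4"
       "&destUrl=http%3A%2F%2Fdownload.cnet.com%2F3001-8022_4-10804572.html")


def seen_by_wget_unquoted(command_url):
    """Simplified model: cmd.exe ends the command at the first unquoted '&'."""
    return command_url.split("&", 1)[0]


truncated = seen_by_wget_unquoted(url)

# Query parameters wget receives with and without quoting.
full_params = parse_qs(urlsplit(url).query)
truncated_params = parse_qs(urlsplit(truncated).query)

print(truncated)                  # URL wget sees when unquoted
print(sorted(truncated_params))   # parameters that survive
print(full_params["destUrl"][0])  # redirect target, present only when quoted
```

In this model only edId survives without quotes, which matches the
request for /redir?edId=3 in the original debug log; with quotes, the
destUrl parameter reaches the server and the redirect can succeed.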

If you dislike the filename and have a recent enough version of wget,
you can add the --content-disposition option, which helps when the
server provides a rename header ("Content-Disposition"); otherwise, use
-E to have wget force the file name to end in .html.
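
To make the naming choice concrete, here is a small Python sketch of the
two sources a file name can come from: a Content-Disposition header, or
the last segment of the final URL. It is only a model of the idea, not
wget's actual algorithm; pick_filename and the example header value are
hypothetical, and wget additionally maps characters such as "?" to ones
that are safe on the local file system.

```python
from email.message import Message
from urllib.parse import urlsplit


def pick_filename(final_url, content_disposition=None):
    """Prefer the server-supplied name; otherwise fall back to the URL."""
    if content_disposition:
        # Let the stdlib parse the header's filename parameter.
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        name = msg.get_filename()
        if name:
            return name
    parts = urlsplit(final_url)
    # Last path segment, plus the query string if there is one.
    name = parts.path.rsplit("/", 1)[-1] or "index.html"
    if parts.query:
        name += "?" + parts.query
    return name


final_url = ("http://download.cnet.com/3001-8022_4-10804572.html"
             "?spi=077d9109e846975d0db9532bd610588f")

# No header: the name is derived from the final URL.
print(pick_filename(final_url))

# Hypothetical header value, for illustration only.
print(pick_filename(final_url, 'attachment; filename="example-installer.exe"'))
```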

-mjc

(05/26/2011 12:19 PM), Jeff Givens wrote:
> Hi, I know this is an older topic but thanks for replying.  I forgot to
> mention I had already what you listed below and this is the output I get:
> 
> C:\DOWNLOAD>wget "http://dw.com.com/redir?edId=3&siteId=4&oId=3000-8022_4-10804572&ontId=8022_4&spi=077d9109e846975d0db9532bd610588f&lop=link&tag=tdw_dltext&ltype=dl_dlnow&pid=11665648&mfgId=6290020&merId=6290020&pguid=HFsQLwoOYJQAABuImQcAAAGm&destUrl=http%3A%2F%2Fdownload.cnet.com%2F3001-8022_4-10804572.html%3Fspi%3D077d9109e846975d0db9532bd610588f"
> --2011-05-02 12:34:20--  http://dw.com.com/redir?edId=3&siteId=4&oId=3000-8022_4-10804572&ontId=8022_4&spi=077d9109e846975d0db9532bd610588f&lop=link&tag=tdw_dltext&ltype=dl_dlnow&pid=11665648&mfgId=6290020&merId=6290020&pguid=HFsQLwoOYJQAABuImQcAAAGm&destUrl=http%3A%2F%2Fdownload.cnet.com%2F3001-8022_4-10804572.html%3Fspi%3D077d9109e846975d0db9532bd610588f
> Resolving dw.com.com... 216.239.113.95
> Connecting to dw.com.com|216.239.113.95|:80... connected.
> HTTP request sent, awaiting response... 302 Found
> Location: http://download.cnet.com/3001-8022_4-10804572.html?spi=077d9109e846975d0db9532bd610588f [following]
> --2011-05-02 12:34:21--  http://download.cnet.com/3001-8022_4-10804572.html?spi=077d9109e846975d0db9532bd610588f
> Resolving download.cnet.com... 64.30.224.58
> Connecting to download.cnet.com|64.30.224.58|:80... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: unspecified [text/html]
> Saving to:
> address@hidden'
> 
>     [ <=>                                ] 69,240      77.3K/s   in 0.9s
> 
> 2011-05-02 12:34:22 (77.3 KB/s) -
> address@hidden
> d0db9532bd610588f.1' saved [69240]
> 
> 
> C:\DOWNLOAD>
> 
> Thanks for your help.
> 
> -    Jeff
> 
> 
> 
>> hello,
>>
>> the "&" character in the url is interpreted by your shell.
>>
>> Try using something like:
>>
>> wget "URL"
>>
>> Cheers,
>> Giuseppe
>>
>>
>>
>> "Jeff Givens"<address@hidden>  writes:
>>
>>> Hello, I am having an issue downloading files via download links from
>>> CNET.  It appears to locate some of the URL but stops at the first
>>> &siteId part.  I have included the debug information as well.  Thanks
>>> in advance for your help.
>>>
>>> C:\>DOWNLOAD\wget http://dw.com.com/redir?edId=3&siteId=4&oId=3000-8022_4-10804572&ontId=8022_4&spi=077d9109e846975d0db9532bd610588f&lop=link&tag=tdw_dltext&ltype=dl_dlnow&pid=11665648&mfgId=6290020&merId=6290020&pguid=HFsQLwoOYJQAABuImQcAAAGm&destUrl=http%3A%2F%2Fdownload.cnet.com%2F3001-8022_4-10804572.html%3Fspi%3D077d9109e846975d0db9532bd610588f
>>> --2011-04-19 11:30:35-- http://dw.com.com/redir?edId=3
>>> Resolving dw.com.com... 216.239.113.95
>>> Connecting to dw.com.com|216.239.113.95|:80... connected.
>>> HTTP request sent, awaiting response... 302 Found
>>> Location: http://dw.com.com/redir/redx/?edId=3 [following]
>>> --2011-04-19 11:30:36-- http://dw.com.com/redir/redx/?edId=3
>>> Reusing existing connection to dw.com.com:80.
>>> HTTP request sent, awaiting response... 404 Not Found
>>> 2011-04-19 11:30:36 ERROR 404: Not Found.
>>>
>>> 'siteId' is not recognized as an internal or external command,
>>> operable program or batch file.
>>> 'oId' is not recognized as an internal or external command,
>>> operable program or batch file.
>>> 'ontId' is not recognized as an internal or external command,
>>> operable program or batch file.
>>> 'spi' is not recognized as an internal or external command,
>>> operable program or batch file.
>>> 'lop' is not recognized as an internal or external command,
>>> operable program or batch file.
>>> 'tag' is not recognized as an internal or external command,
>>> operable program or batch file.
>>> 'ltype' is not recognized as an internal or external command,
>>> operable program or batch file.
>>> 'pid' is not recognized as an internal or external command,
>>> operable program or batch file.
>>> 'mfgId' is not recognized as an internal or external command,
>>> operable program or batch file.
>>> 'merId' is not recognized as an internal or external command,
>>> operable program or batch file.
>>> 'pguid' is not recognized as an internal or external command,
>>> operable program or batch file.
>>> 'destUrl' is not recognized as an internal or external command,
>>> operable program or batch file.
>>>
>>> DEBUG output created by Wget 1.11.4 on Windows-MSVC.
>>>
>>> --2011-04-19 11:27:09-- http://dw.com.com/redir?edId=3
>>> Resolving dw.com.com... seconds 0.00, 64.30.224.42
>>> Caching dw.com.com =>  64.30.224.42
>>> Connecting to dw.com.com|64.30.224.42|:80... seconds 0.00, connected.
>>> Created socket 340.
>>> Releasing 0x01411158 (new refcount 1).
>>>
>>> ---request begin---
>>> GET /redir?edId=3 HTTP/1.0
>>>
>>> User-Agent: Wget/1.11.4
>>>
>>> Accept: */*
>>>
>>> Host: dw.com.com
>>>
>>> Connection: Keep-Alive
>>>
>>>
>>>
>>> ---request end---
>>> HTTP request sent, awaiting response...
>>> ---response begin---
>>> HTTP/1.1 302 Found
>>>
>>> Date: Tue, 19 Apr 2011 15:27:26 GMT
>>>
>>> Server: Apache/2.0
>>>
>>> Pragma: no-cache
>>>
>>> Cache-control: no-cache, must-revalidate, no-transform
>>>
>>> Vary: *
>>>
>>> Expires: Fri, 23 Jan 1970 12:12:12 GMT
>>>
>>> Set-Cookie: XCLGFbrowser=Cg5iVk2tqd6JAAAA8Sg; expires=Sun, 18-Apr-2021
>>> 15:27:26 GMT; domain=.com.com; path=/
>>>
>>> Location: http://dw.com.com/redir/redx/?edId=3
>>>
>>> Content-Length: 0
>>>
>>> P3P: CP="CAO DSP COR CURa ADMa DEVa PSAa PSDa IVAi IVDi CONi OUR OTRi
>>> IND PHY ONL UNI FIN COM NAV INT DEM STA"
>>>
>>> Keep-Alive: timeout=363, max=760
>>>
>>> Connection: Keep-Alive
>>>
>>> Content-Type: text/plain
>>>
>>>
>>>
>>> ---response end---
>>> 302 Found
>>> Registered socket 340 for persistent reuse.
>>> cdm: 1 2 3 4 5 6 7 8
>>> Stored cookie com.com -1 (ANY) /<permanent>  <insecure>  [expiry
>>> 2021-04-18 11:27:26] XCLGFbrowser Cg5iVk2tqd6JAAAA8Sg
>>> Location: http://dw.com.com/redir/redx/?edId=3 [following]
>>> Skipping 0 bytes of body: [] done.
>>> --2011-04-19 11:27:09-- http://dw.com.com/redir/redx/?edId=3
>>> Reusing existing connection to dw.com.com:80.
>>> Reusing fd 340.
>>>
>>> ---request begin---
>>> GET /redir/redx/?edId=3 HTTP/1.0
>>>
>>> User-Agent: Wget/1.11.4
>>>
>>> Accept: */*
>>>
>>> Host: dw.com.com
>>>
>>> Connection: Keep-Alive
>>>
>>> Cookie: XCLGFbrowser=Cg5iVk2tqd6JAAAA8Sg
>>>
>>>
>>>
>>> ---request end---
>>> HTTP request sent, awaiting response...
>>> ---response begin---
>>> HTTP/1.1 404 Not Found
>>>
>>> Date: Tue, 19 Apr 2011 15:27:26 GMT
>>>
>>> Server: Apache/2.0
>>>
>>> Content-Length: 209
>>>
>>> Keep-Alive: timeout=363, max=779
>>>
>>> Connection: Keep-Alive
>>>
>>> Content-Type: text/html; charset=iso-8859-1
>>>
>>>
>>>
>>> ---response end---
>>> 404 Not Found
>>> Skipping 209 bytes of body: [<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML
>>> 2.0//EN">
>>> <html><head>
>>> <title>404 Not Found</title>
>>> </head><body>
>>> <h1>Not Found</h1>
>>> <p>The requested URL /redir/redx/ was not found on this server.</p>
>>> </body></html>
>>> ] done.
>>> 2011-04-19 11:27:09 ERROR 404: Not Found.
> 


-- 
Micah J. Cowan
http://micah.cowan.name/


