bug-wget

Re: [Bug-wget] Problem, no getting any response


From: Ray Satiro
Subject: Re: [Bug-wget] Problem, no getting any response
Date: Sun, 22 Nov 2009 22:29:35 -0800 (PST)

There are a few problems with the way you are trying to retrieve the data. You 
will not be able to fully imitate Firefox. You can spoof the User-Agent, but 
Wget is an HTTP/1.0 client and does not support HTTP/1.1, so its requests will 
never closely match what the browser sends.

I read your most recent reply, which contains the command line you are using. 
You shouldn't send this with Wget:
--header="Accept-Encoding: gzip,deflate"
because IIS is going to see that and return gzip-encoded data, which Wget will 
save as-is without decompressing it.
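
If you want to check whether a response you already saved came back 
compressed, a rough sketch like the one below may help. It assumes a POSIX 
shell with od and tr available; is_gzip is a hypothetical helper name, and the 
file name in the usage comment is only an example.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file starts with the gzip magic
# bytes 0x1f 0x8b, which suggests the server sent a compressed body.
is_gzip() {
    # Dump the first two bytes as hex and strip all whitespace.
    magic=$(od -An -tx1 -N2 "$1" | tr -d '[:space:]')
    [ "$magic" = "1f8b" ]
}

# Usage (file name is an assumption):
#   if is_gzip test.html; then
#       echo "response is gzip-compressed; drop the Accept-Encoding header"
#   fi
```

This only inspects the first two bytes, so it is a quick hint rather than a 
full validation of the file.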

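You shouldn't send Content-Length yourself. If you use --post-data, let Wget 
determine the length. A hand-written value that is larger than the body 
actually sent can leave the server waiting for more data, which may be the hang 
you are seeing.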
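
To illustrate, here is a small sketch (assuming a POSIX shell) that compares 
the length declared by hand with the length of the form body as it appears in 
your original command. The variable names are mine, and the body is copied 
from this thread, so treat the result as indicative only.

```shell
#!/bin/sh
# Form body as it appears in the original command in this thread.
body='searchQueryString=p-8-n+12-cg+viewPaged-c+287464-s+0-r+-t+-ri+-ni+1-x+-pu+-f+'

# Value that was passed by hand via --header="Content-length:84".
declared=84

# The body is plain ASCII, so its character count equals its byte count.
actual=${#body}

if [ "$actual" -ne "$declared" ]; then
    echo "Content-Length mismatch: declared $declared, body is $actual bytes"
fi
```

If the two numbers differ, that is a good reason to drop the header and let 
Wget compute the value from --post-data.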

You don't need all those options to receive an XML response, just the right 
content type and a JSON body.

Here is an example of what you probably want. The cookies aren't required for 
the example and I have removed them for easier reading. This command was 
successful using wget-1.12.1-devel from a Vista command prompt. Escaping quotes 
will depend on your actual OS and shell.

wget --post-data="{\"searchQueryString\":\"p+2-n+12-c+287466-s+5-r+-t+-ri+-ni+1-x+\",\"isSearchMode\":false}" --no-cache --max-redirect=0 --header="Content-Type: application/json" http://www.tiffany.com/Shopping/CategoryBrowse.aspx/GetCategoriesXmlBySearchQS -O whatever.xml

Afterwards, check that whatever.xml is non-empty (test -s whatever.xml) and, if 
so, parse it as XML.
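
For a POSIX shell (rather than the Vista command prompt), the same steps could 
be sketched roughly as follows. The helper names build_body, fetch_categories 
and have_response are hypothetical; the URL and wget options are the ones from 
the example in this message. Single quotes avoid the backslash escaping that 
cmd needs.

```shell
#!/bin/sh
# URL taken from the example command in this message.
URL='http://www.tiffany.com/Shopping/CategoryBrowse.aspx/GetCategoriesXmlBySearchQS'

# Hypothetical helper: wrap a search query string in the JSON body.
# Assumes the query contains no double quotes or backslashes.
build_body() {
    printf '{"searchQueryString":"%s","isSearchMode":false}' "$1"
}

# Hypothetical helper: POST the body and save the response to file $2.
# No Content-Length header is passed; wget derives it from --post-data.
fetch_categories() {
    wget --post-data="$(build_body "$1")" \
         --no-cache --max-redirect=0 \
         --header="Content-Type: application/json" \
         "$URL" -O "$2"
}

# Hypothetical helper: succeed only if the file exists and is non-empty.
have_response() {
    test -s "$1"
}

# Usage:
#   fetch_categories 'p+2-n+12-c+287466-s+5-r+-t+-ri+-ni+1-x+' whatever.xml &&
#       have_response whatever.xml &&
#       echo "ready to parse whatever.xml"
```

Nothing runs until fetch_categories is called, so the helpers can be tried out 
without touching the network.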


Good luck!

Jay



--- On Sat, 11/21/09, Dan Yamins <address@hidden> wrote:

From: Dan Yamins <address@hidden>
Subject: [Bug-wget] Problem, no getting any response
To: address@hidden
Date: Saturday, November 21, 2009, 6:52 PM

Hi,

I'm trying to use wget to scrape some data from a page that requires a
posting of some data (the page itself does it via Javascript).   When I use
the command:

$ wget --header="Content-length:84" --post-data="searchQueryString=p-8-n+12-cg+viewPaged-c+287464-s+0-r+-t+-ri+-ni+1-x+-pu+-f+" http://www.tiffany.com/Shopping/CategoryBrowse.aspx/GetCategoriesXmlBySearchQS -O test.html

.... I never get a response and wget hangs.

As far as I can tell, I'm sending the exact same POST as the browser does when
I view the page in Firefox (I looked at it in Firebug), so I guess I must not
be sending something right.  I've tried mimicking everything in the request
header, but no matter what, wget always hangs.

Is there something else I can do?  Something obvious I'm doing wrong?  (Am I
not posting the xml properly?)

Thanks!
Dan



--- Here is the request, as reported by Firebug:

{"searchQueryString":"p+9-n+12-c+287464-s+0-r+-t+-ri+-ni+1-x+"}

--- Full request headers as reported by Firebug:
Host: www.tiffany.com
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Content-Type: application/json; charset=utf-8
Referer: http://www.tiffany.com/Shopping/CategoryBrowse.aspx?cid=287464&mcat=148204
Content-Length: 84
Cookie: assortmentid=101; hascookies1=1;
__utma=124393999.990367556.1258838771.1258838771.1258842033.2;
__utmc=124393999;
__utmz=124393999.1258838771.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none);
s_cc=true;
s_sq=tiffanyrus%3D%2526pid%253DTiffany%252520%252526%252520Co.%252520%25257C%252520Browse%252520Earrings%2526pidt%253D1%2526oid%253Djavascript%25253AhandlePageRight%252528%252529%25253B%2526ot%253DA;
s_vi=[CS]v1|25842D7985010E69-4000010E8017E5DD[CE]; samebrowsersession=;
previoussid=; _UrlReferrer==http%3A//
www.tiffany.com/Shopping/CategoryBrowse.aspx%3Fcid%3D288188%26mcat%3D148206%23p+1-n+12-cg+viewPaged-c+288188-s+5-r+101287458-t+-ri+-ni+1-x+-pu+-f+;
__utmb=124393999.54.8.1258844027232
Pragma: no-cache
Cache-Control: no-cache





