
RE: [Bug-wget] Problem, no getting any response


From: Tony Lewis
Subject: RE: [Bug-wget] Problem, no getting any response
Date: Sat, 21 Nov 2009 20:20:38 -0800

There are several things about the request you're asking wget to send that
don't match the browser's request.

Let's start with the most obvious: your posted data looks nothing like what
the browser is sending. According to your Firebug output, the data posted
is:
{"searchQueryString":"p+9-n+12-c+287464-s+0-r+-t+-ri+-ni+1-x+"}

Other things that might matter to the server:
- the user agent (many servers reject web crawling software such as wget)
- the content type (Firefox is sending application/json)
- referer
- cookies

Most of these you can work around with wget options (--user-agent,
--referer, --load-cookies), but I'm not aware of any way to override the
content type.

Run wget with --debug and compare what wget is sending to what Firebug
reports. The closer you can get wget's request to the Firefox request, the
more likely it is to work.
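
As a starting point for that comparison, here is a rough sketch of a command line that mirrors the Firebug capture. It rests on several assumptions. The URL is rejoined from the wrapped lines in the original message. The body is the JSON shown by Firebug. The Content-Type is passed with --header, which the wget manual describes as able to override automatically generated headers as of 1.10, so check the --debug output to confirm your build honors it. Cookies are left out; they could be supplied with --load-cookies or a Cookie header if the server turns out to need them. The script only prints the arguments unless you uncomment the last line.

```shell
#!/bin/sh
# Values copied from the Firebug capture quoted in this thread.
url='http://www.tiffany.com/Shopping/CategoryBrowse.aspx/GetCategoriesXmlBySearchQS'
referer='http://www.tiffany.com/Shopping/CategoryBrowse.aspx?cid=287464&mcat=148204'
agent='Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5'
body='{"searchQueryString":"p+9-n+12-c+287464-s+0-r+-t+-ri+-ni+1-x+"}'

# Store the command in the positional parameters so each argument
# keeps its quoting. No Content-Length header: wget computes it.
set -- wget --debug \
    --user-agent="$agent" \
    --referer="$referer" \
    --header='Content-Type: application/json; charset=utf-8' \
    --post-data="$body" \
    -O test.html \
    "$url"

# Print one argument per line for review.
printf '%s\n' "$@"

# Uncomment to actually send the request:
# "$@"
```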

Good luck.
-----Original Message-----
From: address@hidden [mailto:address@hidden] On Behalf Of Dan Yamins
Sent: Saturday, November 21, 2009 3:53 PM
To: address@hidden
Subject: [Bug-wget] Problem, no getting any response

Hi,

I'm trying to use wget to scrape some data from a page that requires
posting some data (the page itself does it via JavaScript). When I use
the command:

$ wget --header="Content-length:84" \
    --post-data="searchQueryString=p-8-n+12-cg+viewPaged-c+287464-s+0-r+-t+-ri+-ni+1-x+-pu+-f+" \
    http://www.tiffany.com/Shopping/CategoryBrowse.aspx/GetCategoriesXmlBySearchQS \
    -O test.html

... I never get a response and wget hangs.

As far as I can tell, I'm sending the exact same post as the browser does
when I view the page in Firefox (I looked at it in Firebug), so I guess I
must not be sending something right. I've tried mimicking everything in the
request header, but no matter what, I always get the hang.

Is there something else I can do? Something obvious I'm doing wrong? (Am I
not posting the XML properly?)

Thanks!
Dan



--- Here is the request, as reported by Firebug:

{"searchQueryString":"p+9-n+12-c+287464-s+0-r+-t+-ri+-ni+1-x+"}

--- Full request headers as reported by Firebug:
Host: www.tiffany.com
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Content-Type: application/json; charset=utf-8
Referer: http://www.tiffany.com/Shopping/CategoryBrowse.aspx?cid=287464&mcat=148204
Content-Length: 84
Cookie: assortmentid=101; hascookies1=1; __utma=124393999.990367556.1258838771.1258838771.1258842033.2; __utmc=124393999; __utmz=124393999.1258838771.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); s_cc=true; s_sq=tiffanyrus%3D%2526pid%253DTiffany%252520%252526%252520Co.%252520%25257C%252520Browse%252520Earrings%2526pidt%253D1%2526oid%253Djavascript%25253AhandlePageRight%252528%252529%25253B%2526ot%253DA; s_vi=[CS]v1|25842D7985010E69-4000010E8017E5DD[CE]; samebrowsersession=; previoussid=; _UrlReferrer==http%3A//www.tiffany.com/Shopping/CategoryBrowse.aspx%3Fcid%3D288188%26mcat%3D148206%23p+1-n+12-cg+viewPaged-c+288188-s+5-r+101287458-t+-ri+-ni+1-x+-pu+-f+; __utmb=124393999.54.8.1258844027232
Pragma: no-cache
Cache-Control: no-cache




