Re: [Bug-wget] Wget cannot get same page as browser

From: Giuseppe Scrivano
Subject: Re: [Bug-wget] Wget cannot get same page as browser
Date: Wed, 22 Jun 2011 16:21:15 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.50 (gnu/linux)

It seems the server is looking at the User-Agent header in the HTTP
request and sending a different page to clients it does not recognize
as a browser.

Spoofing the User-Agent header seems to do the trick:

wget --user-agent="Mozilla/5.0 (X11; Linux i686; rv:2.0.1) Gecko/20110503 IceCat/4.0.1" \
    -O 1567651151.html \
    http://www.amazon.com/Vocabulary-School-Student-Norman-Levine/dp/1567651151

You can find more information about --user-agent in the wget texinfo
manual (http://xkcd.com/912/). 
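
To check whether the User-Agent is really what changes the response, one
option is to count how often the keyword appears in the page for a given
agent string. The following is only a rough sketch: the function name
count_keyword is made up for illustration, and it assumes a grep that
supports -o (GNU and BSD grep both do).

```shell
#!/bin/sh
# Hypothetical helper, not part of wget: print how many times a fixed
# string occurs in the page returned for a given User-Agent.
#   $1 = URL, $2 = keyword, $3 = User-Agent string
count_keyword() {
    # -q silences progress output, -O - writes the page to stdout.
    # grep -o -F prints each literal match on its own line; wc -l counts them.
    wget -q --user-agent="$3" -O - "$1" \
        | grep -o -F -- "$2" \
        | wc -l \
        | tr -d ' '
}
```

Calling count_keyword with the Amazon URL, the keyword offer-listing, and
the IceCat string above should print a non-zero count if the User-Agent
is what makes the difference. This has not been run against the live
site, so treat it as a starting point.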


Gary Yang <address@hidden> writes:

> I use wget to retrieve links. However, the page I get with “wget” is
> different from the page I get in the browser. To debug it, I copied
> and pasted the link below into the browser’s address bar. Then I viewed
> the HTML source in the browser and searched for the keyword
> offer-listing. I found nine occurrences.
> Below is one of the nine offer-listing occurrences I found:
> <div class="mbcOlpLink"><a class="buyAction" 
> href="/gp/offer-listing/1567651151/ref=dp_olp_all_mbc?
> Below is the URL:
> http://www.amazon.com/Vocabulary-School-Student-Norman-Levine/dp/1567651151
> The command below saved the result to the file “1567651151”, but I
> cannot find any “offer-listing” in it. The page wget retrieved is
> different from the one the browser shows for the same URL. What is wrong?
> wget http://www.amazon.com/Vocabulary-School-Student-Norman-Levine/dp/1567651151
> Thanks,
> Gary
