
[Bug-wget] Re: wget not completely downloading page


From: Dan Yamins
Subject: [Bug-wget] Re: wget not completely downloading page
Date: Sat, 14 Feb 2009 19:36:41 -0500

OK, I've realized that the contents I want to get at are loaded dynamically via JavaScript -- and that if I do something like

   wget -nd -E -k -K -p http://www.nyse.com/about/listed/lc_ny_name_A.html

then I'll get a bunch of files, including the basic JavaScript file

   lc_ny_name_A.js

This JavaScript has a big hard-coded variable in it that specifies some of the basic information that I want -- so I could try to parse the JavaScript in some way to get what I'm after.
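For what it's worth, I imagine something along these lines could pull that variable out of the file and strip it down to the raw data -- though the variable name here is just a placeholder for whatever the .js file actually defines:

   grep -o 'var companyList *= *\[.*\];' lc_ny_name_A.js | sed 's/^[^[]*//; s/;$//'

That feels pretty fragile, though, which is why I'd rather get at the rendered page directly if possible.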

However, if I do "save file" from Firefox, I get a static .html page with everything I really want to parse in it... Is there any way to simply have wget do something like what Firefox does -- so that I can actually just download the page _after_ the dynamic elements have been loaded and processed? Or is my thinking on this totally wrong?

Thanks
Dan


On Sat, Feb 14, 2009 at 6:01 PM, Dan Yamins <address@hidden> wrote:
Hi,

I've been having some trouble downloading several pages with wget -- for instance:

   http://www.nyse.com/about/listed/lc_all_name_F.html

This page downloads and displays fine in Firefox, etc., but when I try to wget it, I only get a small piece of the page.

On the one hand, it looks like what may be happening is that when I look at the page in the browser, most of the data only loads after a few seconds of waiting -- but wget doesn't seem to wait long enough and closes the download before it is done.
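If it really is a timing problem, I suppose I could try giving wget more generous timeout and retry settings, something like:

   wget --timeout=60 --tries=3 http://www.nyse.com/about/listed/lc_all_name_F.html

though I'm not sure that's actually where the problem is.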

Or is this maybe a "user-agent" issue, where the website I'm trying to download from is trying to discriminate against automated downloads?
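If so, maybe telling wget to identify itself as Firefox would be a way to test that, e.g.:

   wget --user-agent="Mozilla/5.0 (X11; Linux i686; rv:1.9) Gecko Firefox/3.0" http://www.nyse.com/about/listed/lc_all_name_F.html

(the user-agent string above is just a rough approximation of what my browser sends).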

Any help would be appreciated!

best
Dan




