Re: [Bug-wget] Implementation suggestion for JavaScript execution


From: Giuseppe Scrivano
Subject: Re: [Bug-wget] Implementation suggestion for JavaScript execution
Date: Mon, 26 May 2014 15:20:19 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

Andrew Pennebaker <address@hidden> writes:

> Tumblr and other websites delay loading some of their content (images)
> through JavaScript events like *onload*. It would be nice if wget supported
> a *-j* flag for executing this, in order to access these dynamically loaded
> resources. Execution may add some time to downloads, but for users that
> really want the content, having the option is better than not.
>
> Possible solutions:
>
> The HtmlUnit <http://htmlunit.sourceforge.net/> library can already do
> this, but it's written in Java and I believe wget is written in C?

Correct, wget is written in C.


> Another consideration for attaching JS execution to wget is Node
> <http://nodejs.org/>, a C++ implementation, though we probably only want
> the core, the V8 <https://code.google.com/p/v8/> JavaScript engine itself.
>
> Other possibilities include SpiderMonkey
> <http://en.wikipedia.org/wiki/SpiderMonkey_(JavaScript_engine)>, the JS
> engine for Firefox, or JavaScriptCore
> <http://www.webkit.org/projects/javascript/>, Safari's JS engine.

How would you programmatically retrieve these links?  By triggering
"onload" or other events?  I wonder how many of these cases we could
cover by simply parsing patterns like document.location='foo', without
involving any JS engine.
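
To make the engine-free idea concrete, here is a rough sketch of scanning
page text for quoted string literals assigned to document.location and its
common variants.  It is written in Python for brevity, not as wget code;
the function name, the regular expression and the set of recognised
prefixes (window, document, self, top, optional .href) are all assumptions
made for illustration.  It only catches literal assignments, so anything
built by string concatenation or set from an event handler defined
elsewhere would still be missed.

```python
import re
from urllib.parse import urljoin

# Hypothetical pattern (an assumption, not taken from wget): matches
#   document.location = 'foo', window.location.href = "bar", location='baz'
# The lookbehind rejects names such as obj.location or mylocation, and
# "=(?!=)" rejects comparisons like location == 'foo'.
_LOCATION_ASSIGN = re.compile(
    r"""(?<![\w.$])
        (?:(?:window|document|self|top)\s*\.\s*)*
        location
        (?:\s*\.\s*href)?
        \s*=(?!=)\s*
        (?P<quote>['"])(?P<url>[^'"\\\n]+)(?P=quote)""",
    re.VERBOSE,
)


def extract_js_redirects(text, base_url):
    """Return absolute URLs assigned to location, in order, without duplicates."""
    found = []
    for match in _LOCATION_ASSIGN.finditer(text):
        # Resolve relative targets against the page URL, as a crawler would.
        url = urljoin(base_url, match.group("url").strip())
        if url not in found:
            found.append(url)
    return found


if __name__ == "__main__":
    page = """<body onload="document.location='foo'">
    <script>window.location.href = "/bar.html";</script>"""
    for link in extract_js_redirects(page, "http://example.com/dir/page.html"):
        print(link)
```

Run against a sample of real pages, something along these lines could give
a first estimate of how many dynamically loaded links are reachable
without a JS engine.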

Giuseppe
