Re: [Bug-wget] Problem downloading pages


From: Giuseppe Scrivano
Subject: Re: [Bug-wget] Problem downloading pages
Date: Tue, 01 Jun 2010 00:06:22 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.50 (gnu/linux)

It can't be done with a single call to wget; you need a script.  This
shell function can help you get the desired PDF file.

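# Download one article PDF from the archive, given its numeric ID.
download_article() {
    local id=$1
    local url="http://archivio.lastampa.it/LaStampaArchivio/servlet/CreaPdf?ID=$id"

    # Re-fetch the page until it contains the string "POST" (presumably
    # the form that replaces the "creation in progress" page).
    until grep -qF "POST" "$id.html" 2>/dev/null; do
        wget -O "$id.html" --keep-session-cookies \
            --save-cookies="cookies.$id" --load-cookies="cookies.$id" \
            "$url"
        sleep 2
    done

    # Send an empty POST with the same session cookies to fetch the PDF.
    wget --post-data="" -O "$id.pdf" --keep-session-cookies \
        --save-cookies="cookies.$id" --load-cookies="cookies.$id" \
        "$url"

    # Clean up the temporary page and cookie jar.
    rm -f "$id.html" "cookies.$id"
}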

# Call the function
download_article 1050435
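
One caveat: the until loop retries forever if the server never returns
the page with the POST form.  Below is a minimal sketch of a bounded
retry, assuming a helper called poll_until (a hypothetical name, not
part of wget) that runs any command up to a fixed number of times.

```shell
# poll_until MAX_TRIES DELAY CMD [ARGS...]
# Hypothetical helper: run CMD until it succeeds, at most MAX_TRIES times,
# sleeping DELAY seconds after each failed attempt.
# Returns 0 as soon as CMD succeeds, 1 if every attempt failed.
poll_until() {
    local max_tries=$1
    local delay=$2
    shift 2

    local tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        # "$@" is the command to test, run in the current shell.
        if "$@"; then
            return 0
        fi
        tries=$((tries + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, if the first wget call and the grep were wrapped in a
function of your own, say fetch_ready (also hypothetical), then
"poll_until 30 2 fetch_ready 1050435" would give up after 30 attempts
instead of looping indefinitely.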


Cheers,
Giuseppe



"Non scrivetemi" <address@hidden> writes:

> Hi,
> could you please tell me how I can download these pages with wget?
>
> http://archivio.lastampa.it/LaStampa...Pdf?ID=1050435
> http://archivio.lastampa.it/LaStampa...Pdf?ID=1050435
> http://archivio.lastampa.it/LaStampa...Pdf?ID=1129534
> .
> .
> .
>
> If I try to download them I get a "Pdf creation in progress" page, not the
> real PDF!


