Re: [Bug-wget] downloading links in a dynamic site
From: Keisial
Subject: Re: [Bug-wget] downloading links in a dynamic site
Date: Mon, 26 Jul 2010 20:18:27 +0200
User-agent: Thunderbird

Vinh Nguyen wrote:
> Dear list,
>
> My goal is to download some pdf files from a dynamic site (not sure on
> the terminology). For example, I would execute:
>
> wget -U firefox -r -l1 -nd -e robots=off -A '*.pdf,*.pdf.*'
> http://site.com/?sortorder=asc&p_o=0
>
> and would get my 10 pdf files. On the page I can click a "Next" link
> (to have more files), and I execute:
>
> wget -U firefox -r -l1 -nd -e robots=off -A '*.pdf,*.pdf.*'
> http://site.com/?sortorder=asc&p_o=10
>
> However, the downloaded files are identical to the previous. I tried
> the cookies setting and referer setting:
>
> wget -U firefox --cookies=on --keep-session-cookies
> --save-cookies=cookie.txt -r -l1 -nd -e robots=off -A '*.pdf,*.pdf.*'
> http://site.com/?sortorder=asc&p_o=0
> wget -U firefox --referer='http://site.com/?sortorder=asc&p_o=0'
> --cookies=on --load-cookies=cookie.txt --keep-session-cookies
> --save-cookies=cookie.txt -r -l1 -nd -e robots=off -A '*.pdf,*.pdf.*'
> http://site.com/?sortorder=asc&p_o=10
>
> but the results again are identical. Any suggestions?
>
> Thanks.
> Vinh

Look at the page source to see how they are generating the URLs.
Maybe they are using some ugly JavaScript, although that would defeat
the benefit of paging...
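
A rough sketch of how that inspection could be done from the command line, assuming a POSIX shell and a grep that supports -o. The helper name list_links is made up for illustration, and site.com is just the placeholder from the original message. It prints the href targets found in the HTML, so you can check whether the PDF links and the "Next" link are really in the page or are built by a script.

```shell
# list_links: hypothetical helper (not part of wget).
# Reads HTML on stdin and prints the value of every href="..." attribute,
# one per line, so the generated URLs can be inspected by eye.
list_links() {
    grep -o 'href="[^"]*"' | sed -e 's/^href="//' -e 's/"$//'
}

# Usage, assuming the placeholder site from the original message.
# The URL is single-quoted so the shell passes '&p_o=10' to wget
# instead of treating '&' as a control operator:
#   wget -q -U firefox -O - 'http://site.com/?sortorder=asc&p_o=10' | list_links
```

The quoting matters here: an unquoted & ends the command and runs it in the background, so wget would only see the URL up to "sortorder=asc" and p_o would never reach the server. If the commands were typed without quotes, that alone could explain getting the same files for both pages. If the links do not show up in this output at all, they are probably generated by JavaScript, which wget does not execute.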