
[Bug-wget] wget


From: Howard Bryden
Subject: [Bug-wget] wget
Date: Fri, 27 Apr 2012 14:25:44 +1000

Folks,

I'm using wget 1.13.4 to attempt to recursively download a SharePoint site.
The command line is just "wget" with no arguments; all options come from
~/.wgetrc, whose contents are:

bnecso02:/root # cat ~/.wgetrc
continue              = on
user                  = desqld\hdbryden
password              = ***
cut_dirs              = 3
no_parent             = on
use_server_timestamps = on
add_hostdir           = off
input                 = ~/wgets
logfile               = /tmp/cso_wget.log
mirror                = on
remove_listing        = on
bnecso02:/root #
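
For readers who prefer to see the options spelled out, the settings above correspond roughly to the command line printed by the sketch below. It is not the command that was actually run. The long option names are the ones documented in the wget manual, PASSWORD is a placeholder, and use_server_timestamps and remove_listing are left out on the assumption that "on" is already wget's default for both.

```shell
#!/bin/sh
# Sketch: print a wget command line roughly equivalent to the ~/.wgetrc above.
# PASSWORD is a placeholder; the real value is not shown in the report.
wgetrc_as_command() {
    printf '%s' "wget --continue --mirror --no-parent --cut-dirs=3"
    # add_hostdir = off maps to --no-host-directories
    printf ' %s' "--no-host-directories"
    # Single quotes in the output keep the backslash in domain\user literal.
    printf ' %s' "--user='desqld\\hdbryden' --password=PASSWORD"
    # input = ~/wgets and logfile = /tmp/cso_wget.log
    printf ' %s' "--input-file=$HOME/wgets"
    printf ' %s\n' "--output-file=/tmp/cso_wget.log"
}

wgetrc_as_command
```

Printing the command rather than running it keeps the sketch harmless; the output can be reviewed and pasted into a shell if it looks right.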

The contents of the input file wgets are:

bnecso02:/root # cat ~/wgets
http://bneapp13/QFRS/Community Safety Operations/BFS/3  South West/
bnecso02:/root #
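
The URL in the input file contains literal spaces. wget normally escapes such characters itself, so this is probably not the cause of the truncation, but percent-encoding the spaces up front removes one variable. A minimal sketch, assuming spaces are the only characters that need attention (the function name is made up for illustration):

```shell
#!/bin/sh
# Sketch: percent-encode spaces in each URL read from standard input.
# Only spaces are handled; every other character is passed through unchanged.
encode_spaces() {
    sed 's/ /%20/g'
}

# Applied to the URL from the input file above (two spaces after "3" are kept
# as two encoded spaces, exactly as they appear in the file).
printf '%s\n' 'http://bneapp13/QFRS/Community Safety Operations/BFS/3  South West/' | encode_spaces
```

The result could be written to a second input file and passed to wget in place of ~/wgets.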


Initially all appeared to work as expected, but it turns out I'm receiving
only a subset of the file space, namely

a) only the first 100 directories are visited, and
b) only the first 100 files from each directory are actually downloaded.

This pretty much corresponds to the Internet Explorer view, which presents the 
site in pages of 100 items (directories and files within directories).
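
If the 100-item limit comes from the paged view, the listing pages that wget saved should contain some kind of "next page" link that it did not follow. The sketch below counts lines in a saved page that mention such a marker. The string "Paged=TRUE" is an assumption about how this SharePoint site builds its paging links, and the function name is hypothetical; the pattern should be adjusted to whatever the saved HTML actually contains.

```shell
#!/bin/sh
# Sketch: count the lines of a saved listing page that mention a paging marker.
# "Paged=TRUE" is an ASSUMED marker for a SharePoint "next page" link.
count_paging_lines() {
    # grep -c prints the number of matching lines; "|| true" hides grep's
    # non-zero exit status when that number is 0.
    grep -c 'Paged=TRUE' "$1" || true
}

# Run against a saved page only when a file name is given on the command line.
if [ -n "${1:-}" ]; then
    count_paging_lines "$1"
fi
```

A non-zero count on a directory's saved index page would suggest the remaining items sit behind a link that the recursive retrieval never requested, for example because it is generated by script rather than a plain href.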


The complete listing of the directory space (as per DIR /S /ON) is attached:
 <<dir.log.gz>> 

The copied directory space on the Unix server is:

 <<dir.txt.gz>> 
From these two listings, we see that

- only 100 out of 224 directories under 3 South West are captured, and
- only 100 out of 249 files in the subdirectory 3 South West/20090178/Assess 
were copied across.
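
The comparison between the two listings can be repeated mechanically. The sketch below assumes both listings have first been reduced to one relative path per line (the attached DIR output would need reformatting to get there); the function name and temporary file names are hypothetical.

```shell
#!/bin/sh
# Sketch: print the entries that appear in the full listing ($1) but not in
# the copied listing ($2). Both files must hold one path per line.
missing_entries() {
    full_sorted="${TMPDIR:-/tmp}/missing_entries.$$.full"
    copy_sorted="${TMPDIR:-/tmp}/missing_entries.$$.copy"
    sort -u "$1" > "$full_sorted"
    sort -u "$2" > "$copy_sorted"
    # comm -23 keeps only the lines unique to the first (full) listing.
    comm -23 "$full_sorted" "$copy_sorted"
    rm -f "$full_sorted" "$copy_sorted"
}

# Shortfall implied by the figures quoted above.
echo "directories missing: $((224 - 100))"
echo "files missing in Assess: $((249 - 100))"
```

With the counts from the report, 124 directories and 149 files in the Assess subdirectory are unaccounted for.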


The debug output (wget -d) is also attached, but I'm not sure it sheds a whole 
lot of light on the behaviour:

 <<cso_wget.log.gz>> 
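
One way to get something out of the log is to list the distinct URLs wget actually requested and compare that count with the expected number of directories and files. The sketch below assumes the usual wget log layout, in which each request starts with a line of the form "--YYYY-MM-DD HH:MM:SS--  URL"; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: print the distinct URLs requested, read from a wget log on stdin.
# ASSUMES request lines look like "--2012-04-27 14:25:44--  http://host/path".
requested_urls() {
    # Strip the timestamp prefix from request lines, drop everything else,
    # then collapse repeats (retries and re-requests of the same URL).
    sed -n 's/^--[0-9-]* [0-9:]*--  *\(.*\)$/\1/p' | sort -u
}

# Run against a gzipped log only when a file name is given on the command line.
if [ -n "${1:-}" ]; then
    gzip -dc "$1" | requested_urls | wc -l
fi
```

If the number of distinct URLs is close to 100 per directory, wget never saw links to the remaining items, which points at the pages served rather than at a download failure.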

Environment details:

uname -r -s:            HP-UX 11.31, March 2011 DCOE
uname -m:               ia64


wget details:

Package name:     wget
Version number:   1.13.4
Original author:  Hrvoje Niksic <address@hidden>
Original URL:     ftp://ftp.mirrorservice.org/pub/gnu/wget/
HP-UX URL:        http://hpux.connect.org.uk/hppd/cgi-bin/search?package=&term=/wget-
License:          GNU General Public License v3
Languages:        C
Build-time deps:  db expat gdbm gettext gmp gnutls libgcrypt libgpg_error
                  libiconv libidn libtasn1 lzo make nettle openssl p11_kit
                  perl readline termcap zlib
Run-time deps:    db expat gdbm gettext gmp gnutls libgcrypt libgpg_error
                  libiconv libidn libtasn1 lzo nettle openssl p11_kit perl
                  readline termcap zlib
Install tree:     /usr/local
Report bugs to:   address@hidden
Tested on:        HP rp3410 running HP-UX 11.11 and 11.23,
                  HP rx2600 running HP-UX 11.23,
                  HP rp3440 running HP-UX 11.31 and
                  HP rx2620 running HP-UX 11.31
Compilers used:   PA-RISC - B.11.11.20 (HP C)
                  Itanium - A.06.25.02 (HP C)
LDOPTS setting:   export LDOPTS="+s -L/usr/local/lib -L/usr/lib"


                                                                          
Rocket J. Squirrel: "... we're going to have to think!"
Bullwinkle J. Moose: "There must be an easier way than that."

HOWARD BRYDEN
                                                                           
Senior Unix Administrator
Data Centre
Information and Communication Systems 
Corporate Support Division
Department of Community Safety
                                                                           
PHONE: 07 3635 3087 
POSTAL: GPO Box 1425, Brisbane, QLD 4001 | EMAIL: address@hidden
Please consider the environment before printing this email - then print it

Attachment: dir.log.gz
Description: dir.log.gz

Attachment: dir.txt.gz
Description: dir.txt.gz

Attachment: cso_wget.log.gz
Description: cso_wget.log.gz

