[Wp-mirror-list] Download issue with wp-mirror
From: wp mirror
Subject: [Wp-mirror-list] Download issue with wp-mirror
Date: Thu, 21 Mar 2013 18:37:55 -0400
Dear Benjamin,
Good to read from you again.
0) Hardware.
Currently the disk space required for the `enwiki' images is 2.2T and
growing. A 2T RAID array is no longer adequate.
1) assert-hdd-write-cache-disabled-p check.
You may disable this without collateral issues. In fact, in the next
release, WP-MIRROR 0.6, it will be user configurable, with the default
setting being to not disable the hdd write caches.
2) Downloading `enwiki-20130304-pages-articles.xml.bz2'
WP-MIRROR 0.5 uses `cURL' to download this file. Unfortunately, `cURL'
often gives up before downloading the entire file. The larger the
file, the more likely this failure mode. `wget' has an automatic
continuation feature that overcomes this problem. WP-MIRROR 0.6 will
use `wget' for large files.
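In the meantime, the file can be fetched by hand. What follows is only a minimal sketch of the continuation idea, not how WP-MIRROR itself invokes `wget'. The function name `fetch_with_resume', the retry count, the delay, and the `WGET' and `RETRY_DELAY' variables are all hypothetical; only the `--continue' and `--tries' options belong to `wget'.

```shell
#!/bin/sh
# Hypothetical helper (not part of WP-MIRROR): keep resuming a large
# download until wget reports success or the attempts run out.
#   $1 = URL to download
#   $2 = maximum number of attempts (assumed default: 5)
fetch_with_resume() {
    url=$1
    max_attempts=${2:-5}
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # --continue resumes a partially downloaded file instead of
        # starting over; --tries bounds wget's own internal retries.
        if ${WGET:-wget} --continue --tries=3 "$url"; then
            return 0
        fi
        attempt=$((attempt + 1))
        # Pause before the next attempt (assumed default: 10 seconds).
        sleep "${RETRY_DELAY:-10}"
    done
    return 1
}
```

You would call it with the URL of the dump file as the first argument, from the directory where the partial download lives, so that `wget' can find the file and continue it.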
3) Scraping image file names
The function `fsm-file-wikix' scrapes an `xchunk' for image file
names, and then generates a corresponding `ichunk'. The error message
    regexp:regexp-exec: argument eof is not a string
indicates that the regular expression matcher was somehow fed an empty
string. This is a bug. I will try to fix it in WP-MIRROR 0.6. You
may work around this bug by marking the affected file as done:
shell> mysql --host=localhost --user=root --password
...
mysql> update wpmirror.file set state='done' where
       name='enwiki-20130304-pages-articles-p005974000-c000001000.xml';
Query OK, 1 row affected (0.10 sec)
mysql> down;
rootshell> wp-mirror --mirror
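If the same error shows up for further chunks, the statement above can be generated rather than retyped. This is only a sketch: the function name `mark_done_sql' is hypothetical, and it assumes the file name contains no single quotes, since nothing is escaped.

```shell
#!/bin/sh
# Hypothetical helper (not part of WP-MIRROR): print the UPDATE
# statement that marks one file as done in wpmirror.file.
#   $1 = file name as stored in the `name' column
# Assumption: the name contains no single quotes (no escaping is done).
mark_done_sql() {
    printf "update wpmirror.file set state='done' where name='%s';\n" "$1"
}
```

The output can be reviewed first and then piped into the same `mysql' invocation shown above.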
Sincerely Yours,
Kent