
[Wp-mirror-list] Download issue with wp-mirror


From: wp mirror
Subject: [Wp-mirror-list] Download issue with wp-mirror
Date: Thu, 21 Mar 2013 18:37:55 -0400

Dear Benjamin,

Good to hear from you again.

0) Hardware.

Currently the disk space required for the `enwiki' images is 2.2T and
growing.  A 2T RAID array is no longer adequate.
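A quick way to check whether a volume still has room is sketched
below.  This is only an illustration: the function name `has_room',
the directory, and the threshold in the commented example are
assumptions, not something WP-MIRROR provides.

```shell
#!/bin/sh
# has_room DIR REQUIRED_KB
# Succeeds if the filesystem holding DIR has at least REQUIRED_KB
# kilobytes (1024-byte blocks) available.  Hypothetical helper.
has_room() {
    # df -P forces one line per filesystem; column 4 is "Available".
    avail_kb=$(df -Pk "$1" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge "$2" ]
}

# Example (assumed image directory, roughly 2.2 TiB expressed in KiB):
# has_room /var/lib/mediawiki/images 2362232013 || echo "not enough space"
```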

1) assert-hdd-write-cache-disabled-p check.

You may disable this check without side effects.  In fact, in the next
release, WP-MIRROR 0.6, it will be user configurable, with the default
setting being to not disable the hdd write caches.

2) Downloading `enwiki-20130304-pages-articles.xml.bz2'

WP-MIRROR 0.5 uses `cURL' to download this file.  Unfortunately, `cURL'
often gives up before downloading the entire file.  The larger the
file, the more likely this failure mode.  `wget' has an automatic
continuation feature that overcomes this problem.  WP-MIRROR 0.6 will
use `wget' for large files.
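Until then, the dump can be fetched by hand with `wget'.  The sketch
below is a minimal example, not the code WP-MIRROR 0.6 will use: the
function names are hypothetical, and the URL layout of the dump server
is an assumption that should be checked against the actual download
location.

```shell
#!/bin/sh
# dump_url WIKI DATE
# Prints the assumed URL of the pages-articles dump, e.g. for
# WIKI=enwiki and DATE=20130304.  The URL layout is an assumption.
dump_url() {
    printf 'https://dumps.wikimedia.org/%s/%s/%s-%s-pages-articles.xml.bz2\n' \
        "$1" "$2" "$1" "$2"
}

# fetch_dump WIKI DATE
# --continue   resumes a partially downloaded file
# --tries=0    retries without limit
# --waitretry  waits up to 30 seconds between retries
fetch_dump() {
    wget --continue --tries=0 --waitretry=30 "$(dump_url "$1" "$2")"
}
```

Running `fetch_dump enwiki 20130304' in the target directory starts
the download; if it is interrupted, running the same command again
picks up from the partial file.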

3) Scraping image file names

The function `fsm-file-wikix' scrapes an `xchunk' for image file
names, and then generates a corresponding `ichunk'.  The error message

     regexp:regexp-exec: argument eof is not a string

indicates that the regular expression matcher was somehow fed an empty
string.  This is a bug.  I will try to fix it in WP-MIRROR 0.6.  You
may work around this bug by executing:

shell> mysql --host=localhost --user=root --password
...
mysql> update wpmirror.file set state='done' where
name='enwiki-20130304-pages-articles-p005974000-c000001000.xml';
Query OK, 1 row affected (0.10 sec)
mysql> quit;

rootshell> wp-mirror --mirror

Sincerely Yours,
Kent


