From: | Gijs van Tulder |
Subject: | [Bug-wget] Standards fix for metadata records in WARC files |
Date: | Fri, 12 Apr 2013 23:49:32 +0200 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130329 Thunderbird/17.0.5 |
This patch repairs two minor problems in the WARC metadata records.

1. Each record should have its own unique WARC-Record-ID, but currently the ID for the record holding the manifest is reused for the record holding the arguments. The patch generates a new ID for the arguments (and refers to the manifest in a WARC-Concurrent-To header).
2. According to the WARC implementation guidelines [1], the manifest should be written to a "metadata" record, but Wget stores it as a "resource" record. The patch corrects this.
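
For illustration, the intended layout of the two records can be sketched roughly as below. This is a simplified Python model, not the C code in the patch: the helper names (new_record_id, warc_record) and the payloads are made up for the example, and the real records carry additional headers that are left out here.

```python
import uuid
from datetime import datetime, timezone


def new_record_id():
    # Hypothetical helper: a fresh WARC-Record-ID in <urn:uuid:...> form.
    return "<{}>".format(uuid.uuid4().urn)


def warc_record(record_type, payload, concurrent_to=None):
    """Build one simplified WARC record and return (record_id, record_bytes)."""
    record_id = new_record_id()  # every record gets its own ID
    headers = [
        ("WARC-Type", record_type),
        ("WARC-Record-ID", record_id),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(payload))),
    ]
    if concurrent_to is not None:
        # Link this record to a related one instead of reusing its ID.
        headers.insert(2, ("WARC-Concurrent-To", concurrent_to))
    head = "WARC/1.0\r\n" + "".join(
        "{}: {}\r\n".format(name, value) for name, value in headers
    ) + "\r\n"
    # A record is the header block, the payload, and two trailing CRLFs.
    return record_id, head.encode("ascii") + payload + b"\r\n\r\n"


# Problem 2: the manifest goes into a "metadata" record, not "resource".
# The payloads below are placeholders, not real Wget output.
manifest_id, manifest_record = warc_record("metadata", b"(manifest contents)\n")

# Problem 1: the arguments record gets a new ID and points at the manifest.
args_id, args_record = warc_record(
    "metadata",
    b"wget --warc-file=example http://example.org/\n",
    concurrent_to=manifest_id,
)
```

The two records end up with different WARC-Record-ID values, and only the arguments record carries a WARC-Concurrent-To header.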
Regards, Gijs

[1] Section 2.4.4 of http://www.netpreserve.org/resources/warc-implementation-guidelines-v1
warc-metadata-standards-fix.patch
Description: Text Data