[Bug-wget] Metalink support

From: Hubert Tarasiuk
Subject: [Bug-wget] Metalink support
Date: Wed, 24 Jun 2015 17:06:35 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.0.1

I have prepared a Metalink patch for Wget. The two main features are:

- support Metalink v3 and Metalink v4 XML files,
using the libmetalink parser: https://launchpad.net/libmetalink

- support Metalink in HTTP headers: http://tools.ietf.org/html/rfc6249

Specifically what the patch implements (for both XML and HTTP):
- keep downloading from consecutive resource URLs until a successful
download
- verify SHA256 digest (this digest is mandatory for Metalink, thus it
should be included in all Metalink documents)
- verify OpenPGP signatures (using public keys from user's keyring)
using GPGME and GnuPG

A checksum mismatch counts as a download failure, and Wget proceeds to
try another resource (if available).
If the signature cannot be verified (a missing public key would be the
most common reason), we do NOT assume download failure.
If the signature can be verified and the verification fails (i.e. the data
does not match the signature), we assume download failure.
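
The rules above can be sketched roughly as follows. This is only an
illustration in Python, not the C code from the patch: the names
fetch_first_good, SigStatus, download and check_signature are made up for
the example, and the digest is assumed to be given as a hex string.

```python
import hashlib
from enum import Enum


class SigStatus(Enum):
    VALID = "valid"
    INVALID = "invalid"            # data does not match the signature
    UNVERIFIABLE = "unverifiable"  # e.g. public key missing from the keyring


def fetch_first_good(urls, expected_sha256, download, check_signature=None):
    """Return (url, data) for the first resource that passes, else None.

    `download` and `check_signature` are caller-supplied callables
    (hypothetical helpers, not Wget functions).
    """
    for url in urls:
        try:
            data = download(url)
        except OSError:
            continue  # transfer failed: try the next resource
        if hashlib.sha256(data).hexdigest() != expected_sha256.lower():
            continue  # checksum mismatch counts as a download failure
        if check_signature is not None:
            if check_signature(data) is SigStatus.INVALID:
                continue  # verifiable but wrong signature: failure
            # VALID and UNVERIFIABLE are both accepted
        return url, data
    return None
```

Note that an unverifiable signature is deliberately treated the same as a
valid one here; only a signature that was checked and did not match
rejects the resource.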

Please note that PGP signatures currently work only for
Metalink-over-HTTP, due to a bug in libmetalink.

The following options were added to Wget:

--input-metalink=FILE - download files described in Metalink file FILE
(like --input-file)

--metalink-over-http - when downloading from HTTP URLs:
-> issue a HEAD request and check for Metalink metadata in response
-> if found: switch to Metalink-mode
-> if not found: fall back to ordinary HTTP download
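
To illustrate what "Metalink metadata in response" could look like, here is
a rough Python sketch that scans the headers of a HEAD response. It is not
the patch's code. It assumes the RFC 6249 style of "Link: <url>;
rel=duplicate; pri=N" for mirrors and "Digest: SHA-256=<base64>" for the
checksum, one link per header line, a base64 value that encodes the raw
digest bytes, and that links without "pri" sort last. The function names
are invented for the example.

```python
import base64
import re

LINK_RE = re.compile(r'^\s*<([^>]*)>(.*)$')


def parse_link(value):
    """Split one Link header value into (url, params), or None if malformed."""
    m = LINK_RE.match(value)
    if not m:
        return None
    url, rest = m.groups()
    params = {}
    for part in rest.split(";"):
        if "=" in part:
            key, val = part.split("=", 1)
            params[key.strip().lower()] = val.strip().strip('"')
    return url, params


def metalink_from_headers(headers):
    """Return (mirror_urls_by_priority, sha256_hex), or None to fall back.

    `headers` is a list of (name, value) pairs from the HEAD response.
    """
    mirrors, sha256 = [], None
    for name, value in headers:
        lname = name.lower()
        if lname == "link":
            parsed = parse_link(value)
            if parsed and parsed[1].get("rel") == "duplicate":
                url, params = parsed
                # Assumption: a missing pri means lowest priority.
                mirrors.append((int(params.get("pri", 999999)), url))
        elif lname == "digest":
            for item in value.split(","):
                algo, _, b64 = item.strip().partition("=")
                if algo.upper() == "SHA-256" and b64:
                    sha256 = base64.b64decode(b64).hex()
    if not mirrors or sha256 is None:
        return None  # no usable metadata: ordinary HTTP download
    mirrors.sort(key=lambda entry: entry[0])
    return [url for _, url in mirrors], sha256
```

Returning None corresponds to the "if not found" branch; any other result
would switch the download into Metalink mode with the mirrors tried in
priority order.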

_Test suite_
I have made two modifications to the Python test suite:
- allow multiple SendHeaders with same name by using a Python list as
dictionary value
- do not start the HTTP test in constructor; do it in the begin() method
instead (as the method name would suggest); original behaviour was to
run the test in object constructor and the begin() method would just
return the result
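
As a small illustration of the first change, a dictionary whose values may
be either a single string or a list can be flattened into individual header
lines like this. The helper name expand_headers is hypothetical and is not
taken from the test suite:

```python
def expand_headers(send_headers):
    """Yield (name, value) pairs; a list value yields one pair per item."""
    for name, value in send_headers.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            yield name, v


rules = {
    "Content-Type": "text/plain",
    "Link": [
        "<http://a.example/f>; rel=duplicate; pri=1",
        "<http://b.example/f>; rel=duplicate; pri=2",
    ],
}
for name, value in expand_headers(rules):
    print("%s: %s" % (name, value))
```

This matters for Metalink over HTTP because a response usually carries
several Link headers, which a plain name-to-string dictionary cannot
represent.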

Please let me know what you think about the patches. Some test cases
are included. If you would like to test it on actual servers, here is
what I found:
- Metalink files with PGP signatures: http://curl.haxx.se/download.html
- Metalink in HTTP headers:

The commits are also available via the GitHub interface:


On 28.05.2015 at 00:49, Hubert Tarasiuk wrote:
> I have talked with Giuseppe and he suggested that we might not do TCP
> Fast Open support for FTP at this time (he argued that FTP is slow
> either way :).
> Instead I might focus on implementing some basics of Metalink protocol
> for HTTP and FTP resources in Wget.
> Do you have any thoughts about that?

Attachment: 0001-Metalink-support.patch
Description: Text Data

Attachment: 0002-Only-start-HTTP-test-only-when-calling-begin.patch
Description: Text Data

Attachment: 0003-Test-case-for-Metalink-in-XML.patch
Description: Text Data

Attachment: 0004-Support-multiple-headers-with-same-name-in-Python-te.patch
Description: Text Data

Attachment: 0005-Test-case-for-Metalink-over-HTTP.patch
Description: Text Data

Attachment: 0006-Unit-test-for-find_key_value.patch
Description: Text Data

Attachment: 0007-Unit-test-for-has_key.patch
Description: Text Data

Attachment: 0008-Unit-test-for-find_key_values.patch
Description: Text Data

Attachment: signature.asc
Description: OpenPGP digital signature
