Re: [Bug-wget] WARC output


From: Giuseppe Scrivano
Subject: Re: [Bug-wget] WARC output
Date: Wed, 10 Aug 2011 10:57:24 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.50 (gnu/linux)

Gijs van Tulder <address@hidden> writes:

> It would be cool if Wget could become one of these tools. Already the
> Swiss army knife for mirroring websites, the one thing that Wget is
> missing is a good way to store these mirrors. The current output of
> --mirror is not sufficient for archival purposes:

Sure we do!



> With some help from others, I've added WARC functions to Wget. With
> the --warc-file option you can specify that the mirror should also be
> written to a WARC archive. Wget will then keep everything, including

Can you please keep track of all contributors?  Any contribution to GNU
Wget requires a copyright assignment to the FSF.



> Do you think this is something that could be included in the main Wget
> version? If that's the case, what should be the next step?

Sure, I will take a look at the code in the next few days.  In the
meantime, can you check whether the new code follows the GNU Coding
Standards[1]?



> The implementation makes use of the open source WARC Tools library
> (Apache License 2.0):
>  http://code.google.com/p/warc-tools/

How much code is really needed from that library?  I wonder if we can
avoid this dependency altogether.

Cheers,
Giuseppe



[1] http://www.gnu.org/prep/standards/


