guix-devel

Re: Disarchive update


From: zimoun
Subject: Re: Disarchive update
Date: Tue, 12 Oct 2021 11:19:18 +0200

Hi Ludo,

On Sat, 09 Oct 2021 at 12:05, Ludovic Courtès <ludovic.courtes@inria.fr> wrote:

> If you run:
>
>   guix build /gnu/store/nnl67m8c2x9rwqbnych1agc6p7g5473g-disarchive-collection.drv

Oh, cool!

> and if you’re patient :-), you eventually get a 579 MB directory
> containing Disarchive metadata for 8,413 tarballs out of 9,113 (the
> missing tarballs are those that “disarchive disassemble” fails to
> handle, for instance because it couldn’t guess what compression method
> is being used.)

Timothy made this table months ago:

        tar+gz        9090  52.0%
        git           5294  30.3%
        tar+xz        1184  06.8%
        tar+bz2        775  04.4%
        tar            393  02.2%
        zip            273  01.6%
        svn-multi      175  01.0%
        svn            125  00.7%
        file            51  00.3%
        computed        38  00.2%
        hg              36  00.2%
        unknown-uri     20  00.1%
        tar+gz?         15  00.1%
        tar+lz          13  00.1%
        tar+Z            4  00.0%
        cvs              3  00.0%
        bzr              3  00.0%
        tar+lzma         2  00.0%
        total        17494 100.0%

What is really missing is XZ and Bzip2 support in Disarchive, I guess.
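
To put a number on that, here is a quick sketch using only the counts
from the table above.  The helper name `share` is made up for the
example; the result is the share of all sources, not of tarballs only.

```shell
# Counts copied from the table above.
total=17494
tar_xz=1184
tar_bz2=775

# share COUNT TOTAL: print COUNT as a percentage of TOTAL, one decimal.
share() {
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.1f%%\n", 100 * n / t }'
}

# Combined share of tar+xz and tar+bz2 among all sources.
share "$((tar_xz + tar_bz2))" "$total"    # prints 11.2%
```

The same helper gives the coverage you mention: `share 8413 9113`
prints 92.3%.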


> Where to go from here?  Timothy Sample had already set up a Disarchive
> database at <https://disarchive.ngyro.com>, which (guix download) uses
> as a fallback; I’m not sure exactly how it’s populated.  The goal here
> would be for the Guix project to set up infrastructure populating a
> database automatically and creating backups, possibly via SWH (we’ll
> have to discuss it with them).

Timothy was working on feeding the database from each release.  Well,
you can have a look at:

<https://git.ngyro.com/preservation-of-guix>

Then something along these lines:

    $ sqlite3 /tmp/pog.db < schema.sql
    $ guix repl -L . <(echo '
          (use-modules (pog))
          (ingest "6298c3ffd9654d3231a6f25390b056483e8f407c"
                  "/tmp/pog.db")
      ')

where the commit hash corresponds to v1.0.0.  I do not know whether it
would be equivalent to run:

   guix time-machine --commit=6298c3ffd9654d3231a6f25390b056483e8f407c \
        -- build -m etc/disarchive-manifest.scm


> A plan we can already deploy would be:
>
>   1. Add the disarchive.guix.gnu.org DNS entry, pointing to berlin.
>
>   2. On berlin, add an mcron job that periodically copies the output of
>      the latest “disarchive-collection” build to a directory, say
>      /srv/disarchive.  Thus, the database would accumulate tarball
>      metadata over time.
>
>   3. Add an nginx route so that /srv/disarchive is served at
>      https://disarchive.guix.gnu.org.
>
>   4. Add disarchive.guix.gnu.org to (guix download).
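
Step 2 could be sketched roughly as below.  This is only a guess at
what the mcron job would run: I assume "accumulate" means copying new
files and never overwriting or deleting what is already there, and that
the build output contains only regular files.  The function name
`accumulate` is made up, and the mcron job definition itself is left
out.

```shell
# accumulate SRC DEST: copy every regular file under SRC into DEST,
# leaving files that already exist in DEST untouched.
accumulate() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # List files relative to SRC so the same layout is recreated in DEST.
    (cd "$src" && find . -type f) | while IFS= read -r f; do
        # Skip anything a previous run already copied.
        [ -e "$dest/$f" ] && continue
        mkdir -p "$dest/$(dirname "$f")"
        cp "$src/$f" "$dest/$f"
    done
}
```

The job would then call something like `accumulate OUTPUT
/srv/disarchive`, where OUTPUT is the store directory of the latest
"disarchive-collection" build.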

To replace (or add to) the current ‘%disarchive-mirrors’, right?

Going down this road (using Cuirass), why not generate sources.json
the same way, instead of the hack using the website builder?


On my side, I will try to resume what I started months ago: measuring
the SWH coverage.  For instance, of these ~92% of tarballs, how many are
currently stored in SWH?  Well, do not hold your breath, and I would be
happy if someone beats me to it. ;-)


Cheers,
simon


