[bug#33432] On tags
From: Ludovic Courtès
Subject: [bug#33432] On tags
Date: Wed, 21 Nov 2018 11:15:52 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)

Hello,
Ludovic Courtès <address@hidden> skribis:
> When downloading over SWH, the ‘swh-download’ procedure first resolves
> the tag (if it’s a tag), then tries to download the corresponding tarball
Speaking of tags, it’s not news but tags are bad from a reproducibility
standpoint: they are mutable and per-repository. Tag lookup is
necessarily relative to a repository URL (and to a snapshot of the
repository, since it can be mutated):
  scheme@(guile-user)> (lookup-origin-revision
                        "https://git.savannah.gnu.org/git/guix.git" "v0.15.0")
  $5 = #<<revision> id: "359fdda40f754bbf1b5dc261e7427b75463b59be" date: #<date
  nanosecond: 0 second: 39 minute: 16 hour: 22 day: 5 month: 7 year: 2018
  zone-offset: 7200> directory: "27c69c5d298a43096a53affbf881e7b13f17bdcd"
  directory-url: "/api/1/directory/27c69c5d298a43096a53affbf881e7b13f17bdcd/">
So if, say, SWH archived a mirror of
<https://git.savannah.gnu.org/git/guix.git> but not
<https://git.savannah.gnu.org/git/guix.git> itself, then tag lookup will
fail, which is sad given that the code is actually there.
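To make the failure mode concrete, here is a toy model in Python. It is not the SWH API: the two dictionaries, the helper names, and the mirror URL are all made up for illustration. It only assumes that tags hang off a per-origin snapshot while revisions are keyed by their own hash.

```python
# Toy model of an archive; NOT the Software Heritage API.
# Tags live in per-origin snapshots; revisions are keyed by content hash.

REVISION = "359fdda40f754bbf1b5dc261e7427b75463b59be"  # id from the REPL output
UPSTREAM = "https://git.savannah.gnu.org/git/guix.git"
MIRROR = "https://example.org/mirror/guix.git"         # hypothetical mirror URL

# Only the mirror was archived, so only its snapshot knows the tag.
snapshots = {MIRROR: {"v0.15.0": REVISION}}

# Revisions are global: the key is the revision's own hash, not its origin.
revisions = {REVISION: {"directory": "27c69c5d298a43096a53affbf881e7b13f17bdcd"}}


def lookup_by_tag(origin, tag):
    """Return the revision ID that TAG points to at ORIGIN, or None."""
    return snapshots.get(origin, {}).get(tag)


def lookup_by_id(revision_id):
    """Return the revision record for REVISION_ID, or None."""
    return revisions.get(revision_id)


print(lookup_by_tag(UPSTREAM, "v0.15.0"))  # None: upstream was never archived
print(lookup_by_tag(MIRROR, "v0.15.0"))    # found, but only via the mirror URL
print(lookup_by_id(REVISION))              # found regardless of origin
```

The tag lookup depends on knowing which URL happened to be archived, whereas the lookup by ID does not.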
To address this, possible options include:
  1. Always store commit IDs rather than tags, effectively giving us
     “normal” Git content-addressability.  This is not great for
     code readability and review though.

  2. Store ‘sha1_git’ hashes (SHA1s of Git trees) instead of or in
     addition to nar sha256 hashes so we can perform lookups by content
     hash on SWH or Git mirrors.

#2 might be the best long-term option, though it would require daemon
support to compute, store, and check these Git-style hashes.
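For reference, a Git-style tree hash is cheap to compute. The Python sketch below follows Git's object format (a "<type> <size>\0" header followed by the payload, hashed with SHA-1). It is an illustration only, not what the daemon would do: the function names are made up, the directory is an in-memory dict, and every file is assumed to be a regular non-executable file (mode 100644), so symlinks and executables are not handled.

```python
import hashlib


def git_object_id(kind, payload):
    """Hash PAYLOAD as a Git object of type KIND ("blob" or "tree")."""
    header = f"{kind} {len(payload)}\0".encode()
    return hashlib.sha1(header + payload).hexdigest()


def blob_id(content):
    """Return the Git blob ID of CONTENT (bytes)."""
    return git_object_id("blob", content)


def tree_id(entries):
    """Return the Git tree ID of ENTRIES, a dict mapping each name to
    bytes (a regular file) or to a nested dict (a sub-directory)."""
    def sort_key(item):
        name, value = item
        # Git sorts directories as if their name had a trailing slash.
        return (name + "/" if isinstance(value, dict) else name).encode()

    payload = b""
    for name, value in sorted(entries.items(), key=sort_key):
        if isinstance(value, dict):
            mode, oid = b"40000", tree_id(value)
        else:
            mode, oid = b"100644", blob_id(value)  # assumption: plain files only
        # Each entry is "<mode> <name>\0" followed by the raw 20-byte hash.
        payload += mode + b" " + name.encode() + b"\0" + bytes.fromhex(oid)
    return git_object_id("tree", payload)


print(blob_id(b""))   # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(tree_id({}))    # 4b825dc642cb6eb9a060e54bf8d69288fbee4904
```

The two printed values are Git's well-known IDs for the empty blob and the empty tree. Since the hash depends only on names, modes, and contents, the same source tree yields the same ID on any mirror.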
Ludo’.