On Mon, Apr 1, 2024, at 2:04 PM, Russ Allbery wrote:
> "Zack Weinberg" <zack@owlfolio.org> writes:
>> It might indeed be worth thinking about ways to minimize the
>> difference between the tarball "make dist" produces and the tarball
>> "git archive" produces, starting from the same clean git checkout,
>> and also ways to identify and audit those differences.
>
> There is extensive ongoing discussion of this on debian-devel. There's
> no real consensus in that discussion, but I think one useful principle
> that's emerged that doesn't disrupt the world *too* much is that the
> release tarball should differ from the Git tag only in the form of
> added files. Any files that are present in both Git and in the release
> tarball should be byte-for-byte identical.
That dovetails nicely with something I was thinking about myself.
Obviously the result of "make dist" should be reproducible except for
signatures; to the extent it isn't already, those are bugs in automake.
But also, what if "make dist" produced *two* disjoint tarballs? One
would be guaranteed to be byte-for-byte identical to an archive of the
VCS at the release tag (in some clearly documented fashion; AIUI, "git
archive" does *not* do what we want). The other would contain all the files
that "autoreconf -i" or "./bootstrap.sh" or whatever would create, but
nothing else. Diffs could be provided for both tarballs, or only for
the VCS-archive tarball, whichever turns out to be more compact (I can
imagine the diff for the generated-files tarball turning out to be
comparable in size to the generated-files tarball itself).
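
To make the "added files only" rule concrete, here is a rough sketch of
an audit script. Nothing in it comes from automake: the names
member_digests and audit are made up for illustration, and it assumes
both tarballs unpack into a single top-level directory (e.g. pkg-1.0/).
It says nothing about how the VCS-archive tarball should be produced.

```python
import hashlib
import io
import tarfile


def member_digests(tar_bytes, strip_components=1):
    """Map each regular file's path to the SHA-256 of its contents.

    strip_components drops leading path elements, on the assumption
    that each tarball has one top-level directory such as pkg-1.0/.
    """
    digests = {}
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, symlinks, device nodes
            parts = member.name.split("/")[strip_components:]
            if not parts:
                continue
            data = tar.extractfile(member).read()
            digests["/".join(parts)] = hashlib.sha256(data).hexdigest()
    return digests


def audit(vcs_tar, dist_tar):
    """Classify how the dist tarball differs from the VCS archive.

    Under the proposed rule, "modified" and "missing" should both be
    empty, and "added" should hold only generated files.
    """
    vcs = member_digests(vcs_tar)
    dist = member_digests(dist_tar)
    common = set(vcs) & set(dist)
    return {
        "added": sorted(set(dist) - set(vcs)),
        "missing": sorted(set(vcs) - set(dist)),
        "modified": sorted(p for p in common if vcs[p] != dist[p]),
    }
```

The "added" list is also, roughly, the contents of the second
(generated-files) tarball in the two-tarball scheme, so the same
comparison could check that the two tarballs really are disjoint.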