lilypond-devel

Re: Using 'libfaketime' for reproducible builds


From: Jonas Hahnfeld
Subject: Re: Using 'libfaketime' for reproducible builds
Date: Mon, 28 Dec 2020 10:25:51 +0100
User-agent: Evolution 3.38.2

On Sunday, 27.12.2020 at 22:24 +0100, Werner LEMBERG wrote:
> > Intercepting syscalls (or whatever the library does, I didn't
> > check) doesn't sound like the right approach outside of testing
> > reproducibility.
> 
> Why?  It's even less intrusive than the `SOURCE_DATE_EPOCH` solution.

I definitely consider intercepting various syscalls by means of
LD_PRELOAD more intrusive than setting a single environment variable
that was invented for the purpose of setting timestamps. Just think of
a shiny new syscall that adds a new source of non-reproducibility: the
preloaded library would not intercept it until it learns about it.
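
To illustrate how small the environment-variable interface is, here is a
rough Python sketch of how a tool could honour SOURCE_DATE_EPOCH. The
function name `build_timestamp` is made up for this example and is not
taken from LilyPond or Ghostscript.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=None):
    """Return the timestamp to embed in generated output, in UTC.

    Uses SOURCE_DATE_EPOCH (seconds since the Unix epoch) when it is set,
    and falls back to the current time otherwise.
    """
    if environ is None:
        environ = os.environ
    raw = environ.get("SOURCE_DATE_EPOCH")
    seconds = int(raw) if raw is not None else int(time.time())
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

Everything that writes a date into the output would call this one helper
instead of asking the system clock directly.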

> > The larger "issue" with this topic seems to be LilyPond's
> > dependencies, in particular Ghostscript.  A contribution to add
> > support for above variable was closed as WONTFIX:
> > https://bugs.ghostscript.com/show_bug.cgi?id=696765
> 
> 
> Exactly.  In particular it means that we would have to use the patched
> Debian version of ghostscript for reproducibility if we go the
> `SOURCE_DATE_EPOCH` route – and check which other distributions
> provide something similar.  I consider this a very hacky
> solution. On the other hand, intercepting the time syscalls is a
> completely transparent and clean solution.
> 
> BTW, the next version of 'libfaketime' will allow intercepting
> `getrandom`, which means that we probably can 'fix' the `/ID` issue
> in PDF files generated by gs, too.
> 
> > I think that's a pity, but nothing we can change as a
> > "consumer" of library functions.
> 
> Exactly.  As long as we don't change LilyPond to produce PDFs by
> itself – which is a huge undertaking that I certainly won't start – 
> I think we have no other choice than using something like
> 'libfaketime' or a patched gs version.  I definitely prefer the
> former.

What I wanted to say is that we cannot make the Ghostscript developers
change their minds about supporting the environment variable. But we
can (and IMHO should) use all available interfaces if we care about
reproducibility. I see at least two more options:

1) Strip non-determinism from the generated PDF. This is even mentioned
at https://reproducible-builds.org/docs/timestamps/ - before the
paragraph on libfaketime, which spends more than half of its text on
possible issues.

2) As we control the input PS code, we don't have to worry about the
operators that get the current time, draw a random number, etc. (as
long as we don't use them ourselves). Instead, the bug linked above says
we just need to tell GS which CreationDate and ModDate to use (via
pdfmarks), and it should be straightforward to fill these with values
derived from SOURCE_DATE_EPOCH.
This probably leaves the UUIDs (is that the issue you mention above?),
which can be overridden using -sDocumentUUID and -sInstanceUUID.
Setting a constant time using libfaketime will result in the same UUID
for all generated PDFs, so overriding them can't make things worse; but
I think it would be desirable to do better than that and compute a
"unique" ID based on the input file, maybe as simple as the hash of the
file path. One caveat is that passing different values per file will
prevent reuse of the GS API instance, but I'd argue that a constant
value should be fine in that case.
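
A rough Python sketch of option 2, only to show the shape of the idea.
All function names here are hypothetical. The DOCINFO pdfmark and the
two -s switches are the interfaces mentioned above; that Ghostscript
accepts a plain UUID string as their value is an assumption I have not
checked.

```python
import uuid
from datetime import datetime, timezone


def pdf_date(epoch_seconds):
    """Format a Unix timestamp as a PDF date string in UTC."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("D:%Y%m%d%H%M%SZ")


def docinfo_pdfmark(epoch_seconds):
    """Build a DOCINFO pdfmark that pins CreationDate and ModDate."""
    date = pdf_date(epoch_seconds)
    return f"[ /CreationDate ({date}) /ModDate ({date}) /DOCINFO pdfmark"


def stable_uuid(input_path):
    """Derive a reproducible ID from the input file path (SHA-1 based)."""
    return str(uuid.uuid5(uuid.NAMESPACE_URL, input_path))


def gs_uuid_flags(input_path):
    """Command-line switches overriding both UUIDs with the stable value.

    Assumption: the switches take the UUID in its usual textual form.
    """
    value = stable_uuid(input_path)
    return [f"-sDocumentUUID={value}", f"-sInstanceUUID={value}"]
```

The pdfmark line would be written into the generated PS code with the
value of SOURCE_DATE_EPOCH. Hashing a path relative to the source tree,
rather than an absolute one, would keep the ID independent of where the
build runs.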

Jonas
