Re: Conda environments and reproducibility
From: Konrad Hinsen
Subject: Re: Conda environments and reproducibility
Date: Tue, 29 Nov 2022 14:39:46 +0100
Hi Hugo,
Buddelmeijer <hugo@buddelmeijer.nl> writes:
> Hi Konrad, Thibault and others,
>
> Konrad, is it perhaps possible for you to dig up this broken conda
> environment file?
Yes:
https://gist.github.com/brospars/4671d9013f0d99e1c961482dab533c57
That environment was set up in 2018 on a Linux machine, and then tested
under macOS and Windows as well. It broke in early 2019.
> First, just like you all, my conclusion is that guix is the answer. The
> last two paragraphs by Simon capture it succinctly. However, conda seems
> to work fine for most people. It would therefore be instructive to have
> concrete 'failure stories' in order to show people that conda is not enough.
I have heard many stories of conda failing long-term, i.e. environments
not being reproducible after a year or two. Most use cases are probably
more short-term.
> It doesn't seem common to overwrite conda binaries. Conda takes some (not
> enough?) measures to prevent the scenario Konrad describes. In particular,
> the filenames include a 'hash' since conda 3 (~2014) [1]:
Weird. We worked with official Miniconda downloads from early 2018, and
our environment files contain no hashes.
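
For anyone who wants to check their own environment files, here is a
minimal sketch in Python. It assumes the dependencies use conda's
"name=version=build" form, where the build string is the optional third
field. The function names and the example list are hypothetical, and
only "="-style pins are handled. It checks the form of the pins only,
not whether the named packages can still be downloaded.

```python
def pin_level(spec: str) -> str:
    """Classify a conda dependency string by how tightly it is pinned.

    Simplification: only "="-style pins are recognized; range operators
    such as ">=" are not treated specially.
    """
    # Drop an optional "channel::" prefix, e.g. "conda-forge::numpy=1.14".
    spec = spec.split("::", 1)[-1].strip()
    # Ignore empty fields so that "numpy==1.14" counts as two fields.
    fields = [field for field in spec.split("=") if field]
    if len(fields) <= 1:
        return "name only"
    if len(fields) == 2:
        return "version"
    return "version+build"


def loosely_pinned(specs: list[str]) -> list[str]:
    """Return the dependency strings that lack a build string."""
    return [spec for spec in specs if pin_level(spec) != "version+build"]


if __name__ == "__main__":
    # Hypothetical dependency list, not taken from any real environment.
    example = ["python=3.6.4=hc3d631a_1", "numpy=1.14", "scipy"]
    for spec in loosely_pinned(example):
        print(f"{spec}: {pin_level(spec)}")
```

An empty result from loosely_pinned() means every entry names a
specific build; any entry it returns leaves the choice of build to the
solver at install time.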
> My realization was that improving these hashes is a goose chase and will
> ultimately lead to horrific things like "turing-complete yaml files". And
> at that point it is clear, at least to me, that guix is the answer.
Indeed. Turing-complete Scheme files :-)
My conclusion so far is that conda can never attain long-term
reproducibility, because it wants to be multi-platform. And that means
that it doesn't control the foundations on which it has to build.
From a user's point of view, a big problem with conda is the opacity of
the machinery, which in addition changes all the time as you say. With
Guix, I can understand how everything is built, and thus understand the
potential obstacles to a rebuild many years later. With conda, I don't
really know and my understanding is that the build machinery is not
even completely public (for Anaconda at least).
> One thing that conda (or actually conda-forge) does well are their bots.
> I'm a maintainer of some conda packages and once a month or so I get a
> fully automated pull request to update my package [4], e.g. when the
> upstream package is updated, or when a dependency is updated. They even
That's nice!
> packages, such as compilers. This makes maintaining conda-forge packages a
> breeze. Having such bots also within the guix-ecosystem would probably help
> attract developers.
Indeed. More generally, I think package managers should do a better job
in reaching out to upstream maintainers. They are our allies in
providing a better UX.
Cheers,
Konrad
--
---------------------------------------------------------------------
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: konrad DOT hinsen AT cnrs DOT fr
http://dirac.cnrs-orleans.fr/~hinsen/
ORCID: https://orcid.org/0000-0003-0330-9428
Twitter: @khinsen
---------------------------------------------------------------------