Reproducibility for Python and beyond
From: Simon TOURNIER
Subject: Reproducibility for Python and beyond
Date: Fri, 14 Apr 2023 10:43:37 +0000
Hi Konrad, all,
For French speakers, here is an interesting presentation by Konrad on the state
of Python for scientific computing and reproducibility:
https://reproducibility.gricad-pages.univ-grenoble-alpes.fr/web/presentation_110423.html#presentation_110423
Without watching the video, here are the questions I would like to discuss. :-)
1. Considering Konrad's schema of a scientific computation (Model
--technical choices--> Code --computational env--> Results), there are also
technical choices about the computational environment, but they are implicit,
and often impossible to scrutinize because of the lack of transparency. The
key, IMHO, is not the determinism of the computation but its transparency.
Determinism is one means of obtaining transparency, but it is not the only
one. For instance, determinism is not affordable for very intensive
computations, which are not feasible to repeat. How should we think about
determinism in the statistical training of machine learning models? In other
words, in some cases the "compilation" (Code -> Results) of the scientific
model is too costly.
2. "Redoing" a computation is only possible when the citation is correct.
Inria is proposing something along these lines <https://hal.science/hal-02135891> with the
BibLaTeX style
<https://mirrors.ircam.fr/pub/CTAN/macros/latex/contrib/biblatex-contrib/biblatex-software/software-biblatex.pdf>.
However, this only captures, at best, some of the technical choices made when
implementing the model, and it does not capture the complete computational
environment at all. What are your ideas for tackling this citation issue?
For instance, the output of "guix describe -f channels" is one means of
capturing (and citing too!) a computational environment. Do we need to make it
more popular? How can we link it with the archiving of source code (relying
on SWH, say)?
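
To make the "guix describe -f channels" idea concrete, here is a minimal
Python sketch of how the channel description could be saved next to a paper's
source and replayed later. The function names and the "channels.scm" file name
are hypothetical, chosen only for illustration; the sketch assumes Guix is
installed wherever capture_channels() is actually called, and it does not
address the SWH archiving question.

```python
import subprocess
from pathlib import Path


def describe_command():
    """Return the command that prints the current channels as a channel spec."""
    return ["guix", "describe", "--format=channels"]


def time_machine_command(channels_file, *guix_args):
    """Return the command that runs `guix` at the revisions pinned in channels_file."""
    return ["guix", "time-machine", "-C", str(channels_file), "--", *guix_args]


def capture_channels(target="channels.scm"):
    """Save the output of `guix describe -f channels` to target (requires Guix)."""
    result = subprocess.run(
        describe_command(), check=True, capture_output=True, text=True
    )
    path = Path(target)
    path.write_text(result.stdout)
    return path
```

The saved file is what one would cite or archive alongside the code; passing
it to time_machine_command("channels.scm", "shell", "python") builds the
command line that re-enters the same Guix revisions later.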
Cheers,
simon