Re: [PATCH] Add draft post "CRAN, a practical example for being reproduc
From: Ludovic Courtès
Subject: Re: [PATCH] Add draft post "CRAN, a practical example for being reproducible at large scale using GNU Guix".
Date: Tue, 13 Dec 2022 14:53:34 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)
Hello!
Lars-Dominik Braun <ldb@leibniz-psychology.org> skribis:
>> Applied, thanks. It is under drafts/ [1]. Last round proofread before
>> publishing. On Friday?
> Friday sounds good. I’m attaching minor changes to the syntax highlighting.
We missed one Friday but there are plenty coming up. :-)
As mentioned on #guix-hpc, I think it’d be interesting to add a
reference to https://www.nature.com/articles/s41597-022-01143-6 to
illustrate the rationale. I think it’s important because R users are
likely to wonder why they’d bother with Guix in the first place.
Here’s a proposal in that direction; feel free to take it, tear it down,
change it, or whatever.
Thanks,
Ludo’.
diff --git a/drafts/reproducible-cran.md b/drafts/reproducible-cran.md
index c691163..28f6108 100644
--- a/drafts/reproducible-cran.md
+++ b/drafts/reproducible-cran.md
@@ -60,6 +60,42 @@ pre-built substitutes to speed up installation times.
Additionally,
reproducing environments would include fewer steps if the package
recipes were available to anyone by default.
+## Why deploy R software with Guix anyway?
+
+At this point, perhaps you're wondering: R is stable, and tools such as
+[Packrat](https://rstudio.github.io/packrat/) let me save and restore
+the exact R package versions I need. While this might seem “good
+enough”, we can already tell this approach [has a number of
+shortcomings](https://hpc.guix.info/blog/2022/07/is-reproducibility-practical/),
+one of which is that it cannot handle dependencies not written in
+R, such as R itself.
+
+A [study published in *Nature Scientific Data* in February
+2022](https://doi.org/10.1038/s41597-022-01143-6) gives empirical
+insight into this:
+
+> _[We] retrieve and analyze more than 2000 replication datasets with
+> over 9000 unique R files published from 2010 to 2020. Second, we
+> execute the code in a clean runtime environment to assess its ease of
+> reuse. […] We find that 74% of R files failed to complete without
+> error in the initial execution, while 56% failed when code cleaning
+> was applied, showing that many errors can be prevented with good
+> coding practices._
+
+Three fourths of those R files fail to run out of the box, which is
+huge. How did the authors re-execute this code?
+
+> _We re-executed R code from each of the replication packages using
+> three R software versions, R 3.2, R 3.6, and R 4.0, in a clean
+> environment._
+
+Despite this guesswork over R versions, coupled with automatic “source
+cleaning”, the authors found that most R files still fail to run.
+
+The motivation to deploy R software with Guix becomes clear: it’s the
+ability to automatically redeploy the same software environment, at
+different points in time, on different machines.
+
## Introducing guix-cran
GNU Guix provides a mechanism called “channels”,