guix-devel
Tooling for branch workflows


From: Andreas Enge
Subject: Tooling for branch workflows
Date: Wed, 10 May 2023 13:39:05 +0200

Hello all,

the title says it all: I wish to share some conclusions from working on
the core-updates merge. Clearly our tooling could be improved for the task;
there was some flying by night without instruments, and in the end I
merged the branch without being really able to tell how it compared
to master... (You may also blame it partially on my lack of patience.)
Having feature branches may or may not make things a bit easier, but it
will definitely not solve the problems.
This mail is also of course a bit politically sensitive: It may look like
I am complaining about other people's work, who are volunteers and do what
they can, without offering to work on the code myself. So as a preamble,
let me express my gratitude to the few people who have been working
tirelessly on our tooling and contributing to our infrastructure, without
whom big code changes like we did on core-updates (and now on feature
branches) would simply be impossible; their work is vital to the project
and often not very visible. If I am critical, it is not to diminish their
work, but to discuss a positive path forward; and I hope more people
will find the motivation to do infrastructure work, which I think will be
decisive for the success of Guix (together with policy and organisational
questions).

We have two build farms, berlin and bordeaux (which is a good thing for
checking reproducibility and for redundancy, but maybe a bit of a problem
concerning hardware requirements for "exotic" architectures), running
two different CI projects, cuirass and the Guix build coordinator (gbc in
the following); both have a very low bus factor (1 to 2?), and it would be
nice to get more people onboard. For this, more documentation would be
helpful. Both have pros and cons, and are architected quite differently,
so I do not know whether convergence is achievable.

I ended up relying mostly on cuirass for reasons I do not completely
remember any more. The dashboard with its green and red dots is a very
useful tool compared to lists of builds, which become unusable with over
20000 packages. The bigger build power on berlin is helpful, and I found
the web interface of gbc a bit slow and down a bit too often. With this
experience, I just filed three wishlist bugs for cuirass:
- Topological sorting in cuirass
  https://issues.guix.gnu.org/63412
  The lack of ordering of the builds is a big problem that wastes a lot of build
  power; it is solved in gbc and, I think, the reason why the bordeaux
  build farm fares better for aarch64 with fewer machines.
  I would tag this as "important".
- Evaluation comparison on cuirass
  https://issues.guix.gnu.org/63414
  Without being able to compare a branch to master, it is difficult to
  decide whether one should merge. This is sort of solved in gbc, but so
  far the bordeaux build farm has been used more for QA of single patches
  (or a short list of patches featuring in a single issue) than for building
  complete branches.
- Stop and restart builds in cuirass
  https://issues.guix.gnu.org/63413
  Manual intervention is not easy in cuirass (I spent hours clicking on
  "restart" or using the REST API with a shell script through wget, which
  resulted in my IP being banned as a DoS suspect...); and to my knowledge,
  there is no web interface for doing so in gbc. In both systems one can
  probably tinker with the underlying databases, but this also does not
  qualify as "easy".
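
To illustrate the first wish, here is a minimal sketch of what topological
ordering of builds means. It does not show how cuirass or gbc schedule
builds; the package names and the dependency graph are made up, and
build_order is a hypothetical helper. The point is only that a package is
never scheduled before its inputs, so no build power is spent on something
that cannot yet be built.

```python
from graphlib import TopologicalSorter


def build_order(deps):
    """Return the packages in an order where every input precedes its users.

    deps maps each package name to the set of packages it depends on.
    (Hypothetical helper for illustration only.)
    """
    return list(TopologicalSorter(deps).static_order())


# Made-up dependency graph, assumed for the example.
deps = {
    "glibc": set(),
    "gcc": {"glibc"},
    "python": {"gcc", "glibc"},
    "python-numpy": {"python"},
}

order = build_order(deps)
print(order)
```

A real scheduler would of course hand out builds incrementally as inputs
become available rather than compute one static list.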

gbc just got a very nice feature on "blocking builds":
https://data.guix.gnu.org/revision/8f92dfd9ae7ac491ab7fb4b425799a8c909708a8/blocking-builds?system=aarch64-linux&target=none&limit_results=50
As I understand them, these are the "first failures": derivations whose
inputs are all available, but which fail themselves; so they show
the place where work is needed (and repairs will immediately make
a difference). Once the topological sorting in cuirass is sorted out,
these should be the builds marked as "Failed" (as opposed to "Failed
(dependency)"), so with the first issue above handled, they could easily
be shown by cuirass as well.
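
A small sketch of this reading of "blocking builds", under stated
assumptions: the graph and the set of broken packages are invented,
classify is a hypothetical helper, and this is not how the data service
or cuirass compute the list. Builds are visited in dependency order; a
build is "failed" only if all its inputs succeeded and it fails itself,
and "failed-dependency" if some input is unavailable.

```python
from graphlib import TopologicalSorter


def classify(deps, broken):
    """Label each package "succeeded", "failed" or "failed-dependency".

    deps maps a package to the set of its inputs; broken is the set of
    packages whose own build fails when attempted. (Hypothetical helper.)
    """
    status = {}
    for pkg in TopologicalSorter(deps).static_order():
        if any(status[dep] != "succeeded" for dep in deps.get(pkg, ())):
            # Some input is missing, so the build is never attempted.
            status[pkg] = "failed-dependency"
        elif pkg in broken:
            # All inputs are available, but the build itself fails.
            status[pkg] = "failed"
        else:
            status[pkg] = "succeeded"
    return status


# Invented example: "b" is broken, "c" needs "b", "d" only needs "a".
deps = {"a": set(), "b": {"a"}, "c": {"b"}, "d": {"a"}}
status = classify(deps, broken={"b"})
blocking = sorted(p for p, s in status.items() if s == "failed")
print(blocking)
```

Repairing a build in the blocking list immediately unblocks whatever
was marked "failed-dependency" only because of it.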

This was a long message to say "I filed three bugs", but maybe it can be
the starting point to discuss more items on how to go forward with our
build and CI infrastructure.

Andreas



