guix-devel

Substitute retention


From: Ludovic Courtès
Subject: Substitute retention
Date: Tue, 12 Oct 2021 18:04:25 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)

Hi!

(Moving to guix-devel from <https://issues.guix.gnu.org/42162#43>.)

zimoun <zimon.toutoune@gmail.com> skribis:

>> For the record, the ‘guix publish’ config on berlin is here:
>>
>>   https://git.savannah.gnu.org/cgit/guix/maintenance.git/tree/hydra/modules/sysadmin/services.scm#n485
>>
>> If I read that correctly, nars have a TTL of 180 days (this is the time
>> a nar is retained after the last time it has been requested, so it’s a
>> lower bound.)

[...]

> Just for the record, a back-of-the-envelope computation.  180 days before
> today was April 15th (M-x calendar C-u 180 C-b).  That means 6996 commits
> (35aaf1fe10 is my current last commit).
>
>     git log --format="%cd" --after=2021-04-15 | wc -l
>     6996
>
> However, these commits are pushed by batch.  Roughly, it reads:
>
>     git log --format="%cd" --after=2021-04-15 --date=unix \
>         | awk 'NR == 1{old= $1; next}{print old - $1; old = $1}' \
>         | sort -n | uniq -c | grep -e "0$" | head
>           1 -1542620
>        3388 0
>          14 10
>           6 20
>           5 30
>           2 40
>           4 50
>           1 60
>           2 70
>           2 80
>
> (Take the ‘awk’ with care, I am not sure of what I am doing. :-)  And
> it is rough because of timezones, etc.)
>
> In other words, 3388/6996 = ~50% of commits are pushed at the same
> time, i.e., missed by both build farms, which use 2 different strategies
> to collect the things to build (fetch every 5 minutes or fetch from
> guix-commits).  It is a quick back-of-the-envelope estimate, so take it
> with a grain of salt. :-)

OK.
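
For anyone who wants to redo that estimate, here is a minimal sketch that
counts zero-second gaps directly instead of going through ‘sort | uniq -c |
grep’.  The function name ‘same_second_share’ is made up for illustration;
it assumes one Unix timestamp per line on standard input, as printed by
‘git log --format=%cd --date=unix’.  With the figures quoted above it would
give 3388 of 6996, i.e. about 48%.

```shell
# Hypothetical helper: reads one Unix timestamp per line and reports how
# many lines carry exactly the same timestamp as the line before them.
same_second_share() {
    awk '
        NR > 1 && $1 == prev { same++ }   # gap of 0 seconds to previous line
        { prev = $1 }
        END {
            if (NR == 0) { print "no commits"; exit 1 }
            printf "%d of %d commits (%.1f%%)\n", same, NR, 100 * same / NR
        }'
}

# Usage, inside a Guix checkout:
#   git log --format=%cd --date=unix --after=2021-04-15 | same_second_share
```

Like the original pipeline, this looks at committer dates only, so it is
still a rough proxy for “pushed in the same batch”.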

> With that number, after 180 days (6 months), it is hard to evaluate the
> rate of time-machine queries.  And from my experience (no numbers to
> back it up), running time-machine on a commit older than these 180 days
> means building derivations.  Or it is a lucky day. :-)

Right.

So what can we do to address this issue?  I *think* we could use a
higher TTL on berlin, and we can try that right away (9 months to begin
with?).
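
To get a feel for what a given TTL would cover, here is a rough sketch
that turns a TTL in days into a cutoff date, which can then be fed to
‘git log --after’ as in zimoun’s message.  The helper name ‘ttl_cutoff’ is
hypothetical, the 270-day figure is just my approximation of 9 months, and
it assumes GNU date (for ‘-d’ with relative dates).

```shell
# Hypothetical helper: print the date (YYYY-MM-DD) lying DAYS days before
# REFERENCE (default: today, in UTC).  Requires GNU date.
ttl_cutoff() {
    days=$1
    reference=${2:-$(date -u +%F)}
    date -u -d "$reference $days days ago" +%F
}

# Usage, inside a Guix checkout (270 days is roughly 9 months):
#   git log --format=%cd --after="$(ttl_cutoff 270)" | wc -l
```

For instance, ‘ttl_cutoff 180 2021-10-12’ gives 2021-04-15, the date used
in the quoted computation.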

However, there is an upper bound anyway.  To make informed decisions on
the retention policy, we should monitor storage space on berlin/bayfront
to better estimate what can be done.  We have Zabbix but it’s not
accessible from the outside; maybe we could graph storage space
somewhere so people can grab the data and work on those estimates?
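
As a stopgap before any proper graphing, something as small as the sketch
below, run periodically, would already give people raw data to work with.
Everything here is an assumption on my side: the function name
‘disk_usage_csv’ is made up, and the paths in the usage comment are
placeholders, not berlin’s actual layout.

```shell
# Hypothetical helper: print "<unix time>,<used KiB>,<available KiB>" for
# the file system holding the given directory (default: current directory).
disk_usage_csv() {
    dir=${1:-.}
    # -P forces the one-line-per-file-system POSIX format; -k reports
    # sizes in 1024-byte units.
    df -Pk "$dir" | awk -v now="$(date +%s)" 'NR == 2 { print now "," $3 "," $4 }'
}

# Example usage from a periodic job (placeholder paths):
#   disk_usage_csv /path/to/publish/cache >> /path/to/public/storage.csv
```

Appending one line per run yields a CSV that anyone can plot or fit a
growth rate to.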

What if we decide that we need to provide substitutes for two-year-old
commits?  In that case, we need a plan to scale up.  That could be
renting storage space somewhere.  That’s largely non-technical work that
needs attention.

There are also technical tweaks that could help: distinguishing between
“important” substitutes that we want to keep, and less important
substitutes (how?); identifying “equivalence classes” for builds of a
given package; etc.  The outcome is unclear and it’ll take time.

Thoughts?

Ludo’.


