Re: Performance of computing cross derivations
From: Christopher Baines
Subject: Re: Performance of computing cross derivations
Date: Wed, 10 Jan 2024 12:40:07 +0000
User-agent: mu4e 1.10.7; emacs 29.1

Efraim Flashner <efraim@flashner.co.il> writes:
> On Fri, Jan 05, 2024 at 04:41:14PM +0000, Christopher Baines wrote:
>>
>> Ludovic Courtès <ludo@gnu.org> writes:
>>
>> > Hi,
>> >
>> > Christopher Baines <mail@cbaines.net> skribis:
>> >
>> >> When asked by the data service, it seems to take Guix around 3 minutes
>> >> to compute cross derivations for all packages (to a single
>> >> target). Here's a simple script that replicates this:
>>
>> ...
>>
>> > One idiom that defeats caching is:
>> >
>> >   (define (make-me-a-package x y z)
>> >     (package
>> >       …))
>> >
>> > Such a procedure returns a fresh package every time it’s called,
>> > preventing caching from happening (because cache entries are compared
>> > with ‘eq?’). That typically leads to lower hit rates.
>> >
>> > Anyway, lots of words to say that I don’t see anything immediately
>> > obvious with cross-compilation, yet I wouldn’t be surprised if some of
>> > these cache-defeating idioms were used because we’ve paid less
>> > attention to this.
>>
>> I've got a feeling that performance has got worse since I looked at this
>> originally, and I've finally got around to having a further look.
>>
>> I spent some time looking at various metrics, but it was most useful to
>> just write the cache keys of various types to files and have a read.
>>
>> The cross-base module was causing many issues, as all but one of the
>> procedures there produced new package records each time. There is also
>> make-rust-sysroot which showed up.
>>
>> I've sent some patches as #68266 to add memoization to avoid this, and
>> that seems to speed things up.
>>
>> Looking at other things in the cache, I think there are some issues with
>> file-append and local-file. The use of file-append in svn-fetch and
>> local-file in the lower procedure in the python build system both bloat
>> the cache for example, although I'm less sure about how to address these
>> cases.
>>
>> One thing I am sure about though, is that these problems will come
>> back. Maybe we could add some reporting in to Guix to look through the
>> cache at the keys, lower them all and check for equivalence. That way it
>> should be possible to automate saying that having [1] in the cache
>> several thousand times is unhelpful. The data service could then run
>> this reporting and store it.
>>
>> 1: #<file-append #<package subversion@1.14.2
>> gnu/packages/version-control.scm:2267 7f294d908840> "/bin/svn">
>
> I grabbed the patch for make-rust-sysroot to try it out:
> Native builds:
>   time GUIX_PROFILING="object-cache" ./pre-inst-env guix build --no-grafts \
>     $(./pre-inst-env ~/list-all-cargo-build-system-packages | grep rust- | head -n 100) -d
...
> That's a massive drop in the size of the cache and a big decrease in the
> amount of time it took to calculate those 100 items.

I think you're right. While I sent some other changes in #68266, I think
it's the change around make-rust-sysroot that accounts for pretty much
all of the effect on performance.
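
To make the effect concrete, here is a minimal sketch in plain Guile of why a procedure that returns a fresh record on every call defeats an eq?-keyed cache, and how memoizing it helps. This is not the patch from #68266: <pkg>, make-sysroot, lower and object-cache are hypothetical stand-ins for the package record, make-rust-sysroot, lowering and the object cache. Guix has its own helpers for this in (guix memoization); the sketch avoids them so that it runs without Guix.

```scheme
;; Hypothetical stand-ins only; none of this is Guix's actual code.
(use-modules (srfi srfi-9))

(define-record-type <pkg>
  (make-pkg name)
  pkg?
  (name pkg-name))

;; Returns a fresh record on every call (the cache-defeating idiom).
(define (make-sysroot target)
  (make-pkg (string-append "sysroot-" target)))

;; Memoized variant: one record per distinct TARGET string.
(define make-sysroot/memoized
  (let ((memo (make-hash-table)))       ;looked up with equal?
    (lambda (target)
      (or (hash-ref memo target)
          (let ((pkg (make-sysroot target)))
            (hash-set! memo target pkg)
            pkg)))))

;; A cache whose keys are compared with eq?, plus a miss counter.
(define object-cache (make-hash-table))
(define misses 0)

(define (lower pkg)
  (or (hashq-ref object-cache pkg)
      (let ((result (string-append (pkg-name pkg) ".drv")))
        (set! misses (+ misses 1))
        (hashq-set! object-cache pkg result)
        result)))

;; Lower the result of MAKE 100 times and return the number of misses.
(define (misses-after-100-calls make)
  (set! misses 0)
  (for-each (lambda (_) (lower (make "aarch64-linux-gnu")))
            (iota 100))
  misses)

(display (misses-after-100-calls make-sysroot))          ;prints 100
(newline)
(display (misses-after-100-calls make-sysroot/memoized)) ;prints 1
(newline)
```

Without memoization every call adds a new key to the cache and misses; with it, the same record is reused and only the first lookup misses.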

I think the tens of thousands of duplicated packages from cross-base
that I was looking at are almost entirely coming from
make-rust-sysroot. As Ludo mentions in [1], maybe this has something to
do with the use of cross- procedures in native-inputs, although I'm not
sure that moving those calls out of native-inputs is the correct thing
to do.

I don't know what the correct approach is, but I think something needs
to be done to address the performance regression.

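
On catching this kind of regression automatically, the reporting idea from my earlier message (lower every cache key and check for equivalence) could look roughly like the sketch below. It is plain Guile with made-up data: duplicate-report is a hypothetical name, and the lower argument is a placeholder for whatever would actually lower a key in Guix.

```scheme
;; Hypothetical sketch; not an existing Guix procedure.
;; Group cache KEYS by their lowered form and return an alist of
;; (LOWERED . COUNT) for each lowered form reached from several keys.
(define (duplicate-report keys lower)
  (let ((counts (make-hash-table)))     ;lowered forms compared with equal?
    (for-each (lambda (key)
                (let ((low (lower key)))
                  (hash-set! counts low (+ 1 (hash-ref counts low 0)))))
              keys)
    (filter (lambda (entry) (> (cdr entry) 1))
            (hash-map->list cons counts))))

;; Toy keys: three distinct (non-eq?) objects that lower to the same
;; string, plus one unrelated key.
(define keys
  (list (list "subversion" "/bin/svn")
        (list "subversion" "/bin/svn")
        (list "subversion" "/bin/svn")
        (list "git" "/bin/git")))

;; Toy lowering: concatenate the two components of a key.
(define (toy-lower key)
  (string-append (car key) (cadr key)))

(write (duplicate-report keys toy-lower))
(newline)
;; writes (("subversion/bin/svn" . 3))
```

Anything that shows up in such a report with a large count is a key that could have been shared.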
1: https://lists.gnu.org/archive/html/guix-patches/2024-01/msg00733.html