[bug#36630] [PATCH] guix: parallelize building the manual-database
From: Arne Babenhauserheide
Subject: [bug#36630] [PATCH] guix: parallelize building the manual-database
Date: Tue, 16 Jul 2019 01:32:46 +0200
User-agent: mu4e 1.2.0; emacs 26.2
Hi Ludo’,
Ludovic Courtès <address@hidden> writes:
>> * guix/profiles.scm (manual-database): par-map over the entries. This
>> distributes the load roughly equally over all cores and avoids blocking on
>> I/O. The order of the entries stays the same since write-mandb-database
>> sorts them.
>
> I would think the whole process is largely I/O-bound. Did you try
> measuring differences?
I did not measure the difference in build time, but I did check the
system load. Without this patch, one of my cores is under full load.
With this patch, all 12 hyperthreads have a mean load of 50%.
> I picked the manual-database derivation returned for:
> guix environment --ad-hoc jupyter python-ipython python-ipykernel -n
> (It has 3,046 entries.)
How exactly did you run the derivation? I’d like to check this myself if
you can give me the exact command line (a command I can run repeatedly).
> On an SSD and with a hot cache, on my 4-core laptop, I get 74s with
> ‘master’, and 53s with this patch.
I’m using a machine with 6 physical cores, hyperthreading, and an NVMe
M.2 disk, so it is likely that it would not be disk-bound for me at 4
threads.
> However, it will definitely not scale linearly, so we should probably
> cap at 2 or 4 threads. WDYT?
Looking at the underlying action, this seems to be a task that scales
pretty well. It just unpacks files into the disk-cache.
It should also not consume much memory, so I don’t see a reason to
artificially limit the number of threads.
> Another issue with the patch is that the [n/total] counter does not grow
> monotonically now: it might temporarily go backwards. Consequently, at
> -v1, users will see a progress bar that hesitates and occasionally goes
> backward, which isn’t great.
It typically jumps forward in the beginning and then stalls until the
first manual page is finished.
Since par-map uses a global queue of futures to process, and since the
output is the first part of (compute-entry …), I don’t expect the
progress to move backwards in ways a user sees: It could only move
backwards during the initial step where all threads start at the same
time, and there the initial output should be overwritten fast enough to
not be noticeable.
> We would need to fix this with a mutex-protected global counter.
A global counter would be pretty bad for scaling. As it is, this code
needs no communication between threads besides returning the final
result, so it behaves exactly like a normal map, aside from being
faster. So I’d prefer to accept the progress output jumping forward.
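For reference, a minimal sketch of what a mutex-protected counter of the kind suggested above could look like in Guile. This is an illustration under assumptions, not code from guix/profiles.scm: next-count! and the placeholder entry list are hypothetical names, and the real per-entry work is left out.

```scheme
(use-modules (ice-9 threads))

;; Shared state: one counter and the mutex that guards it.
(define counter 0)
(define counter-lock (make-mutex))

;; Hypothetical helper: increment the counter under the lock and
;; return the new value, so that no two threads get the same number.
(define (next-count!)
  (with-mutex counter-lock
    (set! counter (+ counter 1))
    counter))

(define total 100)

;; Each parallel task takes its progress number when it starts.
;; (iota total) is only a placeholder for the real list of entries.
(define numbers
  (par-map (lambda (entry)
             (next-count!))
           (iota total)))
```

Every entry takes the lock once here, which is exactly the shared state between threads that the patch as submitted does without.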
> All in all, I’m not sure this is worth the complexity.
>
> WDYT?
Given that building manual pages is the most time-consuming part when
installing a small tool into my profile, I think it is worth the
complexity. Especially because most of the complexity is taken care of
by par-map from (ice-9 threads).
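To illustrate how small the change is at the call site, here is a minimal sketch assuming a hypothetical stand-in procedure, fake-compute-entry, in place of the real per-entry work. It only shows that par-map returns its results in the same order as map.

```scheme
(use-modules (ice-9 threads))

;; Hypothetical stand-in for the per-entry work; not the real compute-entry.
(define (fake-compute-entry n)
  (* n n))

(define inputs '(1 2 3 4 5 6 7 8))

;; Sequential version.
(define sequential (map fake-compute-entry inputs))

;; Parallel version: the calls may run on several threads, but the
;; results come back in input order, as with map.
(define parallel (par-map fake-compute-entry inputs))
```

With a procedure that has no side effects, (equal? sequential parallel) holds, so code consuming the result list does not need to change.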
Best wishes,
Arne
--
To be apolitical
means to be political
without noticing it