[bug#36630] [PATCH] guix: parallelize building the manual-database

guix-patches

From:	Ludovic Courtès
Subject:	[bug#36630] [PATCH] guix: parallelize building the manual-database
Date:	Tue, 16 Jul 2019 23:14:48 +0200
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/26.2 (gnu/linux)

Hello,

Arne Babenhauserheide <address@hidden> writes:

> Ludovic Courtès <address@hidden> writes:

[...]

>> I picked the manual-database derivation returned for:
>>   guix environment --ad-hoc jupyter python-ipython python-ipykernel -n
>> (It has 3,046 entries.)
>
> How exactly did you run the derivation? I’d like to check it if you can
> give me the exact command line to run (a command I can run repeatedly).

If you run the command above, it’ll list
/gnu/store/…-manual-database.drv.  So you can just run:

  guix build /gnu/store/…-manual-database.drv

or:

  guix build /gnu/store/…-manual-database.drv --check

if it had already been built before.

>> On an SSD and with a hot cache, on my 4-core laptop, I get 74s with
>> ‘master’, and 53s with this patch.
>
> I’m using a machine with 6 physical cores, hyperthreading, and an NVMe
> M.2 disk, so it is likely that it would not be disk-bound for me at 4
> threads.

The result may be entirely different with a spinning disk.  :-)

I’m not saying we should optimize for spinning disks, just that what you
see is at one end of the spectrum.

>> However, it will definitely not scale linearly, so we should probably
>> cap at 2 or 4 threads.  WDYT?
>
> Looking at the underlying action, this seems to be a task that scales
> pretty well. It just unpacks files into the disk-cache.
>
> It should also not consume much memory, so I don’t see a reason to
> artificially limit the number of threads.

On a many-core machine like we have in our build farm, with spinning
disks, I believe that using one thread per core would be
counterproductive.

>> Another issue with the patch is that the [n/total] counter does not grow
>> monotonically now: it might temporarily go backwards.  Consequently, at
>> -v1, users will see a progress bar that hesitates and occasionally goes
>> backward, which isn’t great.
>
> It typically jumps forward in the beginning and then stalls until the
> first manual page is finished.
>
> Since par-map uses a global queue of futures to process, and since the
> output is the first part of (compute-entry …), I don’t expect the
> progress to move backwards in ways a user sees: It could only move
> backwards during the initial step where all threads start at the same
> time, and there the initial output should be overwritten fast enough to
> not be noticeable.

Hmm, maybe.  I’m sure we’ll get reports saying this looks weird and
Something Must Absolutely Be Done About It.  :-)
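
For illustration only, here is one way a counter could be kept from going
backwards: increment a shared count under a mutex when an entry finishes,
and print from inside the same critical section.  This is a sketch, not
what the patch does (the patch prints at the start of each entry);
‘process-with-progress’ and its arguments are hypothetical names, and the
worker count of 2 is arbitrary.

```scheme
(use-modules (ice-9 threads))

;; Hypothetical helper: apply PROC to each element of ITEMS on a small
;; pool of threads, printing a "[done/total]" counter that only grows.
(define (process-with-progress proc items)
  (let ((total (length items))
        (done 0)
        (lock (make-mutex)))
    (n-par-map 2                        ;arbitrary worker count
               (lambda (item)
                 (let ((result (proc item)))
                   ;; Update and print under the same lock so that two
                   ;; threads cannot report their counts out of order.
                   (with-mutex lock
                     (set! done (+ done 1))
                     (format (current-error-port) "\r[~a/~a]" done total))
                   result))
               items)))
```

The counter here reflects completed entries rather than started ones, so it
cannot jump ahead at startup either.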

But anyway, another issue is that we would need to honor
‘parallel-job-count’, which means using ‘n-par-map’, which doesn’t use
futures.
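
As a rough sketch of the capping idea with ‘n-par-map’ from (ice-9
threads): the cap of 4, ‘job-count’, and ‘process-entry’ below are
placeholders invented for the example.  In an actual build the job count
would come from ‘parallel-job-count’; ‘current-processor-count’ only stands
in for it here so that the snippet runs in a plain Guile REPL.

```scheme
(use-modules (ice-9 threads))

;; Hypothetical cap, following the "2 or 4 threads" suggestion.
(define %max-workers 4)

;; Stand-in for the job count the build is allowed to use.
(define (job-count)
  (current-processor-count))

;; Stand-in for the per-entry work (hypothetical).
(define (process-entry file)
  (string-length file))

(define (process-all files)
  ;; 'n-par-map' applies the procedure using at most N threads and
  ;; returns the results in the same order as the input list.
  (n-par-map (max 1 (min %max-workers (job-count)))
             process-entry
             files))

(process-all '("ls.1" "guix.1" "guile.1"))   ;=> (4 6 7)
```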

> Given that building manual pages is the most time-consuming part when
> installing a small tool into my profile, I think it is worth the
> complexity. Especially because most of the complexity is being taken
> care of by (ice-9 threads par-map).

Just today I realized that the example above (with Jupyter) has so many
entries because of propagated inputs; in particular libxext alone brings
1,000+ man pages.  We should definitely do something about these
packages.

Needs more thought…

Thanks,
Ludo’.
