guix-patches
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[bug#47137] [PATCH] Adaptive substitute decompression selection


From: Ludovic Courtès
Subject: [bug#47137] [PATCH] Adaptive substitute decompression selection
Date: Sun, 14 Mar 2021 15:38:47 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)

Hi!

The patch below is a followup to the thread started in December:

  https://lists.gnu.org/archive/html/guix-devel/2020-12/msg00177.html

It provides a naïve but apparently good enough way for ‘guix substitute’
to choose the compression method that yields the best speed given the
CPU and current networking conditions.

On a recent x86_64 laptop with fast networking, using ci.guix.gnu.org,
the effect so far is to choose gzip substitutes, which indeed provides
slightly faster substitute installation.  When ci.guix provides zstd
substitutes, the speedup will be higher.

I have yet to check that it sticks to lzip when bandwidth is low.

Thoughts?

Thanks,
Ludo’.

>From 3f95a1ac04c5e178a7fedfc2d03c07bcb1075ead Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= <ludo@gnu.org>
Date: Sun, 14 Mar 2021 15:05:30 +0100
Subject: [PATCH] substitute: Choose compression method based on past CPU
 usage.

This stems from the observation that substitute download can be
CPU-bound when high-speed networks are in use:

  https://lists.gnu.org/archive/html/guix-devel/2020-12/msg00177.html

* guix/narinfo.scm (decompresses-faster?): New procedure.
(narinfo-best-uri): Add #:fast-decompression?.
* guix/scripts/substitute.scm (%prefer-fast-decompression?): New
variable.
(call-with-cpu-usage-monitoring): New procedure.
(with-cpu-usage-monitoring): New macro.
(display-narinfo-data, process-substitution): Pass #:fast-decompression?
to 'narinfo-best-uri'.
(process-substitution): Wrap 'restore-file' call in
'with-cpu-usage-monitoring'.  Set '%prefer-fast-decompression?'.
---
 guix/narinfo.scm            | 27 ++++++++++++++++---
 guix/scripts/substitute.scm | 53 ++++++++++++++++++++++++++++++++-----
 2 files changed, 69 insertions(+), 11 deletions(-)

diff --git a/guix/narinfo.scm b/guix/narinfo.scm
index 2d06124017..72e0f75fda 100644
--- a/guix/narinfo.scm
+++ b/guix/narinfo.scm
@@ -1,5 +1,5 @@
 ;;; GNU Guix --- Functional package management for GNU
-;;; Copyright © 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020 Ludovic Courtès 
<ludo@gnu.org>
+;;; Copyright © 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021 Ludovic 
Courtès <ludo@gnu.org>
 ;;; Copyright © 2014 Nikita Karetnikov <nikita@karetnikov.org>
 ;;; Copyright © 2018 Kyle Meyer <kyle@kyleam.com>
 ;;;
@@ -297,9 +297,21 @@ this is a rough approximation."
     (_      (or (string=? compression2 "none")
                 (string=? compression2 "gzip")))))
 
-(define (narinfo-best-uri narinfo)
+(define (decompresses-faster? compression1 compression2)
+  "Return true if COMPRESSION1 generally has a higher decompression throughput
+than COMPRESSION2."
+  (match compression1
+    ("none" #t)
+    ("zstd" #t)
+    ("gzip" (string=? compression2 "lzip"))
+    (_      #f)))
+
+(define* (narinfo-best-uri narinfo #:key fast-decompression?)
   "Select the \"best\" URI to download NARINFO's nar, and return three values:
-the URI, its compression method (a string), and the compressed file size."
+the URI, its compression method (a string), and the compressed file size.
+When FAST-DECOMPRESSION? is true, prefer substitutes with faster
+decompression (typically zstd) rather than substitutes with a higher
+compression ratio (typically lzip)."
   (define choices
     (filter (match-lambda
               ((uri compression file-size)
@@ -321,6 +333,13 @@ the URI, its compression method (a string), and the 
compressed file size."
           (compresses-better? compression1 compression2))))
       (_ #f)))                                    ;we can't tell
 
-  (match (sort choices file-size<?)
+  (define (speed<? c1 c2)
+    (match c1
+      ((uri1 compression1 . _)
+       (match c2
+         ((uri2 compression2 . _)
+          (decompresses-faster? compression2 compression1))))))
+
+  (match (sort choices (if fast-decompression? (negate speed<?) file-size<?))
     (((uri compression file-size) _ ...)
      (values uri compression file-size))))
diff --git a/guix/scripts/substitute.scm b/guix/scripts/substitute.scm
index 6892aa999b..b213e6da06 100755
--- a/guix/scripts/substitute.scm
+++ b/guix/scripts/substitute.scm
@@ -257,6 +257,27 @@ Internal tool to substitute a pre-built binary to a local 
build.\n"))
 ;;; Daemon/substituter protocol.
 ;;;
 
+(define %prefer-fast-decompression?
+  ;; Whether to prefer fast decompression over good compression ratios.  This
+  ;; serves in particular to choose between lzip (high compression ratio but
+  ;; low decompression throughput) and zstd (lower compression ratio but high
+  ;; decompression throughput).
+  #f)
+
+(define (call-with-cpu-usage-monitoring proc)
+  (let ((before (times)))
+    (proc)
+    (let ((after (times)))
+      (if (= (tms:clock after) (tms:clock before))
+          0
+          (/ (- (tms:utime after) (tms:utime before))
+             (- (tms:clock after) (tms:clock before))
+             1.)))))
+
+(define-syntax-rule (with-cpu-usage-monitoring exp ...)
+  "Evaluate EXP...  Return its CPU usage as a fraction between 0 and 1."
+  (call-with-cpu-usage-monitoring (lambda () exp ...)))
+
 (define (display-narinfo-data narinfo)
   "Write to the current output port the contents of NARINFO in the format
 expected by the daemon."
@@ -269,7 +290,10 @@ expected by the daemon."
   (for-each (cute format #t "~a/~a~%" (%store-prefix) <>)
             (narinfo-references narinfo))
 
-  (let-values (((uri compression file-size) (narinfo-best-uri narinfo)))
+  (let-values (((uri compression file-size)
+                (narinfo-best-uri narinfo
+                                  #:fast-decompression?
+                                  %prefer-fast-decompression?)))
     (format #t "~a\n~a\n"
             (or file-size 0)
             (or (narinfo-size narinfo) 0))))
@@ -438,7 +462,9 @@ the current output port."
            store-item))
 
   (let-values (((uri compression file-size)
-                (narinfo-best-uri narinfo)))
+                (narinfo-best-uri narinfo
+                                  #:fast-decompression?
+                                  %prefer-fast-decompression?)))
     (unless print-build-trace?
       (format (current-error-port)
               (G_ "Downloading ~a...~%") (uri->string uri)))
@@ -476,11 +502,24 @@ the current output port."
                   ((hashed get-hash)
                    (open-hash-input-port algorithm input)))
       ;; Unpack the Nar at INPUT into DESTINATION.
-      (restore-file hashed destination
-                    #:dump-file (if (and destination-in-store?
-                                         deduplicate?)
-                                    dump-file/deduplicate*
-                                    dump-file))
+      (define cpu-usage
+        (with-cpu-usage-monitoring
+         (restore-file hashed destination
+                       #:dump-file (if (and destination-in-store?
+                                            deduplicate?)
+                                       dump-file/deduplicate*
+                                       dump-file))))
+
+      ;; Create a hysteresis: depending on CPU usage, favor compression
+      ;; methods with faster decompression (like ztsd) or methods with better
+      ;; compression ratios (like lzip).  This stems from the observation that
+      ;; substitution can be CPU-bound when high-speed networks are used:
+      ;; <https://lists.gnu.org/archive/html/guix-devel/2020-12/msg00177.html>.
+      (when (> cpu-usage .8)
+        (set! %prefer-fast-decompression? #t))
+      (when (< cpu-usage .4)
+        (set! %prefer-fast-decompression? #f))
+
       (close-port hashed)
       (close-port input)
 
-- 
2.30.2


reply via email to

[Prev in Thread] Current Thread [Next in Thread]