guix-commits

[no subject]


From: Ludovic Courtès
Date: Thu, 16 Nov 2023 17:24:21 -0500 (EST)

branch: master
commit 38d864defc4ffe9aaada1aa36dadc311df08013a
Author: Ludovic Courtès <ludo@gnu.org>
AuthorDate: Thu Nov 16 12:13:52 2023 +0100

    remote-server: Catch ZeroMQ errors when replying to workers.
    
    This fixes a bug whereby EHOSTUNREACH would cause ‘cuirass
    remote-server’ to exit.
    
    Fixes <https://issues.guix.gnu.org/67224>.
    
    * src/cuirass/scripts/remote-server.scm (serve-build-requests): Catch
    ‘zmq-error’ when calling ‘reply-worker’.
---
 src/cuirass/scripts/remote-server.scm | 37 +++++++++++++++++++++++++----------
 1 file changed, 27 insertions(+), 10 deletions(-)

diff --git a/src/cuirass/scripts/remote-server.scm b/src/cuirass/scripts/remote-server.scm
index 25a605a..3ebb0e7 100644
--- a/src/cuirass/scripts/remote-server.scm
+++ b/src/cuirass/scripts/remote-server.scm
@@ -450,8 +450,12 @@ FETCH-WORKER to download the build's output(s)."
           (`(worker-ready ,worker)
            (update-worker! worker))
           (`(worker-request-info)
-           (reply-worker
-            (server-info-message sender-address (%log-port) (%publish-port))))
+           (catch 'zmq-error
+             (lambda ()
+               (reply-worker
+                (server-info-message sender-address
+                                     (%log-port) (%publish-port))))
+             (const #f)))
           (`(worker-request-work ,name)
            (let ((worker (db-get-worker name)))
              (when worker
@@ -471,19 +475,32 @@ FETCH-WORKER to download the build's output(s)."
                                   derivation))
                      (db-update-build-worker! derivation name)
                      (db-update-build-status! derivation (build-status submitted))
-                     (reply-worker
-                      (build-request-message derivation
-                                             #:priority priority
-                                             #:timeout timeout
-                                             #:max-silent max-silent
-                                             #:system (build-system build))))
+                     (catch 'zmq-error
+                       (lambda ()
+                         (reply-worker
+                          (build-request-message derivation
+                                                 #:priority priority
+                                                 #:timeout timeout
+                                                 #:max-silent max-silent
+                                                 #:system (build-system
+                                                            build))))
+                       (lambda (key errno message . _)
+                         (log-error "while submitting ~a to ~a (~a): ~a"
+                                    derivation
+                                    (worker-name worker)
+                                    (worker-address worker)
+                                    message)
+                         (db-update-build-status! derivation
+                                                  (build-status scheduled)))))
                    (begin
                      (when worker
                        (log-debug "~a (~a): no available build."
                                   (worker-address worker)
                                   (worker-name worker)))
-                     (reply-worker
-                      (no-build-message)))))))
+                     (catch 'zmq-error
+                       (lambda ()
+                         (reply-worker (no-build-message)))
+                       (const #f)))))))
           (`(worker-ping ,worker)
            (update-worker! worker))
           (`(build-started (drv ,drv) (worker ,name))
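
Note on the pattern used above: the following is a minimal, self-contained
Guile sketch of the two handler styles in this patch, not the Cuirass code
itself. `failing-reply', `safe-reply' and `reply-or-reschedule' are
hypothetical names made up for illustration, and the arguments passed to
`throw' (an errno of 113, which is EHOSTUNREACH on Linux, followed by a
message string) are an assumption chosen to match the `(key errno message . _)'
handler signature seen in the diff.

```scheme
;; Hypothetical stand-in for 'reply-worker': it always fails the way a
;; ZeroMQ send to an unreachable host might.  The errno value and the
;; message text are assumptions for this sketch.
(define (failing-reply message)
  (throw 'zmq-error 113 "No route to host"))

;; Style 1: swallow the error.  '(const #f)' builds a procedure that
;; accepts any arguments and returns #f, so the server loop keeps going.
(define (safe-reply thunk)
  (catch 'zmq-error
    thunk
    (const #f)))

;; Style 2: inspect the error and undo side effects.  RESCHEDULE is a
;; hypothetical one-argument procedure called with the error message; in
;; the patch this is where the build goes back to the 'scheduled' status.
(define (reply-or-reschedule thunk reschedule)
  (catch 'zmq-error
    thunk
    (lambda (key errno message . _)
      (reschedule message))))

;; Example use: neither call lets the exception escape.
(define swallowed
  (safe-reply (lambda () (failing-reply 'no-build))))

(define rescheduled
  (reply-or-reschedule (lambda () (failing-reply 'build-request))
                       (lambda (message)
                         (list 'scheduled message))))
```

The first style fits replies where nothing needs to be undone (server info,
"no build" messages); the second fits the build-request case, where the
database has already been updated and must be rolled back if the reply
never reaches the worker.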


