[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[bug#49149] [PATCH v2 5/7] pack: Prevent duplicate files in tar archives
From: |
Maxim Cournoyer |
Subject: |
[bug#49149] [PATCH v2 5/7] pack: Prevent duplicate files in tar archives. |
Date: |
Thu, 24 Jun 2021 00:40:47 -0400 |
Tar translate duplicate files in the archive into hard links. These can cause
problems, as not every tool support them; for example dpkg doesn't.
* gnu/system/file-systems.scm (reduce-directories): New procedure.
(file-prefix?): Lift the restriction on file prefix. The procedure can be
useful for comparing relative file names. Adjust doc.
(file-name-depth): New procedure, extracted from ...
(btrfs-store-subvolume-file-name): ... here.
* guix/scripts/pack.scm (self-contained-tarball/builder): Use
reduce-directories.
* tests/file-systems.scm ("reduce-directories"): New test.
---
gnu/system/file-systems.scm | 56 +++++++++++++++++++++++++------------
guix/scripts/pack.scm | 6 ++--
tests/file-systems.scm | 7 ++++-
3 files changed, 48 insertions(+), 21 deletions(-)
diff --git a/gnu/system/file-systems.scm b/gnu/system/file-systems.scm
index 464e87cb18..fb87bfc85b 100644
--- a/gnu/system/file-systems.scm
+++ b/gnu/system/file-systems.scm
@@ -55,6 +55,7 @@
file-system-dependencies
file-system-location
+ reduce-directories
file-system-type-predicate
btrfs-subvolume?
btrfs-store-subvolume-file-name
@@ -231,8 +232,8 @@
(char-set-complement (char-set #\/)))
(define (file-prefix? file1 file2)
- "Return #t if FILE1 denotes the name of a file that is a parent of FILE2,
-where both FILE1 and FILE2 are absolute file name. For example:
+ "Return #t if FILE1 denotes the name of a file that is a parent of FILE2.
+For example:
(file-prefix? \"/gnu\" \"/gnu/store\")
=> #t
@@ -240,19 +241,41 @@ where both FILE1 and FILE2 are absolute file name. For
example:
(file-prefix? \"/gn\" \"/gnu/store\")
=> #f
"
- (and (string-prefix? "/" file1)
- (string-prefix? "/" file2)
- (let loop ((file1 (string-tokenize file1 %not-slash))
- (file2 (string-tokenize file2 %not-slash)))
- (match file1
- (()
- #t)
- ((head1 tail1 ...)
- (match file2
- ((head2 tail2 ...)
- (and (string=? head1 head2) (loop tail1 tail2)))
- (()
- #f)))))))
+ (let loop ((file1 (string-tokenize file1 %not-slash))
+ (file2 (string-tokenize file2 %not-slash)))
+ (match file1
+ (()
+ #t)
+ ((head1 tail1 ...)
+ (match file2
+ ((head2 tail2 ...)
+ (and (string=? head1 head2) (loop tail1 tail2)))
+ (()
+ #f))))))
+
+(define (file-name-depth file-name)
+ (length (string-tokenize file-name %not-slash)))
+
+(define (reduce-directories file-names)
+ "Eliminate entries in FILE-NAMES that are children of other entries in
+FILE-NAMES. This is for example useful when passing a list of files to GNU
+tar, which would otherwise descend into each directory passed and archive the
+duplicate files as hard links, which can be undesirable."
+ (let* ((file-names/sorted
+ ;; Ascending sort by file hierarchy depth, then by file name length.
+ (stable-sort (delete-duplicates file-names)
+ (lambda (f1 f2)
+ (let ((depth1 (file-name-depth f1))
+ (depth2 (file-name-depth f2)))
+ (if (= depth1 depth2)
+ (string< f1 f2)
+ (< depth1 depth2)))))))
+ (reverse (fold (lambda (file-name results)
+ (if (find (cut file-prefix? <> file-name) results)
+ results ;parent found -- skipping
+ (cons file-name results)))
+ '()
+ file-names/sorted))))
(define* (file-system-device->string device #:key uuid-type)
"Return the string representations of the DEVICE field of a <file-system>
@@ -624,9 +647,6 @@ store is located, else #f."
s
(string-append "/" s)))
- (define (file-name-depth file-name)
- (length (string-tokenize file-name %not-slash)))
-
(and-let* ((btrfs-subvolume-fs (filter btrfs-subvolume? file-systems))
(btrfs-subvolume-fs*
(sort btrfs-subvolume-fs
diff --git a/guix/scripts/pack.scm b/guix/scripts/pack.scm
index ad432f2b63..84f2f14343 100644
--- a/guix/scripts/pack.scm
+++ b/guix/scripts/pack.scm
@@ -230,13 +230,15 @@ its source property."
`((guix build pack)
(guix build utils)
(guix build union)
- (gnu build install))
+ (gnu build install)
+ (gnu system file-systems))
#:select? import-module?)
#~(begin
(use-modules (guix build pack)
(guix build utils)
((guix build union) #:select (relative-file-name))
(gnu build install)
+ ((gnu system file-systems) #:select (reduce-directories))
(srfi srfi-1)
(srfi srfi-26)
(ice-9 match))
@@ -303,7 +305,7 @@ its source property."
,(string-append "." (%store-directory))
- ,@(delete-duplicates
+ ,@(reduce-directories
(filter-map (match-lambda
(('directory directory)
(string-append "." directory))
diff --git a/tests/file-systems.scm b/tests/file-systems.scm
index 7f7c373884..80acb6d5b9 100644
--- a/tests/file-systems.scm
+++ b/tests/file-systems.scm
@@ -1,6 +1,6 @@
;;; GNU Guix --- Functional package management for GNU
;;; Copyright © 2015, 2017 Ludovic Courtès <ludo@gnu.org>
-;;; Copyright © 2020 Maxim Cournoyer <maxim.cournoyer@gmail.com>
+;;; Copyright © 2020, 2021 Maxim Cournoyer <maxim.cournoyer@gmail.com>
;;;
;;; This file is part of GNU Guix.
;;;
@@ -50,6 +50,11 @@
(device "/foo")
(flags '(bind-mount read-only)))))))))
+(test-equal "reduce-directories"
+ '("./opt/gnu/" "./opt/gnuism" "a/b/c")
+ (reduce-directories '("./opt/gnu/etc" "./opt/gnu/" "./opt/gnu/bin"
+ "./opt/gnu/lib/debug" "./opt/gnuism" "a/b/c" "a/b/c")))
+
(test-assert "does not pull (guix config)"
;; This module is meant both for the host side and "build side", so make
;; sure it doesn't pull in (guix config), which depends on the user's
--
2.32.0
- [bug#49149] [PATCH v2 2/7] pack: Factorize base tar options., (continued)
- [bug#49149] [PATCH v2 2/7] pack: Factorize base tar options., Maxim Cournoyer, 2021/06/24
- [bug#49149] [PATCH v2 3/7] pack: Fix typo., Maxim Cournoyer, 2021/06/24
- [bug#49149] [PATCH v2 4/7] pack: Improve naming of the packs store file names., Maxim Cournoyer, 2021/06/24
- [bug#49149] [PATCH 0/7] Add deb format for guix pack., Maxim Cournoyer, 2021/06/26
- [bug#49149] [PATCH 0/7] Add deb format for guix pack., Ludovic Courtès, 2021/06/30
- [bug#49149] [PATCH v2 6/7] tests: pack: Fix compressor extension., Maxim Cournoyer, 2021/06/24
- [bug#49149] [PATCH v2 7/7] pack: Add support for the deb format., Maxim Cournoyer, 2021/06/24
- [bug#49149] [PATCH v2 7/7] pack: Add support for the deb format., Maxime Devos, 2021/06/26
- bug#49149: [PATCH 0/7] Add deb format for guix pack., Maxim Cournoyer, 2021/06/29
- [bug#49149] [PATCH 0/7] Add deb format for guix pack., Ludovic Courtès, 2021/06/30
- [bug#49149] [PATCH v2 5/7] pack: Prevent duplicate files in tar archives.,
Maxim Cournoyer <=
- [bug#49149] [PATCH 4/7] pack: Improve naming of the packs store file names., Maxim Cournoyer, 2021/06/24
- [bug#49149] [PATCH 0/7] Add deb format for guix pack., Ludovic Courtès, 2021/06/23
[bug#49149] [PATCH 5/7] pack: Prevent duplicate files in tar archives., Maxim Cournoyer, 2021/06/21
[bug#49149] Add deb format for guix pack., jgart, 2021/06/21
[bug#49149] [PATCH 0/7] Add deb format for guix pack., Ludovic Courtès, 2021/06/23