bug-tar

Race condition between --remove-files and input stream


From: Christopher Harrison
Subject: Race condition between --remove-files and input stream
Date: Thu, 3 Dec 2020 18:16:42 +0000
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:78.0) Gecko/20100101 Thunderbird/78.5.0

Due to a dodgy SQL query, I discovered a curious feature of GNU tar:

   $ touch foo
   $ tar cf foo.tar foo foo
   $ tar tvf foo.tar
   -rw-r--r-- ch12/team166      0 2020-12-02 17:34 foo
   hrw-r--r-- ch12/team166      0 2020-12-02 17:34 foo link to foo

I.e., you can add the same file to a tarball more than once, and tar will store each repeat as a hard link to the first entry. I don't really know what the point of this is, but I digress.

The failure mode I encountered combines this feature with --remove-files, where the input files are piped in via xargs:

   some-process | xargs tar cf foo.tar --remove-files

If the duplicated file paths are separated by a sufficiently large number of other paths, they may end up on opposite sides of an input buffer boundary. (At least, this is my working theory.) Something like this:

   file1
   lots
   of
   other
   -- BUFFER PARTITION --
   stuff
   file1

Here, "file1" will be deleted at the partition point, presumably when the buffer is flushed, so when it's reached again in a subsequent partition, it can't be added to the tar file as a link because it no longer exists. At this point, tar complains with a "Cannot stat" error:

   tar: /path/to/really/long/filename: Cannot stat: No such file or directory
   tar: Exiting with failure status due to previous errors
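
To check whether a given path list actually contains the repeats described above, something like the following may help. It is only a sketch: `list_duplicates` is a hypothetical helper name, and it assumes one path per line with no embedded newlines.

```shell
# list_duplicates: hypothetical helper (not part of tar or xargs).
# Reads paths on stdin, one per line, and prints each path that
# occurs more than once. Assumes paths contain no newlines.
list_duplicates() {
  sort | uniq -d
}

# Usage, with some-process standing in for whatever generates the list:
#   some-process | list_duplicates
```

An empty result would suggest the list has no repeated paths, in which case the failure mode described here should not apply.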

Steps to replicate:

   declare FILE1="$(pwd)/$(dd if=/dev/urandom | tr -cd 'a-zA-Z0-9' | head -c100)"
   declare FILE2="$(pwd)/$(dd if=/dev/urandom | tr -cd 'a-zA-Z0-9' | head -c100)"
   touch "$FILE1" "$FILE2"
   printf "%s\n" "$FILE1" "$FILE2" "$FILE2" "$FILE1" "$FILE2" "$FILE1" \
   | xargs tar cPzf test.tar.gz --remove-files
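
A possible workaround, assuming the hard-link entries for repeated paths are not needed in the archive, is to drop repeats before the list reaches xargs. This is a minimal sketch rather than a fix for the underlying behaviour: `dedup_paths` is a hypothetical helper name, and it assumes one path per line with no embedded newlines.

```shell
# dedup_paths: hypothetical helper (not part of tar or xargs).
# Reads paths on stdin, one per line, and prints each path only the
# first time it appears, preserving the original order.
dedup_paths() {
  awk '!seen[$0]++'
}

# Usage, mirroring the replication steps above:
#   printf "%s\n" "$FILE1" "$FILE2" "$FILE2" "$FILE1" "$FILE2" "$FILE1" \
#   | dedup_paths | xargs tar cPzf test.tar.gz --remove-files
```

With repeats removed, each path is handed to tar exactly once, so no path should be looked up again after it has been deleted. The resulting archive will simply lack the "link to" entries.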

--
Thanks;
Chris



