bug-coreutils

du counting files in hard-linked directories twice


From: Michael Schwarz
Subject: du counting files in hard-linked directories twice
Date: Thu, 27 Aug 2009 18:09:11 +0200

Hello everybody

What I see as a bug is that, in some circumstances, du counts the same files multiple times when they appear inside hard-linked directories.

I have tested this with GNU coreutils 7.4 (MacPorts) and 7.5 (vanilla) on Mac OS X 10.5.8 on x86-64. All tests were run on an HFS+ volume formatted as case-insensitive.

Judging from du.c, the problem seems to be that du only inserts a file into its table of visited files if its link count is greater than one (du.c, lines 513-517):

  if (skip
      || (!opt_count_all
          && ! S_ISDIR (sb->st_mode)
          && 1 < sb->st_nlink
          && ! hash_ins (sb->st_ino, sb->st_dev)))

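Replacing the 1 in the condition above with a 0 solves the problem, because every non-directory is then recorded, including a file with a single link that is reachable through two links to its parent directory. It would probably be a much better idea, though, to insert hard-linked directories into the hash table instead. This would also prevent du from counting the size of directory inodes twice.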
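
To illustrate the effect of that condition, here is a toy model of it as a shell function around awk. It is only a sketch and not du's actual code: the function name simulate_du and the input format (one line per visited non-directory entry, giving inode, link count and size in KiB) are made up for this example, and directories and device numbers are left out entirely.

```shell
#!/bin/sh
# Toy model of the duplicate check quoted above; NOT the real du logic.
# Hypothetical input, one line per non-directory entry that is visited:
#   <inode> <link count> <size in KiB>
# $1 is the value compared against the link count
# (1 as in the quoted condition, 0 for the suggested change).
simulate_du() {
        awk -v min="$1" '
                # Entries with more links than "min" are remembered by inode.
                # A remembered entry that was already seen is skipped, which
                # mirrors "min < st_nlink && ! hash_ins (...)".
                (min + 0 < $2 + 0) && seen[$1]++ { next }
                # Everything else is added to the total.
                { total += $3 }
                END { print total + 0 }
        '
}
```

Feeding it the same inode twice with a link count of 1, as happens in the second test, e.g. printf '%s\n' '12382236 1 1024' '12382236 1 1024' | simulate_du 1, sums the file twice, while simulate_du 0 on the same input sums it once.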

Appended is a simple script which runs three tests:
        * One where there is a single file with two links.
        * One where there is a directory with two links containing one file.
        * One where there is a directory with two links containing two links to a single file.

After the script is its output on my machine and the output of ls -ilR for reference.

In each of the tests there is exactly one file, 1 MiB in size. Only in the second test does du report a total size of 2 MiB.
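
For anyone who wants to cross-check the expected total independently of du, the following is a minimal sketch that sums file sizes while counting each (device, inode) pair only once. It assumes GNU find is available for -printf (it is not part of coreutils), the function name unique_bytes is made up for this example, and it adds up apparent sizes in bytes rather than allocated disk blocks, so it will not match du exactly for sparse or compressed files.

```shell
#!/bin/sh
# Sum the sizes of all regular files below directory $1, counting every
# (device, inode) pair only once. Assumes GNU find (for -printf).
unique_bytes() {
        # %D = device number, %i = inode number, %s = size in bytes
        find "$1" -type f -printf '%D:%i %s\n' |
        awk '!seen[$1]++ { total += $2 } END { print total + 0 }'
}
```

Run as unique_bytes test-d2-f1, this should report the size of the single file once, regardless of how many paths lead to it.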

I hope you agree that this is a bug in du and that I did not overlook a problem on my side in evaluating this. I will gladly help in any way I can to fix this problem, e.g. by testing patches for people who do not have a setup on which it can be reproduced.

Thank you
Michael


$ cat test.sh
#! /usr/bin/env bash

set -e -o pipefail

# Test with a single file with two links.
(
        echo 'Test with file hard links:'
        rm -rf 'test-d1-f2'
        mkdir 'test-d1-f2'
        cd 'test-d1-f2'
        
        mkdir 'top-1'
        mkdir 'top-2'
        
        dd bs=1k count=1k < /dev/zero > 'top-1/file' 2> /dev/null
        link 'top-1/file' 'top-2/file'
        du -h
        echo
)

# Test with a directory with two links containing a single file with one link.
(
        echo 'Test with directory hard links:'
        rm -rf 'test-d2-f1'
        mkdir 'test-d2-f1'
        cd 'test-d2-f1'
        
        # Because of a restriction in HFS+ or Mac OS, we need to put the two
        # directory entries for the hard-linked directory into separate directories.
        mkdir 'top-1'
        mkdir 'top-2'
        
        mkdir 'top-1/dir'
        link 'top-1/dir' 'top-2/dir'
        dd bs=1k count=1k < /dev/zero > 'top-1/dir/file' 2> /dev/null
        du -h
        echo
)

# Test with a directory with two links containing a single file with two links.
(
        echo 'Test with file and directory hard links:'
        rm -rf 'test-d2-f2'
        mkdir 'test-d2-f2'
        cd 'test-d2-f2'
        
        mkdir 'top-1'
        mkdir 'top-2'
        
        mkdir 'top-1/dir'
        link 'top-1/dir' 'top-2/dir'
        dd bs=1k count=1k < /dev/zero > 'top-1/dir/file-1' 2> /dev/null
        link 'top-1/dir/file-1' 'top-1/dir/file-2'
        du -h
        echo
)
$ ./test.sh
Test with file hard links:
1.0M    ./top-1
0       ./top-2
1.0M    .

Test with directory hard links:
1.0M    ./top-1/dir
1.0M    ./top-1
1.0M    ./top-2/dir
1.0M    ./top-2
2.0M    .

Test with file and directory hard links:
1.0M    ./top-1/dir
1.0M    ./top-1
0       ./top-2/dir
0       ./top-2
1.0M    .

$ ls -ilR
.:
total 4
12382224 drwxr-xr-x 4 michi michi  136 Aug 27 18:06 test-d1-f2
12382230 drwxr-xr-x 4 michi michi  136 Aug 27 18:06 test-d2-f1
12382237 drwxr-xr-x 4 michi michi  136 Aug 27 18:06 test-d2-f2
12382194 -rwxr-xr-x 1 michi michi 1207 Aug 27 18:06 test.sh

./test-d1-f2:
total 0
12382225 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 top-1
12382226 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 top-2

./test-d1-f2/top-1:
total 1024
12382227 -rw-r--r-- 2 michi michi 1048576 Aug 27 18:06 file

./test-d1-f2/top-2:
total 1024
12382227 -rw-r--r-- 2 michi michi 1048576 Aug 27 18:06 file

./test-d2-f1:
total 0
12382231 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 top-1
12382232 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 top-2

./test-d2-f1/top-1:
total 0
12382233 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 dir

./test-d2-f1/top-1/dir:
total 1024
12382236 -rw-r--r-- 1 michi michi 1048576 Aug 27 18:06 file

./test-d2-f1/top-2:
total 0
12382233 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 dir

./test-d2-f1/top-2/dir:
total 1024
12382236 -rw-r--r-- 1 michi michi 1048576 Aug 27 18:06 file

./test-d2-f2:
total 0
12382238 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 top-1
12382239 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 top-2

./test-d2-f2/top-1:
total 0
12382240 drwxr-xr-x 4 michi michi 136 Aug 27 18:06 dir

./test-d2-f2/top-1/dir:
total 2048
12382243 -rw-r--r-- 2 michi michi 1048576 Aug 27 18:06 file-1
12382243 -rw-r--r-- 2 michi michi 1048576 Aug 27 18:06 file-2

./test-d2-f2/top-2:
total 0
12382240 drwxr-xr-x 4 michi michi 136 Aug 27 18:06 dir

./test-d2-f2/top-2/dir:
total 2048
12382243 -rw-r--r-- 2 michi michi 1048576 Aug 27 18:06 file-1
12382243 -rw-r--r-- 2 michi michi 1048576 Aug 27 18:06 file-2
$


