Re: [PATCH] Support for --size in du
From: Bernhard Voelker
Subject: Re: [PATCH] Support for --size in du
Date: Thu, 17 Jan 2013 08:19:37 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130105 Thunderbird/17.0.2
On 01/17/2013 02:46 AM, Pádraig Brady wrote:
> On 01/17/2013 01:23 AM, Bernhard Voelker wrote:
>> I was pretty sure that this slipped also from Padraig's list.
>
> Sorry for the delay in this.
>
> Note it's still on the list:
> http://www.pixelbeat.org/patches/coreutils/inbox_dec_2012.html
>
> You can browse older news and subscribe to new updates at:
> http://www.pixelbeat.org/patches/coreutils/
Thanks for the links.
>> Therefore, I took Jakob's patch and amended it with documentation
>> and a comprehensive test. ;-)
>
> Wow great work on the test.
Well, that test just grew and grew. It's actually a result of
me not being 100% happy with the --size option, as in some
situations it might confuse people more than it helps.
E.g., users usually tend to "think in apparent sizes" for their
files rather than in block sizes.
Take a directory like this:
$ find tmp -exec ls -dog '{}' +
drwxr-xr-x 5 4096 Jan 17 07:28 tmp
drwxr-xr-x 2 4096 Jan 17 07:29 tmp/big_dir
-rw-r--r-- 1 104857600 Jan 17 07:29 tmp/big_dir/big_file
drwxr-xr-x 2 4096 Jan 17 07:25 tmp/empty_dir
drwxr-xr-x 2 4096 Jan 17 07:28 tmp/small_dir
-rw-r--r-- 1 6 Jan 17 07:26 tmp/small_dir/small_file
-rw-r--r-- 1 0 Jan 17 07:22 tmp/x0
-rw-r--r-- 1 1 Jan 17 07:22 tmp/x1
-rw-r--r-- 1 10 Jan 17 07:22 tmp/x2
-rw-r--r-- 1 100 Jan 17 07:22 tmp/x3
-rw-r--r-- 1 1000 Jan 17 07:22 tmp/x4
-rw-r--r-- 1 10000 Jan 17 07:22 tmp/x5
-rw-r--r-- 1 100000 Jan 17 07:22 tmp/x6
-rw-r--r-- 1 1000000 Jan 17 07:22 tmp/x7
Then filter for files and directories of at least 4000 bytes:
$ src/du -B1 -a --size=4000 tmp | sort -k2
106012672 tmp
104861696 tmp/big_dir
104857600 tmp/big_dir/big_file
4096 tmp/empty_dir
8192 tmp/small_dir
4096 tmp/small_dir/small_file
4096 tmp/x1
4096 tmp/x2
4096 tmp/x3
4096 tmp/x4
12288 tmp/x5
102400 tmp/x6
1003520 tmp/x7
This also includes small files such as tmp/x1, while it leaves out
the empty file tmp/x0 ... and yet it includes the empty directory
tmp/empty_dir. This feels somewhat counter-intuitive.
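
The effect comes from du counting allocated blocks by default rather than
the byte length of a file. A minimal sketch of that difference, assuming
GNU stat and du and a scratch directory from mktemp (the variable names
are only for illustration; the allocated value depends on the file system):

```shell
# Compare the apparent size of a 1-byte file with the space allocated for it.
dir=$(mktemp -d)
printf 'x' > "$dir/x1"

apparent=$(stat -c %s "$dir/x1")                            # byte length, as ls shows it
allocated=$(du -B1 "$dir/x1" | cut -f1)                     # bytes in allocated blocks
du_apparent=$(du -B1 --apparent-size "$dir/x1" | cut -f1)   # du's view of the byte length

echo "apparent=$apparent allocated=$allocated du_apparent=$du_apparent"
rm -rf "$dir"
```

On a file system with 4096-byte blocks, the allocated value would typically
be 4096, which is why a 1-byte file passes a threshold of 4000.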
Now let's use the "apparent size":
$ src/du -B1 -a --size=4000 --app tmp | sort -k2
105985101 tmp
104861696 tmp/big_dir
104857600 tmp/big_dir/big_file
4096 tmp/empty_dir
4102 tmp/small_dir
10000 tmp/x5
100000 tmp/x6
1000000 tmp/x7
This is much better. Well, the empty directory still shows up
here (which might be different on a different file system),
but at least the small files are gone.
That said, it seems that automatically applying --apparent-size
when -a and --size are specified would give a more "natural"
result.
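
For comparison, a similar filter can be approximated without the patch by
post-processing du's output. This is only a sketch, assuming GNU du, head
and awk; it is not how the patch implements --size, and the scratch files
are made up for the example:

```shell
# Emulate "du -a --size=4000 --apparent-size" with an awk filter.
dir=$(mktemp -d)
printf 'x' > "$dir/x1"                  # 1 byte
head -c 10000 /dev/zero > "$dir/x5"     # 10000 bytes

# Keep only entries whose apparent size is at least 4000 bytes.
filtered=$(du -B1 -a --apparent-size "$dir" | awk '$1 >= 4000' | sort -k2)
printf '%s\n' "$filtered"
rm -rf "$dir"
```

Dropping --apparent-size from the du call gives the block-based behaviour,
where the 1-byte file may reappear depending on the block size.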
In practice, users will probably only search for huge files
and directories, i.e. ones much larger than the file system's
block size, but even then they'd be caught out by forgetting the
--app option when it comes to sparse files:
$ src/truncate --size=1T tmp/sparse-1T
$ src/du -h -a --size=100M tmp
100M tmp/big_dir/big_file
101M tmp/big_dir
102M tmp
$ src/du -h -a --size=100M --app tmp
100M tmp/big_dir/big_file
101M tmp/big_dir
1.0T tmp/sparse-1T
1.1T tmp
The only way out of this confusion (probably only mine) would
be to disallow combining the -a and --size options.
But this would artificially restrict the user's flexibility.
Does anyone else feel the same way?
> I wonder would it make sense to have consistent --size
> handling for du and truncate. I.E. have --size='<10M'
> specify the max size and --size='>10M' specify the min size?
I personally do not like shell-special characters in optargs
too much, as many users will forget to put them in quotes;
--size=<10M may not be a great problem, but --size=>10M
may destroy data.
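
To illustrate the hazard, here is a small sketch that uses printf as a
harmless stand-in for du; the file name 10M and the scratch directory are
made up for the example:

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'precious data\n' > 10M     # a file that happens to be named 10M

# Unquoted: the shell parses ">10M" as a redirection, so the command
# receives only "--size=" and its output overwrites the file 10M.
printf '%s\n' --size=>10M
after_unquoted=$(cat 10M)

# Quoted: the whole string reaches the command as a single argument.
quoted=$(printf '%s\n' '--size=>10M')

echo "file 10M now contains: $after_unquoted"
echo "quoted argument: $quoted"
cd - >/dev/null
rm -rf "$dir"
```

The unquoted '<' form would only try to read from a file named 10M, which
fails if it does not exist but does not overwrite anything.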
I was rather thinking of making it more consistent with
"find tmp -size +10M", or even of teaching find a new -csize
(cumulative size) option ... as finding big directories was
the original problem. On the other hand, 'find' doesn't offer
the flexibility to filter based on the block size, i.e. it
would always include huge sparse files although these do
not fill up the file system.
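
A short sketch of that limitation, assuming GNU find, truncate and du; the
file name sparse-1G is made up, and the allocated size that du prints
depends on the file system:

```shell
dir=$(mktemp -d)
truncate --size=1G "$dir/sparse-1G"     # sparse: large length, few or no blocks

# find -size compares the file length, so the sparse file matches.
found=$(find "$dir" -type f -size +10M)
printf '%s\n' "$found"

# du without --apparent-size reports what is actually allocated.
du -h "$dir/sparse-1G"
du -h --apparent-size "$dir/sparse-1G"
rm -rf "$dir"
```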
Maybe the current implementation is still the better way ...
Have a nice day,
Berny
P.S. Thanks for reading down here. ;-)