Re: Why is `find -name '*.txt'` much slower than '*.txt' on glusterfs?
From: Peng Yu
Subject: Re: Why is `find -name '*.txt'` much slower than '*.txt' on glusterfs?
Date: Sat, 27 Jan 2018 10:39:01 -0600
> Is your find binary built with D_TYPE support?
>
> $ find --version
> find (GNU findutils) 4.6.0
> Copyright (C) 2015 Free Software Foundation, Inc.
> ...
> Features enabled: D_TYPE O_NOFOLLOW(enabled) LEAF_OPTIMISATION
> FTS(FTS_CWDFD) CBO(level=2)
> ____________________^^^^^^
$ find --version
find (GNU findutils) 4.6.0
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Eric B. Decker, James Youngman, and Kevin Dalley.
Features enabled: D_TYPE O_NOFOLLOW(enabled) LEAF_OPTIMISATION
FTS(FTS_CWDFD) CBO(level=2)
> Would you please try to reproduce this on a local file system, e.g. ext4?
It is much faster on a local file system:
$ time find -maxdepth 1 -name '*.tsv' |wc -l
8026
real 0m0.106s
user 0m0.062s
sys 0m0.060s
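
For a like-for-like comparison between find and a plain shell glob, the two counts can be taken by small helper functions and each one timed in the same directory. This is only a sketch: the function names `count_with_find` and `count_with_glob` are made up for illustration, and the two counts can differ when names start with a dot (the glob skips them, GNU find does not) or contain newlines.

```shell
#!/bin/sh
# Hypothetical helpers, not part of findutils.

# count_with_find DIR PATTERN
# Counts entries directly inside DIR whose name matches PATTERN, using find.
count_with_find() {
    # -maxdepth 1 keeps find from descending into subdirectories.
    find "$1" -maxdepth 1 -name "$2" | wc -l | tr -d ' '
}

# count_with_glob DIR PATTERN
# Counts the same entries by letting the shell expand the pattern.
count_with_glob() {
    dir=$1
    pattern=$2
    # Reuse the positional parameters to hold the expanded glob.
    # $pattern is deliberately unquoted so that the shell expands it.
    set -- "$dir"/$pattern
    # An unmatched glob is left as a literal string; report that as 0.
    if [ ! -e "$1" ] && [ ! -L "$1" ]; then
        echo 0
    else
        echo "$#"
    fi
}
```

In bash, `time count_with_find . '*.tsv'` and `time count_with_glob . '*.tsv'` can then be run on both the glusterfs mount and the local file system.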
> Finally, use "strace -v find ..." so that we see whether the 'getdents'
> system call returns D_TYPE information:
Here are the first 100 lines of the output of running `strace -ve
getdents find -maxdepth 1 -name '*.tsv'`:
https://pastebin.com/XxfFJJj4
> $ strace -ve getdents find -maxdepth 1 -name '*.tsv'
> getdents(4, [{d_ino=4276237, d_off=4278742733963192100, d_reclen=24,
> d_name=".", d_type=DT_DIR},
> {d_ino=4055085, d_off=8511941719133486354, d_reclen=24,
> d_name="..", d_type=DT_DIR},
> {d_ino=4276239, d_off=9223372036854775807, d_reclen=24,
> d_name="file", d_type=DT_REG}],
> 32768) = 72
> getdents(4, [], 32768) = 0
>
> In the end, it may turn out that either your 'find' binary is not compiled
> with D_TYPE support, or that glusterfs doesn't provide this information
> (and therefore find needs to invoke the additional newfstatat()s).
Please let me know which of these cases applies to my example.
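
To make the strace output easier to read, the `d_type` values it reports can be tallied. The helper below is a rough sketch: `summarize_d_types` is a made-up name, and it assumes the `strace -v` format shown in the quoted example and a grep that supports `-o` (GNU and BSD grep do).

```shell
#!/bin/sh
# Hypothetical helper: read `strace -v` output on stdin and print
# one line per d_type value, followed by how often it occurred.
summarize_d_types() {
    grep -o 'd_type=DT_[A-Z]*' |
        sort |
        uniq -c |
        # uniq -c prints "COUNT d_type=DT_XXX"; strip the prefix
        # and print the type name first.
        awk '{ sub(/^d_type=/, "", $2); print $2, $1 }'
}
```

strace writes its trace to stderr, so one way to feed it in is `strace -ve getdents find -maxdepth 1 -name '*.tsv' 2>&1 >/dev/null | summarize_d_types`. On some systems the call may show up as getdents64 instead. If nearly every entry is reported as DT_UNKNOWN, that would suggest the file system is not supplying type information; if DT_REG and DT_DIR dominate, the information is there.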
--
Regards,
Peng