[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Combination of "eval set -- ..." and $() command substitution is slow
From: |
astian |
Subject: |
Combination of "eval set -- ..." and $() command substitution is slow |
Date: |
Wed, 10 Jul 2019 17:21:00 +0000 |
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -g -O2 -fdebug-prefix-map=/tmp/bash/bash-5.0=.
-fstack-protector-strong -Wformat -Werror=format-security -Wall
-Wno-parentheses -Wno-format-security
uname output: Linux $nodename 4.9.0-9-amd64 #1 SMP Debian 4.9.168-1+deb9u2
(2019-05-13) x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu
Bash Version: 5.0
Patch Level: 3
Release Status: release
Description:
I discovered a curious performance degradation in the combined usage of the
constructs "eval set -- ..." and new-style command substitution. In short,
setting the positional arguments via eval and then iterating over each one
while performing $() command substitution(s) is significantly slower than
not using eval, or not making command substitution, or using `` instead.
I include below a reduced test script that illustrates the issue. A few
notes:
- The pathological case is "1 1 0".
- I did not observe performance difference in unoptimised builds (-O0).
Example output:
case 0 0 X
set
real 0m0.001s
user 0m0.000s
sys 0m0.000s
for loop
real 0m0.008s
user 0m0.008s
sys 0m0.000s
--------------------------
case 0 1 0
set
real 0m0.001s
user 0m0.000s
sys 0m0.000s
for loop cmdsubst-currency
real 0m0.551s
user 0m0.048s
sys 0m0.144s
--------------------------
case 0 1 1
set
real 0m0.001s
user 0m0.000s
sys 0m0.000s
for loop cmdsubst-grave
real 0m0.520s
user 0m0.048s
sys 0m0.116s
--------------------------
case 1 0 X
eval set
real 0m0.002s
user 0m0.004s
sys 0m0.000s
for loop
real 0m0.008s
user 0m0.008s
sys 0m0.000s
--------------------------
case 1 1 0
eval set
real 0m0.002s
user 0m0.000s
sys 0m0.000s
for loop cmdsubst-currency
real 0m0.968s
user 0m0.432s
sys 0m0.148s
--------------------------
case 1 1 1
eval set
real 0m0.002s
user 0m0.004s
sys 0m0.000s
for loop cmdsubst-grave
real 0m0.529s
user 0m0.044s
sys 0m0.124s
--------------------------
Observations:
- The pathological case "1 1 0" spends about 10 times more time doing
something in userspace during the loop, relative to the comparable cases
"0 1 0", "0 1 1", and "1 1 1".
- $() seems generally slightly slower than ``, but becomes pathologically
so when preceded with "eval set -- ...".
- "eval set -- ..." itself doesn't seem slow at all, but obviously it has
side-effects not captured by the "time" measurement tool.
I tested an --enable-profiling build but the results are vastly different
between test cases so I wasn't able to gain much insight by comparing them.
(Also I'm inexperienced with gprof.) For reference, though, these are the
top 10 functions in the flat listing for 3 of the test cases:
case 0 1 0:
% cumulative self self total
time seconds seconds calls ms/call ms/call name
18.75 0.09 0.09 4801 0.02 0.02 make_child
13.54 0.16 0.07 1200 0.05 0.05 reap_dead_jobs
13.54 0.22 0.07 sigchld_handler
6.25 0.25 0.03 38586 0.00 0.00
read_token.constprop.9
6.25 0.28 0.03 7259 0.00 0.04 param_expand
4.17 0.30 0.02 9609 0.00 0.01 yyparse
2.08 0.31 0.01 28910 0.00 0.00 assignment
2.08 0.32 0.01 28888 0.00 0.00 set_line_mbstate
2.08 0.33 0.01 14417 0.00 0.00 set_signal_handler
2.08 0.34 0.01 9643 0.00 0.00 zread
case 1 1 1:
% cumulative self self total
time seconds seconds calls us/call us/call name
31.03 0.09 0.09 4801 18.75 18.75 make_child
15.52 0.14 0.05 1200 37.50 37.50 reap_dead_jobs
15.52 0.18 0.05 sigchld_handler
6.90 0.20 0.02 9889 2.02 2.02 hash_search
3.45 0.21 0.01 14464 0.69 0.69 dequote_string
3.45 0.22 0.01 8461 1.18 16.14 param_expand
3.45 0.23 0.01 4808 2.08 3.12 run_sigchld_trap
3.45 0.24 0.01 4801 2.08 23.05 command_substitute
3.45 0.25 0.01 4801 2.08 2.09 wait_for
3.45 0.26 0.01 4800 2.08 2.08 string_extract
case 1 1 0 (pathological):
% cumulative self self total
time seconds seconds calls us/call us/call name
38.89 0.21 0.21 28890 7.27 7.27 set_line_mbstate
7.41 0.25 0.04 9610 4.16 22.19 yyparse
6.48 0.29 0.04 1200 29.17 29.17 reap_dead_jobs
6.48 0.32 0.04 sigchld_handler
5.56 0.35 0.03 4801 6.25 6.25 make_child
3.70 0.37 0.02 10848 1.84 1.84
builtin_address_internal
1.85 0.38 0.01 98462 0.10 0.10 unwind_protect_mem
1.85 0.39 0.01 56852 0.18 0.18 make_word_list
1.85 0.40 0.01 9656 1.04 1.04
unquoted_glob_pattern_p
1.85 0.41 0.01 9622 1.04 3.27 parse_matched_pair
Anyway, as you can see I already found a workaround (`` instead of $()) that
restores most of the lost performance, so I'm not overly concerned with
getting to the bottom of this. I just thought it was curious enough to
deserve being reported.
Thanks.
Repeat-By:
-----8<--------------- begin test.sh ----------------------
f()
{
[ $# -eq 3 ] || return
local doeval=$1
local docmdsubst=$2
local docmdsubstgrave=$3
local a=$(seq -s ' ' 1200)
if [ "$doeval" -eq 1 ]; then
echo -n eval set
time eval "set -- $a"
else
echo -n set
time set -- $a
fi
local i
if [ "$docmdsubst" -eq 1 ]; then
if [ "$docmdsubstgrave" -eq 1 ]; then
echo -n for loop cmdsubst-grave
time for i; do
printf '%s\n' "`printf %s "$i"`"
done >/dev/null
else
echo -n for loop cmdsubst-currency
time for i; do
printf '%s\n' "$(printf %s "$i")"
done >/dev/null
fi
else
echo -n for loop
time for i; do
printf '%s\n' "$i"
done >/dev/null
fi
}
[ $# -eq 3 ] || exit 1
echo case $1 $2 $3
f $1 $2 $3
echo --------------------------
---------------------- end test.sh -------------->8------
-----8<--------------- begin main.sh ----------------------
BASH_PROGRAM=${1:-bash}
PROCESS_PROFILE_DATA=$2
for I in 0 1; do for J in 0 1; do for K in 0 1; do
if [ $J -eq 0 ]; then
# K becomes a "don't care" condition
if [ $K -eq 0 ]; then
K=X
else
continue
fi
fi
$BASH_PROGRAM test.sh $I $J $K
if [ -n "$PROCESS_PROFILE_DATA" ]; then
mv gmon.out gmon.out.$I-$J-$K
gprof $BASH_PROGRAM gmon.out.$I-$J-$K >
gmon.out.$I-$J-$K.table
fi
done; done; done
---------------------- end main.sh -------------->8------
Run as "bash main.sh", for example.