wait -n misses signaled subprocess
From: Steven Pelley
Subject: wait -n misses signaled subprocess
Date: Mon, 22 Jan 2024 11:30:51 -0500
Hello,
I've encountered what I believe is a bug in bash's "wait -n". wait -n
fails to report processes that were terminated by a signal before
wait -n was called. Instead, it returns 127 with an error that the
process id cannot be found. Calling wait <pid> (without -n) afterwards
returns the process's exit code (e.g., 143). I expect successive calls
to wait -n to return each process in turn, which is what happens for
processes that terminate in other ways, even before wait -n is called.
Killing a process while wait -n is actively blocking works correctly.
Test script at bottom.
I ran into this while trying to coordinate my own cooperative exit and
to handle/propagate SIGTERM. If I propagate SIGTERM by killing multiple
processes at once (kill pid1 pid2 pid3 ...), the first call to wait -n
returns 143 and reports a pid (via -p), but the following call to
wait -n returns 127 because all of the processes have already
terminated. If any of the awaited processes has not yet terminated,
you only discover the previously killed processes when the next one
terminates. I have workarounds and am not blocked, but this seems like
a reasonable use case and worth sharing.
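For illustration only, one possible workaround for the situation above is
to reap each signaled child by explicit pid with plain "wait <pid>", which
the report says does return the expected status. This is a minimal sketch
and not necessarily the workaround the author uses; the three "sleep 30"
workers and the variable names pids and statuses are hypothetical.

```shell
# Sketch of one possible workaround (an assumption, not the author's own):
# after signaling several children at once, reap each by explicit pid
# with plain `wait <pid>` instead of relying on `wait -n`.
pids=""
for i in 1 2 3; do
  sleep 30 &            # stand-in for a real worker process
  pids="$pids $!"
done

# Propagate the signal to all children in one call, as in the report.
# $pids is intentionally unquoted so it splits into separate arguments.
kill -TERM $pids

statuses=""
for pid in $pids; do
  wait "$pid"           # blocks until this child is reaped
  statuses="$statuses $?"
done
echo "statuses:$statuses"
```

The trade-off is that this reaps children in a fixed order rather than in
completion order, so it only fits cases where every child is expected to
exit promptly after the signal.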
I've tried:
- killing with SIGTERM and SIGALRM
- killing from the test script, a subshell, and another terminal (I
  don't believe this is related to kill being a builtin)
- enabling job control (set -m)
- bash versions 4.4.12, 5.2.15, and 5.2.21, all on Linux arm64
Test script:
# change to test other signals
sig=TERM
echo "TEST: KILL PRIOR TO wait -n @${SECONDS}"
{ sleep 1; exit 1; } &
pid=$!
echo "kill -$sig $pid @${SECONDS}"
kill -$sig $pid
sleep 2
wait -n $pid
echo "wait -n $pid return code $? @${SECONDS} (BUG)"
wait $pid
echo "wait $pid return code $? @${SECONDS}"
echo "TEST: KILL DURING wait -n @${SECONDS}"
{ sleep 2; exit 1; } &
pid=$!
{ sleep 1; echo "kill -$sig $pid @${SECONDS}"; kill -$sig $pid; } &
wait -n $pid
echo "wait -n $pid return code $? @${SECONDS}"
wait $pid
echo "wait $pid return code $? @${SECONDS}"
For which I get the following example output:
TEST: KILL PRIOR TO wait -n @0
kill -TERM 1384 @0
./test.sh: line 14: 1384 Terminated { sleep 1; exit 1; }
wait -n 1384 return code 127 @2 (BUG)
wait 1384 return code 143 @2
TEST: KILL DURING wait -n @2
kill -TERM 1402 @3
./test.sh: line 25: 1402 Terminated { sleep 2; exit 1; }
wait -n 1402 return code 143 @3
wait 1402 return code 143 @3
I expect the line ending (BUG) to indicate a return code of 143.
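For readers unfamiliar with the 143 value: the shell reports a child
killed by a signal as 128 plus the signal number, and SIGTERM is 15. The
short sketch below decodes such a status; the variable names status,
signum and signame are illustrative and not part of the test script.

```shell
# Decode a wait status: values above 128 mean "killed by signal (status - 128)".
status=143                          # value reported by `wait` in the output above
if [ "$status" -gt 128 ]; then
  signum=$((status - 128))          # 143 - 128 = 15
  signame=$(kill -l "$status")      # kill -l maps an exit status to a signal name
  echo "terminated by signal $signum (SIG$signame)"
else
  echo "exited with status $status"
fi
```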
Thanks,
Steve Pelley