Re: wait -n misses signaled subprocess
From: Dale R. Worley
Subject: Re: wait -n misses signaled subprocess
Date: Wed, 24 Jan 2024 11:48:03 -0500
Steven Pelley <stevenpelley@gmail.com> writes:
> wait -n
> fails to return for processes that terminate due to a signal prior to
> calling wait -n. Instead, it returns 127 with an error that the
> process id cannot be found. Calling wait <pid> (without -n) then
> returns its exit code (e.g., 143).
My understanding is that this is how "wait" is expected to work, or at
least known to work, but mostly because that's how the *kernel* works.
"wait" without -n makes a system call which means "give me information
about a terminated subprocess". The termination (or perhaps
change-of-state) reports from subprocesses are queued up in the kernel
until the process retrieves them through "wait" system calls.
OTOH, "wait" with -n makes a system call which means "give me
information about my subprocess N".
In the first case, if the subprocess N has terminated, its report is
still queued and "wait" retrieves it. In the second case, if the
subprocess N has terminated, it doesn't exist and as the manual page
says "If id specifies a non-existent process or job, the return status
is 127."
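For reference, the two return statuses mentioned here (143 and 127) can be observed with a short script. This is only a sketch, assuming bash running non-interactively; the 30-second sleep, the one-second delay, and the pid 99999999 are arbitrary illustrative choices, not taken from the original report. It deliberately does not call "wait -n", since that is the behavior under discussion.

```shell
#!/usr/bin/env bash
# Start a long-running child, then terminate it with SIGTERM
# *before* the shell ever calls wait on it.
sleep 30 &
pid=$!
kill -TERM "$pid"
sleep 1   # assumption: one second is ample time for the child to die

# wait <pid> still reports the saved status of the dead child:
# 128 + 15 (SIGTERM) = 143. The "|| status=$?" form keeps the
# script alive even if it is run under "set -e".
status=0
wait "$pid" || status=$?
echo "wait $pid returned $status"

# An id that is not a child of this shell gives 127.
# 99999999 is a hypothetical pid chosen to be above typical pid limits.
missing_status=0
wait 99999999 2>/dev/null || missing_status=$?
echo "wait on unknown pid returned $missing_status"
```

The first wait succeeds in retrieving the status even though the child is long gone by the time it is called, which matches the "wait <pid>" half of the original report.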
What you're pointing out is that this creates a race condition when the
subprocess ends before the "wait". And it seems that the kernel has
enough information to tell "wait -n N", "process N doesn't exist, but
you do have a queued termination report for it". But it's not clear
that there's a way to ask the kernel for that information without
reading all the queued termination reports (and losing the ability to
return them for other "wait" calls).
Then again, I might be wrong.
Dale