Re: Should launching background jobs inside a loop cause a race conditio
From: Pablo Repetto
Subject: Re: Should launching background jobs inside a loop cause a race condition?
Date: Thu, 19 Dec 2024 10:14:20 -0300
Thanks for your response, Greg! And thank you for maintaining your
wiki, it's a real gem!
What I'm trying to do (and can do, in both bash and dash) is to set up
a parallel processing pipeline with only the POSIX shell, using the
technique from the following article:
https://catern.com/posts/pipes.html
tl;dr: The form
src | pad | { unpad | worker &
unpad | worker &
unpad | worker & } | sink
places the workers in contention for the contents of the pipe. The pad
and unpad functions (simple wrappers around dd) force the data stream
into a form that can be read/written atomically, feeding exactly one
line of input to each worker at a time.
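For illustration, pad and unpad could be sketched roughly as follows. This
is a sketch under assumptions, not necessarily the article's exact code: the
512-byte record size and the RECORD_SIZE name are placeholders chosen here,
and it relies on the standard dd conv=block / conv=unblock conversions.

```shell
# Hypothetical record size (an assumption, not from the article).
# It must not exceed PIPE_BUF; POSIX guarantees PIPE_BUF is at least 512.
RECORD_SIZE=512

# pad: convert each newline-terminated line into one space-padded record of
# RECORD_SIZE bytes, and write the output in RECORD_SIZE-byte blocks.
# Lines longer than RECORD_SIZE are truncated by conv=block.
pad() {
    dd conv=block cbs="$RECORD_SIZE" obs="$RECORD_SIZE" 2>/dev/null
}

# unpad: read a single input block of up to RECORD_SIZE bytes (count=1),
# strip the trailing spaces and restore the newline.
# Trailing spaces that were part of the original line are lost as well.
unpad() {
    dd conv=unblock cbs="$RECORD_SIZE" ibs="$RECORD_SIZE" count=1 2>/dev/null
}
```

With these definitions, `printf 'hello\nworld\n' | pad | unpad` should emit
only the first line, since unpad consumes a single record and then exits.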
Pipe atomicity is documented in the GNU C Library manual, and I believe it
is also guaranteed by POSIX.
> Reading or writing pipe data is atomic if the size of data written is not
> greater than PIPE_BUF. This means that the data transfer seems to be an
> instantaneous unit, in that nothing else in the system can observe a state in
> which it is partially complete. Atomic I/O may not begin right away (it may
> need to wait for buffer space or for data), but once it does begin it
> finishes immediately.
>
> Reading or writing a larger amount of data may not be atomic; for example,
> output data from other processes sharing the descriptor may be interspersed.
> Also, once PIPE_BUF characters have been written, further writes will block
> until some characters are read.
>
> -- https://www.gnu.org/software/libc/manual/html_node/Pipe-Atomicity.html
> I/O is intended to be atomic to ordinary files and pipes and FIFOs. Atomic
> means that all the bytes from a single operation that started out together
> end up together, without interleaving from other I/O operations. It is a
> known attribute of terminals that this is not honored, and terminals are
> explicitly (and implicitly permanently) excepted, making the behavior
> unspecified. The behavior for other device types is also left unspecified,
> but the wording is intended to imply that future standards might choose to
> specify atomicity (or not).
>
> -- https://pubs.opengroup.org/onlinepubs/9799919799/functions/pread.html#tag_17_476_08_01
Where POSIX says "terminals are explicitly excepted" I believe that
means either:
- that reads/writes using shell builtins (read/printf/echo) are excepted, or
- that reads/writes by the terminal/terminal emulator (not a shell) are excepted
So the reads/writes from dd, whose block sizes we control precisely,
are guaranteed not to mangle input.
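Whether a given dd record size stays within the atomic limit can be checked
against PIPE_BUF with getconf. A minimal sketch, assuming a hypothetical
512-byte record size (record_size and pipe_buf are illustrative names, not
from the article):

```shell
# Hypothetical record size used for the dd blocks (an assumption).
record_size=512

# PIPE_BUF for pipes on the filesystem containing "/".
# POSIX requires this value to be at least 512.
pipe_buf=$(getconf PIPE_BUF /)

if [ "$record_size" -le "$pipe_buf" ]; then
    echo "records of $record_size bytes fit within PIPE_BUF ($pipe_buf)"
else
    echo "records of $record_size bytes exceed PIPE_BUF ($pipe_buf)" >&2
fi
```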
Anyway, what I'm trying to figure out is whether I'm in a 'Starting a
"daemon" and checking whether it started successfully' situation, or
whether redirections into background jobs *should* happen before other
commands start running, and this is just a bug in bash and dash.
PS: Sorry if this ends up in a new thread, this is my first time using
a mailing list, and I'm not 100% sure if I'm using it correctly.