help-bash

From: Pablo Repetto
Subject: Re: Should launching background jobs inside a loop cause a race condition?
Date: Thu, 19 Dec 2024 10:14:20 -0300

Thanks for your response, Greg! And thank you for maintaining your
wiki, it's a real gem!

What I'm trying to do (and can do, in both bash and dash) is to set up
a parallel processing pipeline with only the POSIX shell, using the
technique from the following article:

https://catern.com/posts/pipes.html

tl;dr: The form

  src | pad | { unpad | worker &
                unpad | worker &
                unpad | worker & } | sink

places the workers in contention for the contents of the pipe. The pad
and unpad functions (simple wrappers around dd) force the data stream
into a form that can be read/written atomically, feeding exactly one
line of input to each worker at a time.
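
For concreteness, here is a minimal sketch of what such wrappers could look like. It is not the article's exact code: RECORD_SIZE and its value of 512 are my own assumptions, chosen because POSIX guarantees PIPE_BUF is at least 512 bytes. Lines longer than a record are truncated by dd, and trailing spaces in a line are lost on the way back.

```shell
# Hypothetical pad/unpad helpers (a sketch, not the article's definitions).
RECORD_SIZE=512   # assumed record size; must not exceed PIPE_BUF

# pad: convert each newline-terminated line into one fixed-size record,
# padded with spaces, and write it out in RECORD_SIZE-byte blocks.
pad() {
    dd conv=block cbs="$RECORD_SIZE" obs="$RECORD_SIZE" 2>/dev/null
}

# unpad: read RECORD_SIZE-byte records, strip the trailing spaces and
# put the newline back.
unpad() {
    dd conv=unblock cbs="$RECORD_SIZE" ibs="$RECORD_SIZE" 2>/dev/null
}

# Round trip: the first command counts the padded bytes (two records),
# the second should print the two lines unchanged.
printf 'alpha\nbeta\n' | pad | wc -c
printf 'alpha\nbeta\n' | pad | unpad
```

Because every record has the same size, a reader that asks for exactly one record per read never sees part of a neighbouring line.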



Pipe atomicity is documented in the GNU libc manual, and I believe it
is also guaranteed by POSIX.

> Reading or writing pipe data is atomic if the size of data written is not 
> greater than PIPE_BUF. This means that the data transfer seems to be an 
> instantaneous unit, in that nothing else in the system can observe a state in 
> which it is partially complete. Atomic I/O may not begin right away (it may 
> need to wait for buffer space or for data), but once it does begin it 
> finishes immediately.
>
> Reading or writing a larger amount of data may not be atomic; for example, 
> output data from other processes sharing the descriptor may be interspersed. 
> Also, once PIPE_BUF characters have been written, further writes will block 
> until some characters are read.
>
> -- https://www.gnu.org/software/libc/manual/html_node/Pipe-Atomicity.html

> I/O is intended to be atomic to ordinary files and pipes and FIFOs. Atomic 
> means that all the bytes from a single operation that started out together 
> end up together, without interleaving from other I/O operations. It is a 
> known attribute of terminals that this is not honored, and terminals are 
> explicitly (and implicitly permanently) excepted, making the behavior 
> unspecified. The behavior for other device types is also left unspecified, 
> but the wording is intended to imply that future standards might choose to 
> specify atomicity (or not).
>
> -- https://pubs.opengroup.org/onlinepubs/9799919799/functions/pread.html#tag_17_476_08_01

Where POSIX says "terminals are explicitly excepted" I believe that
means either:
- that reads/writes using shell builtins (read/printf/echo) are excepted, or
- that reads/writes by the terminal/terminal emulator (not a shell) are excepted

So the reads/writes from dd, which we control with fine granularity,
are guaranteed not to mangle input.
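
That only holds while each dd block stays within PIPE_BUF. A quick way to check the limit on a given system is getconf; in this sketch the record_size variable and its value of 512 are assumptions of mine, not something taken from the article.

```shell
# Ask the system for PIPE_BUF as it applies to the current directory.
pipe_buf=$(getconf PIPE_BUF .)

record_size=512   # assumed record size used by the dd wrappers

# Compare the record size against the limit for atomic pipe writes.
if [ "$record_size" -le "$pipe_buf" ]; then
    echo "records of $record_size bytes fit within PIPE_BUF ($pipe_buf)"
else
    echo "records of $record_size bytes exceed PIPE_BUF ($pipe_buf)" >&2
fi
```

POSIX requires PIPE_BUF to be at least 512 bytes, so a 512-byte record should pass this check on any conforming system.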



Anyway, what I'm trying to figure out is whether I'm in a 'Starting a
"daemon" and checking whether it started successfully' situation, or
whether redirections into background jobs *should* happen before other
commands start running, and this is just a bug in bash and dash.



PS: Sorry if this ends up in a new thread, this is my first time using
a mailing list, and I'm not 100% sure if I'm using it correctly.


