Re: Bash script for setting off concurrent psql queries
From: Greg Wooledge
Subject: Re: Bash script for setting off concurrent psql queries
Date: Sun, 1 May 2022 18:30:46 -0400
On Sun, May 01, 2022 at 10:32:01PM +0100, Shaozhong SHI wrote:
> What is the best way to set off concurrent psql queries. I am sure that
> there must be a variety of ways to manage concurrent psql/postgres queries.
Running multiple background processes in parallel is simple. Just
end each of them with & to make them background jobs.
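
A minimal sketch of that, assuming a hypothetical run_query function that stands in for the real psql call (its body and the query strings are placeholders, not anything from the original question). Output is discarded here so that only the backgrounding itself is shown:

```shell
#!/usr/bin/env bash
# run_query is a hypothetical stand-in; in a real script its body would be
# something like:  psql -c "$1"
run_query() {
    sleep 1
    echo "result of: $1"
}

# The trailing & puts each command in the background; $! is its PID.
run_query 'select 1' > /dev/null &
pid1=$!
run_query 'select 2' > /dev/null &
pid2=$!

# wait PID blocks until that job exits and returns its exit status.
wait "$pid1"; status1=$?
wait "$pid2"; status2=$?
echo "job 1 exited with $status1, job 2 exited with $status2"
```

Both jobs run at the same time, so the script takes about as long as the slower of the two rather than the sum.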
But I have an issue with even running *one* psql query from a bash
script, let alone several. The issue is this: given a query that
returns more than one field, how does bash know where each field
begins and ends, within the psql output stream?
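
To illustrate the problem without needing a database, here is a sketch that fakes one row of output. It assumes the row came from something like psql -At -F'|' (unaligned, tuples only, "|" as the field separator) run against a hypothetical two-column table, where the second column happens to contain a literal "|":

```shell
#!/usr/bin/env bash
# Simulated output row: id = 42, note = "left|right" (two columns, not three).
row='42|left|right'

# Splitting on the separator cannot tell data from delimiter.
IFS='|' read -r id note extra <<< "$row"
echo "id=$id note=$note extra=$extra"
```

The note column is cut in half and its tail lands in a third variable. Picking a different separator only moves the problem to whichever character the data contains next.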
On top of this, you have the inherent complexity of managing several
parallel processes, each of which is feeding you a separate data
stream. You need some way to identify which stream is which, and
read it.
When people run a *single* data-producing command from a bash script,
they will typically either set it up as part of a pipeline, or as part
of a command substitution. I.e.
    psql 'stuff' | some parser
or
    somevariable=$(psql 'stuff')
But with parallel processes, you can't do that. The cleanest way to
manage parallel data streams is to redirect each one to a separate
temporary file, and then read each temporary file once you've learned
that its producing command has finished.
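
A rough sketch of the temporary-file approach, again with a hypothetical run_query function in place of psql, and with file names chosen only for illustration:

```shell
#!/usr/bin/env bash
# run_query is a hypothetical stand-in for a real psql invocation.
run_query() { echo "result of: $1"; }

# One private directory holds one output file per background job.
tmpdir=$(mktemp -d) || exit 1
trap 'rm -rf "$tmpdir"' EXIT

run_query 'query A' > "$tmpdir/a.out" &
pid_a=$!
run_query 'query B' > "$tmpdir/b.out" &
pid_b=$!

# Read each file only after its producer has finished successfully.
wait "$pid_a" && out_a=$(< "$tmpdir/a.out")
wait "$pid_b" && out_b=$(< "$tmpdir/b.out")

echo "A gave: $out_a"
echo "B gave: $out_b"
```

This version waits for the jobs in a fixed order, which is fine when every result is needed before continuing. It says nothing about how to parse what ends up in the files.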
Setting up a loop to manage that is not quite as simple as it sounds,
but not *too* onerous. Bash's "wait -n -p VAR" could be helpful here.
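
One possible shape for such a loop, as a simplified sketch rather than a complete solution. It assumes bash 5.1 or newer (where "wait -p" is available) and uses a hypothetical run_query function that sleeps for a given number of seconds instead of calling psql. It does not try to cover every edge case, such as jobs that exit while the loop body is still running:

```shell
#!/usr/bin/env bash
# Requires bash 5.1+ for "wait -p". run_query is a hypothetical stand-in
# for psql: it sleeps for $2 seconds and then prints one line.
run_query() { sleep "$2"; echo "result of: $1"; }

tmpdir=$(mktemp -d) || exit 1
declare -A outfile=()          # maps each job's PID to its output file

n=0
for spec in 'A 3' 'B 1' 'C 2'; do
    read -r name secs <<< "$spec"
    run_query "$name" "$secs" > "$tmpdir/$n.out" &
    outfile[$!]=$tmpdir/$n.out
    n=$((n + 1))
done

order=''
while ((${#outfile[@]} > 0)); do
    pid=''
    wait -n -p pid             # block until any one job exits; PID goes in pid
    status=$?
    [[ -n $pid ]] || break     # no child was found to wait for
    data=$(< "${outfile[$pid]}")
    echo "pid $pid exited with status $status: $data"
    order+="${data##* } "      # record the query name, in completion order
    unset "outfile[$pid]"
done
rm -rf "$tmpdir"
```

Each result is handled as soon as its job finishes, so the jobs are processed in completion order (B, C, A here) rather than launch order.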
However, the original objection -- that parsing psql's output in a shell
script is not feasible in the general case -- makes the rest of this
academic.