Re: datamash performance question
From: Erik Auerswald
Subject: Re: datamash performance question
Date: Sat, 26 Jun 2021 19:36:06 +0200
User-agent: Mutt/1.5.21 (2010-09-15)
Hi,
On Fri, Jun 25, 2021 at 05:36:26PM -0400, Jake VanEck wrote:
> Any way to run datamash in parallel?
You can try GNU parallel (https://www.gnu.org/software/parallel/) or
xargs --max-procs to start several GNU datamash processes.
GNU parallel should give you more control over how to provide input to
the GNU datamash processes and might be a better fit than xargs.
I can only offer these pointers because I have not used GNU parallel
myself. I expect the complicated part to be dividing one input stream
into several for independent processes, and then combining their outputs
to produce the end result.
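To illustrate that split-and-combine step, here is a minimal sketch in
Python rather than with GNU parallel or GNU datamash themselves. It assumes
tab-separated "key<TAB>value" input lines and a grouped sum as the
operation; the function names (split_chunks, partial_sums, grouped_sum) are
made up for this example and are not part of any tool mentioned above.

```python
from collections import defaultdict
from concurrent.futures import Executor


def split_chunks(lines, n_chunks):
    """Deal the input lines round-robin into n_chunks lists, one per worker."""
    lines = list(lines)
    return [lines[i::n_chunks] for i in range(n_chunks)]


def partial_sums(chunk):
    """Compute per-key sums for one chunk of 'key<TAB>value' lines."""
    sums = defaultdict(float)
    for line in chunk:
        key, value = line.rstrip("\n").split("\t")
        sums[key] += float(value)
    return dict(sums)


def grouped_sum(lines, executor: Executor, n_chunks=4):
    """Split the input, sum each chunk on the executor, then merge the results.

    Merging works by simple addition only because partial sums can be added;
    this is an assumption about the chosen operation, not a general rule.
    """
    totals = defaultdict(float)
    for part in executor.map(partial_sums, split_chunks(lines, n_chunks)):
        for key, value in part.items():
            totals[key] += value
    return dict(totals)
```

Passing a concurrent.futures.ProcessPoolExecutor as the executor runs each
chunk in a separate process; a ThreadPoolExecutor works too for trying it
out. Note that this only works so easily because sums of chunks can be
added up. Other operations, a median for example, cannot be combined from
per-chunk results this simply.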
HTH,
Erik
--
Do things that have never been done before.
-- Russell Kirsch