
Re: parallel refusing to take my request


From: Ecks Hecker
Subject: Re: parallel refusing to take my request
Date: Wed, 23 Dec 2020 11:59:48 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

Thanks, John, for looking into this.

Unfortunately, your suggestion did not help. I noticed how the process
hung itself, and I could find 24 threads waiting for input from a pipe.
Apparently --pipe and -a do not go well together.
As I was still playing around with this, I was in fact using ::::
(similar to -a), but could only feed one thread this way. But now that I
split the pipe (from find to parallel) using a file and cat in between,
the warning message has vanished. This is already progress. But the 24
jobs take the same time as one single process. And since the source tree
should be cached by now and the target resides temporarily in a tmpfs,
there is no reason to expect any bottleneck other than the CPU; but
obviously, the job fails to get distributed as expected. I am going
to play around some more...
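For reference, a minimal sketch of the two modes I was mixing up, assuming the list of files sits in a file called filelist (hypothetical input):

```shell
# -a (like ::::) treats each line of filelist as a command-line
# argument, so each job gets one filename to operate on:
parallel -j4 -a filelist wc {}

# --pipe instead splits stdin into blocks and feeds each block to a
# job's stdin, so wc here counts the filelist text itself:
cat filelist | parallel -j4 --pipe wc
```

Feeding --pipe while also giving -a leaves jobs waiting on stdin, which would match the 24 hung threads I saw.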
And it is starting to become obvious: I should have read the man pages
more carefully!
The chapter "SPREADING BLOCKS OF DATA" in particular was not clear to me
before. Checking some options, I noticed that --pipe-part should be
used, with --block set to -1.
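Per that reading of the manual, the combination would look roughly like this (a sketch; note that --pipe-part only works on a real, seekable file given via -a, not on data arriving through a pipe):

```shell
# Sketch: --pipe-part splits the file at record boundaries and, with
# --block -1 and -j4, sizes each chunk as filesize/4, one chunk per job:
parallel -a filelist --pipe-part --block -1 -j4 wc
```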
Setting --block -1, however, results in an error message, showing that
there is (at least) one bug in parallel as well:
> $ cat filelist | parallel --pipe --block -1 -j4 wc
> Negative length at /usr/local/bin/parallel line 970.
>     ...propagated at /usr/local/bin/parallel line 983.
> $ cat filelist | parallel --pipe --block 200000 -j4 wc
>    1431    1431  199989
>    1626    1626  199957
>    1491    1491  199949
>    1866    1866  199958
>     128     131   11198
So at least temporarily, I will be able to spread the data, even if I
have to do it by hand for now, since --block -1 doesn't work as the
manual says: "--pipe-part starts one job per chunk - just like normal
--pipe. It first finds record endings near all block borders in the file
and then starts the jobs. By using --block -1 it will set the block size
to 1/n * size-of-file. Used this way it will start n jobs in total."
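Until --block -1 works for me, the "by hand" version is simple enough to script: compute 1/n * size-of-file myself and pass that as --block (a sketch, assuming the input sits in filelist and GNU stat is available):

```shell
jobs=4
# ceil(filesize / jobs): the block size --block -1 is documented to pick
size=$(stat -c %s filelist)
block=$(( (size + jobs - 1) / jobs ))
cat filelist | parallel --pipe --block "$block" -j"$jobs" wc
```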

Thanks for nudging me forward. ;-)

On 23.12.2020 at 01:34, John Correira wrote:
> Have you tried using the -a option to read from filelist instead of
> using cat?
>
> John




