
Re: How to debug `parallel` crash?


From: Ole Tange
Subject: Re: How to debug `parallel` crash?
Date: Sun, 10 Jul 2022 13:04:30 +0800

On Sun, Jul 10, 2022 at 5:35 AM Nagle, Michael F
<michael.nagle@oregonstate.edu> wrote:
>
> First, I’d like to thank the developers and community for producing GNU 
> Parallel and supporting it.

Thanks. You can help by:

• (Re-)walk through the tutorial if you have not done so in the past
year (https://www.gnu.org/software/parallel/parallel_tutorial.html)

• Give a demo at your local user group/your team/your colleagues

• Post the intro videos and the tutorial on Reddit, Mastodon,
Diaspora*, forums, blogs, Identi.ca, Google+, Twitter, Facebook,
Linkedin, and mailing lists

• Request or write a review for your favourite blog or magazine
(especially if you do something cool with GNU parallel)

• Invite me for your next conference

If you use GNU parallel for research:

• Please cite GNU parallel in your publications (use --citation)

If GNU parallel saves you money:

• (Have your company) donate to FSF or become a member
https://my.fsf.org/donate/

> I use GNU parallel for a particular part of a scientific workflow, and it 
> worked great on a previous machine. On a new machine (with many more cores), 
> I’m now having it crash sometimes and am having trouble debugging this.

If you can, you should follow:
https://www.gnu.org/software/parallel/man.html#reporting-bugs

And in your case:
https://www.gnu.org/software/parallel/man.html#bug-dependent-on-environment

In a few weeks I will have access to a 64-core AMD 512 GB server
running Ubuntu 22.04, so it should be possible to get *very* close to
the environment you experience this in.

To narrow it down, try to answer these questions:

* Can the bug be triggered reliably with multiple copies of the same
input file? Or do the input files need to be different?
* Can it be triggered by running fewer jobs in parallel?
* Can it be triggered by converting the code to `xargs -P`? (in which
case it is probably not GNU Parallel that is the root cause).
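
A minimal sketch of how these three checks could be scripted. Everything
specific in it is an assumption: `wc -c` stands in for your real per-file
command, the file names are made up, and the job counts are arbitrary.
Replace them with your own workload.

```shell
#!/bin/sh
# Sketch only: 'wc -c' is a placeholder for the real per-file job.
set -u
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Check 1: many copies of the same input file.
printf 'sample input\n' > "$workdir/orig.dat"
for i in 1 2 3 4 5 6 7 8; do
  cp "$workdir/orig.dat" "$workdir/copy$i.dat"
done

# Check 3: the same workload through xargs -P instead of GNU Parallel.
# If this also fails, GNU Parallel is probably not the root cause.
xargs_out=$(printf '%s\n' "$workdir"/copy*.dat \
  | xargs -n 1 -P 4 wc -c | wc -l | tr -d ' ')
echo "xargs -P finished $xargs_out jobs"

# Check 2: fewer jobs in parallel (skipped if parallel is not installed).
if command -v parallel >/dev/null 2>&1; then
  for jobs in 1 2 4; do
    if parallel -j "$jobs" wc -c ::: "$workdir"/copy*.dat >/dev/null; then
      echo "ok with -j $jobs"
    else
      echo "failed with -j $jobs"
    fi
  done
fi
```

If the failure only shows up above a certain `-j` value, or also shows up
with `xargs -P`, that points at the program being run (or at resource
limits on the machine) rather than at GNU Parallel.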

To help you think outside the box, see
https://github.com/tesseract-ocr/tesseract/issues/3109. It shows
Tesseract working badly when multiple copies are run in parallel. GNU
Parallel is not the root cause there, but it uncovers the problem.


/Ole


