GNU Parallel Bug Reports memfree option and memory starvation
From: Olivier Bilodeau
Subject: GNU Parallel Bug Reports memfree option and memory starvation
Date: Fri, 24 Feb 2017 22:23:09 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.7.1

Hi,
The --memfree documentation states:
> If the jobs take up very different amount of RAM, GNU parallel will
> only start as many as there is memory for. If less than size bytes
> are free, no more jobs will be started. If less than 50% size bytes
> are free, the youngest job will be killed, and put back on the queue
> to be run later.
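
The rule quoted above can be read as a three-way decision on the amount of free memory. The following is only a sketch of that reading, not GNU Parallel's actual implementation; the function name memfree_action and the returned labels are made up for illustration.

```python
GIB = 1024 ** 3


def memfree_action(free_bytes: int, size_bytes: int) -> str:
    """Classify free memory against a --memfree size, per the quoted text.

    Hypothetical helper: the name and return labels are assumptions,
    not part of GNU Parallel.
    """
    if free_bytes < 0.5 * size_bytes:
        # "If less than 50% size bytes are free, the youngest job will be
        # killed, and put back on the queue to be run later."
        return "kill_youngest"
    if free_bytes < size_bytes:
        # "If less than size bytes are free, no more jobs will be started."
        return "hold"
    return "start"


if __name__ == "__main__":
    for free in (3 * GIB, 1536 * 1024 ** 2, 512 * 1024 ** 2):
        print(free, memfree_action(free, 2 * GIB))
```

With --memfree 2G, this reading starts new jobs at 2G free or more, holds them between 1G and 2G, and kills the youngest job below 1G.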
However, without the --retries option this does not seem to be the case: killed jobs are not put back on the queue.
The memory test uses a Python script (see attached use-mem.py).
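
The attachment is not reproduced in this archive page. For readers who want to try something similar, here is a hypothetical stand-in, not the attached script: it allocates a block of memory, holds it for a while, then prints a line in the same "seq N / pid P" shape as the output shown in this report. The environment variable names USE_MEM_MB and USE_MEM_SECONDS and their defaults are assumptions.

```python
#!/usr/bin/env python3
# Hypothetical stand-in for use-mem.py; NOT the script attached to this report.
import os
import sys
import time


def hold_memory(megabytes: int, seconds: float) -> int:
    """Allocate `megabytes` MiB, touch every page, hold it, return the size."""
    block = bytearray(megabytes * 1024 * 1024)
    # Write one byte per 4 KiB so the pages are actually resident.
    for offset in range(0, len(block), 4096):
        block[offset] = 1
    time.sleep(seconds)
    return len(block)


def report_line(seq: str, pid: int) -> str:
    """Format the line each job prints, matching the output in the report."""
    return f"seq {seq} / pid {pid}"


def main(argv: list) -> int:
    seq = argv[1]
    # Assumed knobs; the real script's sizes and timings are unknown.
    megabytes = int(os.environ.get("USE_MEM_MB", "512"))
    seconds = float(os.environ.get("USE_MEM_SECONDS", "30"))
    hold_memory(megabytes, seconds)
    print(report_line(seq, os.getpid()))
    return 0


if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv))
```

Printing only after the memory has been held means a job killed mid-run produces no output, which makes killed jobs easy to spot.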
Parallel command:
parallel -n1 -j 80% --memfree 2G --joblog joblog.log ./use-mem.py {} ::: {1..15}
joblog
Seq  Host  Starttime       JobRuntime  Send  Receive  Exitval  Signal  Command
10   :     1487989705.822   99.963     0     0        -1       15      ./use-mem.py 10
5    :     1487989705.781  214.866     0     18       0        9       ./use-mem.py 5
8    :     1487989705.806  272.305     0     18       0        9       ./use-mem.py 8
9    :     1487989705.814  283.846     0     18       -1       15      ./use-mem.py 9
1    :     1487989705.744  293.559     0     18       0        0       ./use-mem.py 1
2    :     1487989705.755  293.921     0     18       0        0       ./use-mem.py 2
3    :     1487989705.764  293.913     0     18       0        0       ./use-mem.py 3
4    :     1487989705.773  294.098     0     18       0        0       ./use-mem.py 4
6    :     1487989705.789  294.106     0     18       0        0       ./use-mem.py 6
7    :     1487989705.798  294.892     0     18       0        0       ./use-mem.py 7
15   :     1487989999.631   17.871     0     0        -1       15      ./use-mem.py 15
14   :     1487989999.304   43.278     0     19       0        0       ./use-mem.py 14
13   :     1487989999.182   43.855     0     19       0        0       ./use-mem.py 13
12   :     1487989999.171   47.854     0     19       0        0       ./use-mem.py 12
11   :     1487989998.949   53.799     0     19       0        0       ./use-mem.py 11
output
seq 5 / pid 15772
seq 8 / pid 15787
seq 9 / pid 15792
seq 1 / pid 15752
seq 2 / pid 15757
seq 3 / pid 15762
seq 4 / pid 15767
seq 6 / pid 15777
seq 7 / pid 15782
seq 14 / pid 16645
seq 13 / pid 16640
seq 12 / pid 16635
seq 11 / pid 16630
Jobs were killed and not requeued: seq 10 and seq 15 received no output and never appear in the output above.
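
To pick such jobs out of a longer joblog, a small parser along these lines may help. It is a sketch that assumes the nine whitespace-separated columns shown in the joblog header (Seq through Command); the function name signaled_jobs is made up, and the sample rows are copied from the joblog in this report.

```python
def signaled_jobs(joblog_text: str) -> list:
    """Return (seq, exitval, signal) for rows with non-zero Exitval or Signal.

    Hypothetical helper; assumes the column order
    Seq Host Starttime JobRuntime Send Receive Exitval Signal Command.
    """
    hits = []
    lines = joblog_text.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        # Split into at most 9 fields so the command keeps its spaces.
        fields = line.split(None, 8)
        if len(fields) < 9:
            continue  # not a complete joblog row
        seq = int(fields[0])
        exitval = int(fields[6])
        signal = int(fields[7])
        if exitval != 0 or signal != 0:
            hits.append((seq, exitval, signal))
    return hits


SAMPLE = """\
Seq Host Starttime JobRuntime Send Receive Exitval Signal Command
10 : 1487989705.822 99.963 0 0 -1 15 ./use-mem.py 10
5 : 1487989705.781 214.866 0 18 0 9 ./use-mem.py 5
1 : 1487989705.744 293.559 0 18 0 0 ./use-mem.py 1
"""

if __name__ == "__main__":
    print(signaled_jobs(SAMPLE))  # [(10, -1, 15), (5, 0, 9)]
```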
When I added --retries, the joblog no longer shows any failed jobs, but I'm
pretty sure some jobs were killed and requeued.
Parallel command:
parallel -n1 -j 80% --memfree 2G --joblog joblog.log --retries 3 ./use-mem.py {} ::: {1..15}
joblog
Seq  Host  Starttime       JobRuntime  Send  Receive  Exitval  Signal  Command
1    :     1487991228.329  134.463     0     17       0        0       ./use-mem.py 1
2    :     1487991228.339  136.082     0     17       0        0       ./use-mem.py 2
5    :     1487991228.368  136.065     0     17       0        0       ./use-mem.py 5
3    :     1487991228.349  138.911     0     17       0        0       ./use-mem.py 3
4    :     1487991228.358  145.251     0     17       0        0       ./use-mem.py 4
9    :     1487991362.468  256.723     0     17       0        0       ./use-mem.py 9
13   :     1487991364.479  255.554     0     18       0        0       ./use-mem.py 13
12   :     1487991364.433  255.689     0     18       0        0       ./use-mem.py 12
10   :     1487991362.489  259.118     0     18       0        0       ./use-mem.py 10
8    :     1487991362.314  263.891     0     17       0        0       ./use-mem.py 8
7    :     1487991362.383  299.328     0     17       0        0       ./use-mem.py 7
11   :     1487991618.786   43.341     0     18       0        0       ./use-mem.py 11
14   :     1487991620.233   42.884     0     18       0        0       ./use-mem.py 14
6    :     1487991618.888   52.900     0     17       0        0       ./use-mem.py 6
15   :     1487991658.350   25.601     0     18       0        0       ./use-mem.py 15
output
seq 1 / pid 3125
seq 2 / pid 3130
seq 5 / pid 3145
seq 3 / pid 3135
seq 4 / pid 3140
seq 9 / pid 4278
seq 13 / pid 4310
seq 12 / pid 4305
seq 10 / pid 4283
seq 8 / pid 4268
seq 7 / pid 4273
seq 11 / pid 4633
seq 14 / pid 4648
seq 6 / pid 4638
seq 15 / pid 4931
I would either clarify the documentation or implement the requeue behavior
without requiring --retries n.
Also, the --retries documentation is a bit confusing: it focuses on the
remote use case, even though the option clearly helps on a single machine
under memory starvation, or in any context where a job may be killed.
As a side note, it would be nice if the joblog recorded information about
retried jobs.
By the way I love parallel! Been telling all my friends about it. Great
flexibility and interface!
--
Olivier Bilodeau
Attachment: use-mem.py (Text Data)
Attachment: signature.asc (OpenPGP digital signature)