
Re: file permissions on joblog


From: Christian Meesters
Subject: Re: file permissions on joblog
Date: Thu, 28 Jul 2022 17:28:30 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.11.0


On 7/28/22 14:56, Rob Sargent wrote:

On Jul 28, 2022, at 1:10 AM, Christian Meesters <meesters@uni-mainz.de> wrote:
Hi,

Not quite. Under SLURM, the job step starter (in SLURM lingo) is "srun". You do not ssh from job host to job host; instead, you use "parallel" as a semaphore to avoid oversubscribing the job steps started with "srun". I summarized this approach here:

https://mogonwiki.zdv.uni-mainz.de/dokuwiki/start:working_on_mogon:workflow_organization:node_local_scheduling#running_on_several_hosts (uh-oh - I need to clean up that site, many outdated sections there, but this one should still be ok)

One advantage: you can safely use the resources of two (or more) hosts, the master host and all secondaries. How many resources you need depends on your application and the work it does. If I/O is an issue for your application, be sure to account for it (e.g., stage in files to avoid random I/O from too many concurrent applications).
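
A minimal sketch of the semaphore idea described above, written as a batch script. It is not taken from the linked wiki page. The resource numbers, the application name `./my_app`, the `inputs/` directory, and the joblog file name are hypothetical placeholders, and the exact srun options for non-overlapping job steps may differ between SLURM versions.

```shell
#!/bin/bash
#SBATCH --nodes=2              # hypothetical: two hosts
#SBATCH --ntasks-per-node=4    # hypothetical: four concurrent job steps per host
#SBATCH --cpus-per-task=4      # hypothetical: threads per application instance

# Number of job steps that may run at once across the whole allocation:
# nodes * (cores per node / threads per application).
concurrent_steps() {
    local nodes=$1 cores_per_node=$2 threads_per_app=$3
    echo $(( nodes * (cores_per_node / threads_per_app) ))
}

# Only launch anything when running inside a SLURM allocation.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    slots=$(concurrent_steps "$SLURM_JOB_NUM_NODES" "$SLURM_CPUS_ON_NODE" "$SLURM_CPUS_PER_TASK")
    # parallel acts as the semaphore: at most $slots srun job steps are active.
    # ./my_app and inputs/* are placeholders for the real application and data.
    parallel --jobs "$slots" --joblog "$SLURM_SUBMIT_DIR/joblog.$SLURM_JOB_ID" \
        srun --exclusive -N1 -n1 -c "$SLURM_CPUS_PER_TASK" ./my_app {} ::: inputs/*
fi
```

With the placeholder values above (2 nodes, 16 allocated cores per node, 4 threads per application), parallel would keep at most 8 job steps running at a time, and SLURM decides on which host each step lands.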

Cheers

Christian
Christian,
My use of GNU parallel does not include ssh. Rather, I simply fill the SLURM node with --jobs=ncores.

That would require an interactive job with ncores_per_node/threads_per_application ssh connections, and you would have to trigger the script manually. My solution is to use parallel within a SLURM job, which avoids the synchronization step by a human while still allowing a multi-node job with SMP applications. It's your choice, of course.




Ole,
Is your suggestion that I should ssh back to my account and run the job? Pretty sure 2FA will get in the way.

Thanks to you both,
rjs


