Difference between revisions of "Rocky Slurm"

From NIMBioS

Revision as of 18:19, 25 August 2022

Slurm

Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for Linux clusters. As a cluster workload manager, Slurm has three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending work.

You can learn more at the Slurm website.


Commands

Slurm has many commands. Here are a few common ones you'll use when submitting jobs:

srun      A blocking command that submits a job to the cluster in real time and waits for it to finish.
sbatch    A non-blocking command that submits a job to the queue to be run as resources allow.
squeue    Shows information about queued jobs.
scancel   Cancels queued or running jobs.
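
As a rough illustration of how these commands fit together, the sketch below writes a small batch script and submits it only if sbatch is available. The file name hello_job.sh, the job name, the time limit, and the output file pattern are assumptions chosen for this example, not values required by this cluster.

```shell
#!/bin/bash
# Write a minimal batch script. The job name, time limit, and file
# names are illustrative assumptions; adjust them for your own work.
cat > hello_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
#SBATCH --output=hello_%j.out

# Inside a batch script, srun launches the task on the allocated node.
srun hostname
EOF

# Submit only if Slurm is installed on this machine.
if command -v sbatch >/dev/null 2>&1; then
    sbatch hello_job.sh      # non-blocking: queues the job and returns
    squeue -u "$USER"        # list your own queued and running jobs
    # scancel <jobid>        # cancel a job, using the ID sbatch reported
else
    echo "sbatch not found; wrote hello_job.sh but did not submit it"
fi
```

For quick interactive checks, running something like srun --ntasks=1 hostname directly from the login shell should block until the command has finished on a compute node, which is the main practical difference from sbatch.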