Rocky Slurm Basic Multi

About

These examples show how to use SBATCH parameters to take advantage of multiple nodes and multiple cores on those nodes. For demonstration purposes, we'll use a simple bash script. Its only purpose is to print the hostname of the node it is running on and the ID of the CPU it is running on. This makes it easy to see that the script is being run on multiple nodes and/or multiple cores on each node.

myscript.sh

#!/bin/bash

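# Short hostname of the node this task is running on
H=$(hostname -s)
# ID of the CPU this shell is currently running on
C=$(ps -o cpuid -h -p "${BASHPID}" | xargs)

echo "${H} (${C})"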

The script must be executable (chmod +x myscript.sh) so that srun can launch it.


Multiple Nodes

slurm-test-nodes.sh

#!/bin/bash

#SBATCH --job-name=test_nodes_job         
#SBATCH --nodes=3                    
#SBATCH --output=test_nodes_%j.log 


srun myscript.sh

Here we tell sbatch to allocate 3 nodes for this job. Because no task count is given, Slurm defaults to one task per node, so srun runs myscript.sh once on each node.


sbatch slurm-test-nodes.sh


moose1 (20)
moose2 (20)
rocky1 (40)

The output, written to test_nodes_<jobid>.log (%j expands to the job ID), shows that the job ran on 3 nodes (moose1, moose2, and rocky1).

Multiple Tasks and Nodes

#!/bin/bash

#SBATCH --job-name=test_tasksnodes_job
#SBATCH --nodes=3
#SBATCH --ntasks=6
#SBATCH --output=log/test_%j.log 

srun myscript.sh

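Here we ask for 3 nodes and 6 tasks in total. Note that the log/ directory must already exist when the job is submitted; Slurm does not create it.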

moose1 (0)
moose1 (20)
moose2 (0)
moose2 (20)
rocky1 (0)
rocky1 (40)

The log file shows that Slurm distributed the 6 tasks across the 3 nodes, two tasks per node.
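
To check from inside the job what Slurm actually allocated, the batch script can print a few of the environment variables Slurm sets before it calls srun. The following is a minimal sketch rather than part of the original examples: the function name print_allocation is made up for illustration, while SLURM_JOB_NUM_NODES, SLURM_NTASKS, and SLURM_JOB_NODELIST are standard Slurm variables (SLURM_NTASKS is only set when a task count was requested).

```shell
#!/bin/bash
# Illustrative helper (not part of the original examples).
# Prints the node count, task count, and node list of the current job.
# Each value falls back to "unset" when the variable is missing,
# for example when the function is run outside of a Slurm job.
print_allocation() {
    echo "nodes=${SLURM_JOB_NUM_NODES:-unset} tasks=${SLURM_NTASKS:-unset} nodelist=${SLURM_JOB_NODELIST:-unset}"
}

print_allocation
```

Placed before the srun line of a batch script, this adds a single summary line at the top of the log.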

Many Tasks

#!/bin/bash

#SBATCH --job-name=test_job          # Job name
#SBATCH --ntasks=100
#SBATCH --output=log/test_%j.log     # Standard output and error log

srun myscript.sh

Here we ask for 100 tasks. We have not asked for multiple nodes, but 100 tasks are more than can fit on a single node.


moose2 (6)
moose2 (7)
moose2 (27)
moose1 (0)
moose1 (1)
moose1 (21)
moose2 (26)
moose1 (20)
moose2 (8)
moose2 (28)
moose1 (19)
moose2 (30)
moose2 (14)
moose2 (9)
moose2 (29)
moose2 (10)
moose1 (28)
moose1 (39)
moose1 (30)
moose2 (20)
moose2 (34)
moose2 (31)
moose1 (26)
moose1 (22)
moose2 (25)
moose2 (23)
moose2 (0)
moose1 (27)
moose2 (3)
moose1 (29)
moose1 (9)
moose2 (4)
moose1 (34)
moose1 (17)
moose1 (6)
moose2 (24)
moose1 (7)
moose2 (5)
moose2 (12)
moose1 (37)
moose2 (1)
moose2 (22)
moose2 (2)
moose2 (13)
moose2 (33)
moose2 (11)
moose1 (3)
moose1 (23)
moose1 (11)
moose1 (10)
moose1 (14)
moose2 (35)
moose2 (18)
moose1 (33)
moose1 (2)
moose1 (12)
moose2 (17)
moose2 (37)
moose1 (8)
moose2 (32)
moose2 (21)
moose1 (31)
moose2 (15)
moose1 (13)
moose2 (16)
moose2 (19)
moose2 (39)
moose1 (5)
moose2 (38)
moose2 (36)
moose1 (32)
moose1 (16)
moose1 (25)
moose1 (38)
moose1 (4)
moose1 (36)
moose1 (15)
moose1 (24)
moose1 (18)
moose1 (35)
rocky1 (16)
rocky1 (56)
rocky1 (0)
rocky1 (40)
rocky1 (4)
rocky1 (44)
rocky1 (14)
rocky1 (54)
rocky1 (18)
rocky1 (58)
rocky1 (42)
rocky1 (52)
rocky1 (2)
rocky1 (12)
rocky1 (10)
rocky1 (50)
rocky1 (8)
rocky1 (48)
rocky1 (6)
rocky1 (46)

Even though we did not ask for more than one node, Slurm has spread the tasks across multiple nodes: 40 on moose1, 40 on moose2, and 20 on rocky1.
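
With this many lines of output, counting tasks per node by eye is tedious. The sketch below is one way to tally them, assuming the log contains only lines in the "host (cpu)" format printed by myscript.sh. The function name tasks_per_node and the job ID in the usage comment are placeholders, not part of the original examples.

```shell
#!/bin/bash
# Illustrative helper (not part of the original examples).
# Counts how many tasks reported from each node, given a job log whose
# lines look like "moose1 (20)". Prints one "host count" pair per line,
# sorted by hostname.
tasks_per_node() {
    # $1 is the path to the log file; field 1 of each line is the hostname.
    # Blank lines are skipped (NF is 0 for them).
    awk 'NF { count[$1]++ } END { for (h in count) print h, count[h] }' "$1" | sort
}

# Usage (12345 is a placeholder job ID):
#   tasks_per_node log/test_12345.log
```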