Latest revision as of 15:04, 12 September 2023
Anatomy of a Rocky Job
Setting up a job to run on Rocky starts by creating or uploading your project's files to the project directory within your home directory on Rocky. These files will include the code you've written, any data files needed, and a batch file.
Your Code
Your code is what is submitted and executed on Rocky's compute nodes.
It can be written in any of the languages supported by Rocky environment modules (Lmod).
Your Data
If your job will be processing data, you'll need to upload that data to your project's directory.
Your home directory is shared among all compute nodes. No matter which node your job is assigned to, it will have access to your data.
Batch Script
The batch script is a shell script that brings everything together by defining job parameters, loading any environment modules needed, and finally executing your code.
Job parameters are defined one per line and start with #SBATCH.
All parameters have default values and are optional, but most batch scripts will set at least a few.
You can view all of the sbatch options at:
https://slurm.schedmd.com/sbatch.html
Below is an example batch file using some of the most common options:
my_job.run
#!/bin/bash
#SBATCH --job-name=MY_JOB          ### job name
#SBATCH --output=my_job_%j.out     ### file to store job output
#SBATCH --time=1-00:00:00          ### maximum time limit for job (Days-HH:MM:SS)
#SBATCH --mem-per-cpu=2G           ### amount of memory to allocate per CPU
#SBATCH --cpus-per-task=1          ### number of CPUs to allocate per task
#SBATCH --mail-user=me@test.com    ### email address to notify
#SBATCH --mail-type=END            ### send an email when the job ends
module load R/4.2.1-foss-2022a
Rscript my_code.R
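Because every job parameter sits on its own #SBATCH line, a batch file can be checked quickly before it is submitted. The following is a minimal sketch of that idea; list_sbatch_options is a hypothetical helper name, not part of Slurm, and it assumes trailing comments are written with ### as in the example above.

```shell
#!/bin/sh
# list_sbatch_options is a hypothetical helper, not a Slurm command.
# It prints each #SBATCH option found in a batch script, one per line,
# dropping the "#SBATCH" prefix and any trailing "### comment".
list_sbatch_options() {
    grep -E '^#SBATCH' "$1" |
        sed -E 's/^#SBATCH[[:space:]]+//; s/[[:space:]]*###.*$//'
}
```

Running list_sbatch_options my_job.run on the example file would list options such as --job-name=MY_JOB and --time=1-00:00:00, which makes a mistyped or missing parameter easier to spot.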
Running Job
Submitting Job
Jobs are submitted using the sbatch command, which takes your batch script as an argument. This adds your job to the queue.
sbatch my_job.run
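When a job is submitted successfully, sbatch prints a confirmation line of the form "Submitted batch job 2947". If you want to reuse the job ID in later commands, one option is to pull it out of that line. The sketch below assumes that default message format; job_id_from_sbatch_output is a hypothetical helper name.

```shell
#!/bin/sh
# job_id_from_sbatch_output is a hypothetical helper, not a Slurm command.
# It extracts the job ID (the fourth word) from sbatch's confirmation
# line, e.g. "Submitted batch job 2947".
job_id_from_sbatch_output() {
    printf '%s\n' "$1" | awk '/^Submitted batch job/ { print $4 }'
}

# Usage on Rocky (requires Slurm, so it is left commented out here):
# job_id=$(job_id_from_sbatch_output "$(sbatch my_job.run)")
# echo "Submitted job $job_id"
```

As an alternative, sbatch accepts a --parsable option that prints the job ID without the surrounding text, which may be simpler if you do not need the full message.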
Watching Job
While your job is in the queue or being executed, you can check its status using the squeue command. If the job is currently running, the output shows which node(s) it is assigned to.
[test_user@rocky7 ~]$ squeue
JOBID  PARTITION    NAME    USER      ST  TIME  NODES  NODELIST(REASON)
2947   compute_all  my_job  test_use  R   0:05  1      moose1
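The ST column holds the job's state code; R means running and PD means pending. As a rough sketch, the state of a single job can be picked out of the default squeue output shown above. job_state is a hypothetical helper name, and it assumes the job name contains no spaces, since the columns are split on whitespace.

```shell
#!/bin/sh
# job_state is a hypothetical helper, not a Slurm command.
# It reads squeue's default output on stdin and prints the ST (state)
# column for the job whose JOBID matches the first argument.
# Nothing is printed if the job is not listed.
job_state() {
    awk -v id="$1" '$1 == id { print $5 }'
}

# Usage on Rocky (requires Slurm, so it is left commented out here):
# squeue -u "$USER" | job_state 2947
```

The -u option limits the squeue listing to one user's jobs, which keeps the output short on a busy cluster.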
Cancelling Job
To cancel a job, use the scancel command and pass the JOBID (as returned by squeue).
scancel 2947
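Since scancel acts immediately, it can be worth guarding against an empty or mistyped job ID in your own scripts. The wrapper below is only a sketch; cancel_job is a hypothetical name, and it assumes plain numeric job IDs like the one in the squeue example, so it would reject other ID forms.

```shell
#!/bin/sh
# cancel_job is a hypothetical wrapper around scancel.
# It refuses to run unless the argument is a non-empty string of digits,
# then passes the job ID to scancel unchanged.
cancel_job() {
    case "$1" in
        '' | *[!0-9]*)
            echo "usage: cancel_job JOBID (digits only)" >&2
            return 1
            ;;
    esac
    scancel "$1"
}

# Usage on Rocky (requires Slurm, so it is left commented out here):
# cancel_job 2947
```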