A word of warning

Submitting large numbers of jobs to the cluster can have disastrous consequences if not done correctly, as one can overload the scheduler, bringing the cluster to a grinding halt.

What are array jobs?

Array jobs allow you to create and submit a single job script, but have it run multiple times with different input datasets, each run becoming a separate task under one job.  This is useful for 'high throughput' tasks, for example where you want to repeat a simulation with different driving data.

Taking a simple R submission script as an example:

#!/bin/bash
#SBATCH --mail-type=ALL     #Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=<username>@uea.ac.uk    # Where to send mail
#SBATCH -p compute   #Which queue to use
#SBATCH --job-name=R-test_job     #Job name
#SBATCH -o R-test-%j.out    #Standard output log
#SBATCH -e R-test-%j.err     #Standard error log
#SBATCH -t 0-20:00 # Running time of 20 hours
#set up environment
. /etc/profile
#run the application
module add R/3.6.1
# Pass the dataset name to the R script via --args; TestRFile.R can read it
# with commandArgs(trailingOnly = TRUE)
R CMD BATCH "--args dataset1.csv" TestRFile.R

Example

If you wanted to run the simulation TestRFile.R with inputs dataset2.csv through to dataset10.csv, you could create and submit a separate job script for each dataset.  However, by setting up an array job, you could create and submit a single script.

The corresponding array script for the above example would look something like:

#!/bin/bash
#SBATCH --mail-type=ALL     #Mail events (NONE, BEGIN, END, FAIL, ALL, ARRAY_TASKS)
#SBATCH --mail-user=<username>@uea.ac.uk    # Where to send mail
#SBATCH -p compute   #Which queue to use
#SBATCH --job-name=R-test_job     #Job name
#SBATCH -o R-test-%A-%a.out    #Standard output log
#SBATCH -e R-test-%A-%a.err     #Standard error log
#SBATCH --array=1-10  #Array range
##SBATCH --array=1-50%10   #Alternative (disabled): 50 tasks with at most 10 running at the same time
#SBATCH -t 0-20:00 # Running time of 20 hours
#set up environment
. /etc/profile
#run the application
module add R/3.6.1
echo "SLURM_ARRAY_TASK_ID"
R CMD BATCH TestRFile.R dataset$SLURM_ARRAY_TASK_ID.csv

  • The array is created by the --array=1-10 directive, which represents our 10 variations
  • The output and error file names gain two placeholders: %A, the parent job ID, and %a, the index of the task
  • The R command is updated to use the variable $SLURM_ARRAY_TASK_ID, which holds the index of the task

When the job is submitted, Slurm will create 10 tasks under the single job ID.  For each task, %A expands to the job ID, while %a and $SLURM_ARRAY_TASK_ID hold the index of the task.
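For example, if Slurm assigned the array the job ID 200423 (a hypothetical ID, matching the monitoring example below), the tasks would write log files named:

R-test-200423-1.out   R-test-200423-1.err
R-test-200423-2.out   R-test-200423-2.err
...
R-test-200423-10.out  R-test-200423-10.err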

Submitting a job

The job is submitted in the same way as a normal job:

sbatch R.array.sub
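
All tasks are created under a single job ID, which sbatch reports on submission:

Submitted batch job 200423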

SLURM's job array handling is very versatile.

Instead of providing a task range, a comma-separated list of task numbers can be supplied, for example to rerun a few failed tasks from a previously completed job array:

sbatch --array=4,8,15,16,23,42  R.array.sub

Command line options override options in the script, so those can be left unchanged.
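
The same applies to the concurrency throttle, which can be set at submission time without editing the script.  The monitoring example below shows an array submitted with a limit of 5 tasks running at once:

sbatch --array=1-50%5 R.array.sub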

 

Monitoring a job

Use squeue to list your active jobs; every task in the array appears under the same job ID, distinguished by the _<index> suffix in the JOBID column.  In the example below the array was submitted with --array=1-50%5, so only 5 tasks run at a time and the remainder wait with the reason JobArrayTaskLimit:

  • [s154@login01 ~/test]$ squeue

             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
   200423_[6-50%5]   compute R-test_j     s154 PD       0:00      1 (JobArrayTaskLimit)
          200423_1   compute R-test_j     s154  R       0:03      1 c0001
          200423_2   compute R-test_j     s154  R       0:03      1 c0001
          200423_3   compute R-test_j     s154  R       0:03      1 c0001
          200423_4   compute R-test_j     s154  R       0:03      1 c0001
          200423_5   compute R-test_j     s154  R       0:03      1 c0001
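
On a busy cluster squeue lists everyone's jobs; to show only your own, filter by username:

  • squeue -u <username>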

For more detail use

  • scontrol show job <JobID>
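
Once a job has finished, scontrol soon stops reporting it; if the cluster keeps accounting records, each task can still be inspected with

  • sacct -j <JobID>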
     

Cancel a job

  • scancel <JobID>  will kill all tasks in the array
  • scancel <JobID>_<task number> will kill only the specified task
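
For example, using the job ID from the monitoring example above (Slurm also accepts a range of task numbers in brackets):

scancel 200423            # cancel every task in the array
scancel 200423_4          # cancel only task 4
scancel 200423_[10-20]    # cancel tasks 10 to 20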
 

Running many short tasks

While SLURM array jobs make it easy to run many similar tasks, if each task is short (seconds or even a few minutes), array jobs quickly bog down the scheduler and more time is spent managing jobs than actually doing any work for you. This also negatively impacts other users.

If you have hundreds or thousands of tasks, it is unlikely that a simple array job is the best solution. That does not mean that array jobs are not helpful in these cases, but that a little more thought needs to go into them for efficient use of the resources.

As an example, let's imagine I have 5,000 runs of a program to do, with each run taking about 30 seconds to complete. Rather than running an array job with 5,000 tasks, it would be much more efficient to run 5 tasks where each completes 1,000 runs. Here's a sample script to accomplish this by combining array jobs with bash loops.

#!/bin/bash
#SBATCH --job-name=mega_array   # Job name
#SBATCH --mail-type=ALL         # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=gatorlink@ufl.edu # Where to send mail    
#SBATCH --nodes=1                   # Use one node
#SBATCH --ntasks=1                  # Run a single task
#SBATCH --mem-per-cpu=1gb           # Memory per processor
#SBATCH --time=10:00:00             # Time limit hrs:min:sec (1,000 runs at ~30 s each needs ~8.5 hours)
#SBATCH --output=array_%A-%a.out    # Standard output and error log
#SBATCH --array=1-5                 # Array range
 
# This is an example script that combines array tasks with
# bash loops to process many short runs. Array jobs are convenient
# for running lots of tasks, but if each task is short, they
# quickly become inefficient, taking more time to schedule than
# they spend doing any work and bogging down the scheduler for
# all users.
 
pwd; hostname; date
 
#Set the number of runs that each SLURM task should do
PER_TASK=1000
 
# Calculate the starting and ending values for this task based
# on the SLURM task and the number of runs per task.
START_NUM=$(( ($SLURM_ARRAY_TASK_ID - 1) * $PER_TASK + 1 ))
END_NUM=$(( $SLURM_ARRAY_TASK_ID * $PER_TASK ))
 
# Print the task and run range
echo This is task $SLURM_ARRAY_TASK_ID, which will do runs $START_NUM to $END_NUM
 
# Run the loop of runs for this task.
for (( run=$START_NUM; run<=END_NUM; run++ )); do
  echo This is SLURM task $SLURM_ARRAY_TASK_ID, run number $run
  #Do your stuff here
done
date
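
If the total number of runs is not an exact multiple of PER_TASK, the last task's END_NUM will overshoot.  A minimal guard, assuming a TOTAL_RUNS variable holding the overall count, is to clamp the value after the calculation:

TOTAL_RUNS=5000   # hypothetical overall number of runs
if (( END_NUM > TOTAL_RUNS )); then
  END_NUM=$TOTAL_RUNS
fi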