Array tasks allow you to create and submit a single job script but have it run multiple times. This is useful for 'high throughput' tasks, for example where you want to repeat a simulation with different driving data.

Taking a simple R submission script as an example:

#!/bin/sh
#BSUB -q short
#BSUB -J R_job
#BSUB -oo R-%J.out
#BSUB -eo R-%J.err
. /etc/profile
module add R/2.15.1
Rscript TestRFile.R dataset1.csv

If you wanted to run the simulation TestRFile.R with inputs dataset2.csv through to dataset10.csv, you could create and submit a separate job script for each dataset. However, by setting up an array job, you can create and submit a single script instead.

The corresponding array script for the above example would look something like:

#!/bin/sh
#BSUB -q short
#BSUB -J R_job[1-10]
#BSUB -oo R-%J-%I.out
#BSUB -eo R-%J-%I.err
. /etc/profile
module add R/2.15.1
Rscript TestRFile.R dataset${LSB_JOBINDEX}.csv

Here the important differences are:

  • The array is created in the job name directive by including [1-10] to represent our 10 variations
  • The error and output file names have an additional %I, a variable that represents the index of the task
  • The R command is updated to include $LSB_JOBINDEX, an environment variable that holds the index of the task

When the job is submitted, LSF will create 10 tasks under the single job ID.  The %I and $LSB_JOBINDEX variables will match the index of the task.
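
The mapping from task index to input file can be previewed outside LSF with a small dry run. This is only a sketch: preview_task is a hypothetical helper, LSB_JOBINDEX is set by hand here (inside a real job LSF sets it for you), and the input files are assumed to be named dataset1.csv to dataset10.csv as described above.

```shell
#!/bin/sh
# Dry run: print the command a given array task would execute.
# preview_task is a hypothetical helper, not an LSF command.
preview_task() {
    # Inside a real array task LSF sets LSB_JOBINDEX; here we set it by hand.
    LSB_JOBINDEX=$1
    echo "Rscript TestRFile.R dataset${LSB_JOBINDEX}.csv"
}

# Preview the first three tasks of the array.
for i in 1 2 3; do
    preview_task "$i"
done
```

Nothing is submitted or run by this script; it only echoes the command line, which makes it a cheap way to catch file name mistakes before calling bsub.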

The job is submitted in the same way as a normal job:

[cc@login00 ~]$ bsub < Rarray.bsub
Job <2494256> is submitted to queue <short>.

If you use bjobs to list your active jobs, you will see 10 tasks with the same job ID. The tasks can be distinguished by the [index] in the JOB_NAME column.

[cc@login00 ~]$ bjobs
JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
2494256 cc RUN short login00 rack01 R_job[2] Jul 9 12:27
2494256 cc RUN short login00 rack01 R_job[3] Jul 9 12:27
2494256 cc RUN short login00 rack01 R_job[1] Jul 9 12:27
2494256 cc RUN short login00 rack01 R_job[4] Jul 9 12:27
2494256 cc RUN short login00 rack01 R_job[5] Jul 9 12:27
2494256 cc RUN short login00 rack01 R_job[6] Jul 9 12:27
2494256 cc RUN short login00 rack01 R_job[7] Jul 9 12:27
2494256 cc RUN short login00 rack01 R_job[8] Jul 9 12:27
2494256 cc RUN short login00 rack01 R_job[9] Jul 9 12:27
2494256 cc RUN short login00 rack01 R_job[10] Jul 9 12:27

If you use bkill JOBID to terminate the job, all tasks within the array will be terminated.  If you wish to only terminate an individual task, you need to use bkill JOBID[INDEX]:

[cc@login00 ~]$ bkill 2494256[6]
Job <2494256[6]> is being terminated

Similarly, if you wish to examine a particular task, you need to use the same JOBID[INDEX] syntax, i.e. bjobs -l JOBID[INDEX].

The array index list can be specified in a number of ways:

  • #BSUB -J R_job[1-10] will run tasks 1, 2, 3 and so on up to 10
  • #BSUB -J R_job[1-10:2] will run tasks between 1 and 10 in steps of 2, so 1, 3, 5, 7, 9
  • #BSUB -J R_job[1,2,5,10] will run the tasks in the list: 1, 2, 5 and 10
  • #BSUB -J R_job[1,2,5,7-10] will run tasks 1, 2, 5, 7, 8, 9, 10
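
To check which tasks an index list selects before submitting, the rules above can be sketched in plain shell. expand_indices is a hypothetical helper written for illustration, not an LSF command, and it only covers the start-end, start-end:step and comma-separated forms shown in the list above.

```shell
#!/bin/sh
# expand_indices SPEC: print the task indices selected by an index list
# such as "1-10", "1-10:2" or "1,2,5,7-10".
# Hypothetical helper for illustration; LSF does this expansion itself.
expand_indices() (
    # The function body runs in a subshell so the IFS change stays local.
    out=""
    IFS=,
    for part in $1; do
        step=1
        # An optional ":step" suffix sets the increment.
        case $part in
            *:*) step=${part##*:}; part=${part%%:*} ;;
        esac
        # "first-last" is a range; a bare number is a range of one.
        case $part in
            *-*) first=${part%%-*}; last=${part##*-} ;;
            *)   first=$part; last=$part ;;
        esac
        i=$first
        while [ "$i" -le "$last" ]; do
            out="$out $i"
            i=$((i + step))
        done
    done
    echo "${out# }"
)

expand_indices "1-10:2"       # prints: 1 3 5 7 9
expand_indices "1,2,5,7-10"   # prints: 1 2 5 7 8 9 10
```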

With a bit of work it is possible to create more complex variations, for example running a similar job this time with a single data set but with different start and end values for each run:

#!/bin/sh
#BSUB -q short
#BSUB -J R_job[1-2]
#BSUB -oo R-%J-%I.out
#BSUB -eo R-%J-%I.err
. /etc/profile
module add R/2.15.1

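# Each task handles a block of 500 rows: task 1 gets 0-499, task 2 gets 500-999.
# $(( )) arithmetic is used because let is not available in every /bin/sh.
INDEX_START=$(( (LSB_JOBINDEX - 1) * 500 ))
INDEX_END=$(( LSB_JOBINDEX * 500 - 1 ))

Rscript TestRFile.R dataset1 $INDEX_START $INDEX_END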

This would result in the following being run:

  • Rscript TestRFile.R dataset1 0 499
  • Rscript TestRFile.R dataset1 500 999
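
The start and end arithmetic can be checked by hand before submitting. The following is a minimal sketch, assuming a block size of 500 as in the example; chunk_range is a hypothetical helper name, and LSB_JOBINDEX falls back to 1 when the script is run outside LSF.

```shell
#!/bin/sh
# chunk_range INDEX SIZE: print the zero-based start and end values
# for one array task. Hypothetical helper for illustration.
chunk_range() {
    start=$(( ($1 - 1) * $2 ))
    end=$(( $1 * $2 - 1 ))
    echo "$start $end"
}

# LSF sets LSB_JOBINDEX inside a job; default to 1 so this can be tried by hand.
set -- $(chunk_range "${LSB_JOBINDEX:-1}" 500)
echo "Rscript TestRFile.R dataset1 $1 $2"
```

Keeping the block size as a parameter means the same script can be reused with a larger array, for example [1-4] with a block size of 250, without editing the arithmetic.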