Parallel Jobs

Again, similar to SGE, LSF requires a number of standard configuration settings, such as the queue (for parallel tasks this will normally be one of the Infiniband queues, defined as short-ib, medium-ib and long-ib) and the number of slots. Please note that the Grace cluster has a mix of 12-, 16- and 20-core nodes.

#BSUB -q short-ib
#BSUB -n 96
#BSUB -R 'select[ib]'
#BSUB -R 'span[ptile=12]'
#BSUB -R 'cu[maxcus=1]'
#BSUB -oo vig_hpl_ibgpfs-%J.log
#BSUB -eo vig_hpl_ibgpfs-%J.log
#BSUB -J "vig_hplIB"
. /etc/profile
module add mpi/platform/pgi
mpirun MyParallelBinary

Submits to the short Infiniband queue

#BSUB -q short-ib

Requests 96 parallel job slots

#BSUB -n 96

Requests nodes with the Infiniband resource

#BSUB -R 'select[ib]'

Requests 12 slots per node, filling the 12-core nodes (adjust ptile to fill the 16- or 20-core nodes)

#BSUB -R 'span[ptile=12]'

Attempts to use nodes in the same computational unit (i.e. nodes on the same IB switch)

#BSUB -R 'cu[maxcus=1]'
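Because Grace mixes 12-, 16- and 20-core nodes, ptile can be matched to the core count of the node size you want to fill. As a sketch (the slot totals here are illustrative, not site-mandated values):

#BSUB -n 64
#BSUB -R 'span[ptile=16]'

requests 64 slots packed 16 per node, i.e. four whole 16-core nodes; similarly ptile=20 with -n set to a multiple of 20 would fill the 20-core nodes.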

Infiniband Networks

There are two Infiniband networks. Unlike ESCluster, jobs can run across both networks, making it possible to run large jobs. For smaller jobs, however, performance is better when the job does not span multiple networks, which is why -R 'cu[maxcus=1]' is set.
It is recommended that you use the mpi/platform module for MPI capabilities. Unlike ESCluster, you do not need to load a separate module for the interconnect type (other than selecting the resource with -R 'select[ib]'): this single MPI module works over both Infiniband and standard Ethernet, selecting Infiniband where present.
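For comparison, a minimal sketch of a job script that runs over standard Ethernet using the same MPI module (the queue name short is an assumption for a non-Infiniband queue; check bqueues for the actual names on your system):

#BSUB -q short
#BSUB -n 24
#BSUB -oo myjob-%J.log
#BSUB -eo myjob-%J.log
#BSUB -J "myjob_eth"
. /etc/profile
module add mpi/platform/pgi
mpirun MyParallelBinary

With no -R 'select[ib]' the job is not restricted to Infiniband nodes, and the MPI module falls back to Ethernet.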

Submitting Jobs

bsub < JobScriptName.bsub
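Once submitted, the job can be monitored and controlled with the standard LSF commands (12345 below is a placeholder job ID; bsub prints the real one on submission):

bjobs                 List your pending and running jobs
bjobs -l 12345        Show the detailed status of one job
bpeek 12345           View the job's output while it is running
bkill 12345           Kill the job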