There are three ways in which Matlab can utilise multiple processors. Recent versions of Matlab support multithreading, but on HPC this is disabled by default because it conflicts with the job-to-core mapping used for resource management. Instead, two toolboxes enable you to run Matlab with additional workers, running either on the same node or on other compute nodes: the Parallel Computing Toolbox and the Distributed Computing Server.
Parallel and Distributed Matlab
The difference between the two is that the Parallel Computing Toolbox opens a number of workers on the same computer (e.g. on a dual-core desktop you can open two workers), whereas the Distributed Computing Server can open up to 32 workers on any machines in the cluster, through the LSF queuing system.
The sample Matlab code (its location is given below) is a function that estimates pi by simulating 1 billion dart throws, and is a good example of how parallel Matlab can result in faster processing.
The key line is the parfor loop. Because it is a parfor loop, its iterations can be run in parallel across workers, but when no workers are available it runs as a standard for loop.
This example can be copied from /gpfs/grace/samplescripts/MonteCarloPI.m
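The sample script itself is not reproduced here. As a rough illustration of the idea only, a dart-throwing estimate of pi built around a parfor loop might look like the sketch below. The function name estimatePiSketch, its argument and its default of one million throws are assumptions made for this illustration and are not taken from MonteCarloPI.m.

```matlab
function piEstimate = estimatePiSketch(numThrows)
% estimatePiSketch  Estimate pi from simulated dart throws (illustrative sketch).
%   Hypothetical function: NOT the site's MonteCarloPI.m sample script.
    if nargin < 1
        numThrows = 1e6;   % assumed default, far fewer than 1 billion so it runs quickly
    end

    hits = 0;
    % hits is a reduction variable: each worker keeps a partial count and
    % Matlab combines the partial counts when the loop finishes.
    parfor k = 1:numThrows
        x = rand;          % dart position in the unit square
        y = rand;
        hits = hits + (x^2 + y^2 <= 1);   % count darts inside the quarter circle
    end

    % The quarter circle covers pi/4 of the unit square.
    piEstimate = 4 * hits / numThrows;
end
```

Wrapping a call such as estimatePiSketch(1e8) in tic and toc, once with a pool open and once without, is one way to see the difference in run time.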
Parallel Computing Toolbox
The Parallel Computing Toolbox is part of the total academic license, which entitles a user to open a parallel pool of up to 12 workers on a single node. Because the Parallel Computing Toolbox opens its workers on the same node, it is important to ensure you request the appropriate number of slots. For example, if you are going to open 8 workers, you should request 9 slots (1 for the master Matlab session and 8 for the workers) as follows:
Xinteractive -n 9 -R 'span[ptile=9]'
Here -n 9 requests 9 slots, and -R 'span[ptile=9]' ensures that all 9 are allocated on the same host. If submitting to the batch queues, your job script should look similar to the following:
#BSUB -q short-eth
#BSUB -J Matlab_job
#BSUB -oo MatJob-%J.out
#BSUB -eo MatJob-%J.err
#BSUB -n 9
#BSUB -R 'span[ptile=9]'
module add matlab/2016b
matlab -nodisplay -nodesktop -nosplash -r my_matlab_m_file
A parallel session is started with the parpool command, in this case opening the default local configuration:
>> parpool (4)
Starting parallel pool (parpool) using the 'local' profile ... connected to 4 workers.
>> parpool (8)
Starting parallel pool (parpool) using the 'local' profile ... connected to 8 workers.
After the parallel part of the work has completed, the pool should be closed with the delete(gcp('nocreate')) command:
>> delete(gcp('nocreate'))
Parallel pool using the 'local' profile is shutting down.
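Putting the open, compute and close steps together, a batch script could size the pool from the slots that LSF allocated rather than hard-coding the number. The following is a minimal sketch under stated assumptions: it assumes LSF exports the slot count in the LSB_DJOB_NUMPROC environment variable, the fallback of 2 slots is arbitrary, and the sqrt loop is only a placeholder for real work.

```matlab
% Illustrative script body (could serve as my_matlab_m_file.m).
% Assumption: LSF sets LSB_DJOB_NUMPROC to the number of slots given to the job.
nSlots = str2double(getenv('LSB_DJOB_NUMPROC'));
if isnan(nSlots)
    nSlots = 2;                       % arbitrary fallback when run outside an LSF job
end
nWorkers = max(nSlots - 1, 1);        % keep one slot for the master Matlab session

pool = parpool('local', nWorkers);    % open the workers on this node

results = zeros(1, 100);
parfor k = 1:100
    results(k) = sqrt(k);             % placeholder for the real per-iteration work
end

delete(pool);                         % close the pool once the parallel work is done
fprintf('Processed %d items using %d workers\n', numel(results), nWorkers);
```

With the 9-slot request shown earlier, this would open 8 workers, matching the 1 master plus 8 workers split described above.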
Distributed Computing Server
There are 32 licenses for the Distributed Computing Server, which means up to 32 workers can be opened up across the cluster.
The Distributed Computing Server allows you to open workers on other nodes through the LSF queuing system by using parpool. Usage is essentially the same as with the Parallel Computing Toolbox, except that instead of the default local configuration a specific LSF configuration is used, which submits jobs to the queue. Because the workers are managed by LSF separately from the master Matlab session, you do not need to request additional slots, unlike with the Parallel Computing Toolbox.
Before trying to open the LSF pool for the first time, you need to import the LSF configuration.
- In the Matlab window, go to Parallel > Manage Cluster Profiles
- In the Cluster Profile Manager that appears (which should contain only the default local config), click Import
- In the Import Configuration window navigate to /gpfs/software/matlab/R2016b-DCS/
- Select LSF2016b.settings and click open.
- In the Cluster Profile Manager there should now be an LSF2016b config.
You can test the config by highlighting the LSF config and clicking "Validate". Validation takes a few minutes to complete, but is useful for troubleshooting: if any of the four test stages does not finish with a green "succeeded" tick, please email email@example.com. Once imported, the configuration will be available to you in future sessions.
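The import can probably also be done from the Matlab prompt instead of the Cluster Profile Manager. The sketch below uses the toolbox functions parallel.clusterProfiles and parallel.importProfile with the path and profile name from the steps above. It has not been checked against this cluster, so treat it as an assumption-laden alternative rather than the supported procedure.

```matlab
% Path and profile name are taken from the import steps above.
settingsFile = '/gpfs/software/matlab/R2016b-DCS/LSF2016b.settings';
profileName  = 'LSF2016b';

% parallel.clusterProfiles returns the names of the profiles already
% available to this user as a cell array.
existing = parallel.clusterProfiles;

if any(strcmp(existing, profileName))
    fprintf('Profile %s is already imported.\n', profileName);
else
    % parallel.importProfile reads the .settings file and returns the
    % name(s) of the profile(s) it imported.
    imported = parallel.importProfile(settingsFile);
    disp(imported);
end
```

Validation is still most easily run from the Cluster Profile Manager as described above.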
Matlab will report that a number of workers have been opened (it can take some time to open the workers and return to the Matlab prompt):
>> parpool ('LSF2016b',16)
Starting parallel pool (parpool) using the 'LSF2016b' profile ... connected to 16 workers.
Running bjobs -X outside Matlab will show you that these workers have been opened as a separate LSF job:
[s074@lgn02 ~]$ bjobs -X
JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
38095 anon RUN short-eth e0003 1*e0080 Job1 Jul 6 15:32
37838 anon RUN interactiv lgn02 e0003 /bin/bash Jul 6 12:02
By default, the LSF configuration submits parallel jobs to the short-eth queue.
To close the parallel pool enter the following command: delete(gcp('nocreate'))
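Since the workers hold licenses and an LSF job for as long as the pool is open, it can help to make sure the pool is closed even if the parallel code fails. The following is a minimal sketch, assuming the LSF2016b profile has already been imported. The pool size of 16 and the sum-of-squares loop are placeholders chosen for illustration.

```matlab
% Sketch only: assumes the 'LSF2016b' profile has already been imported.
pool = gcp('nocreate');                 % returns [] when no pool is open
if isempty(pool)
    pool = parpool('LSF2016b', 16);     % workers are submitted as an LSF job
end
fprintf('Pool has %d workers\n', pool.NumWorkers);

try
    total = 0;
    parfor k = 1:1000
        total = total + k^2;            % placeholder reduction standing in for real work
    end
    fprintf('Sum of squares: %d\n', total);
catch err
    delete(gcp('nocreate'));            % release the LSF workers even on error
    rethrow(err);
end

delete(gcp('nocreate'));                % normal shutdown of the pool
```

Calling gcp('nocreate') first avoids the error Matlab raises if parpool is called while a pool is already open.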