Matlab Parallel Old
Please note: With the introduction of Matlab 2013a, Mathworks have increased the number of workers available for the Distributed Computing Server. We now have licenses to cover 32 workers. Please load the matlab/2013a module and import the LSF DCS configuration file LSF2013a.settings from /gpfs/grace/matlab/2013a-DCS/
There are three ways in which Matlab can utilise multiple processors. Recent versions of Matlab support multithreading; however, by default this is disabled on Grace as it conflicts with the job-to-core mapping used for resource management. Instead, there are two toolboxes which enable you to run Matlab with additional workers running either on the same node or on other compute nodes: the Parallel Computing Toolbox and the Distributed Computing Server.
Parallel and Distributed Matlab
The difference between the two is that with the Parallel Computing Toolbox a number of workers can be opened on the same computer (e.g. if you have a dual-core desktop, you can open two workers), whereas with the Distributed Computing Server up to 32 workers can be opened across any machines in the cluster, through the LSF queuing system.
The following Matlab code is for a function that estimates pi by simulating 1 billion dart throws and is a good example of how parallel Matlab can result in faster processing.
The key line is the parfor loop. When a matlabpool is open, the iterations of a parfor loop are shared out across the workers and run in parallel; when no pool is open, it runs as a standard for loop.
This example can be copied from /gpfs/grace/samplescripts/MonteCarloPI.m
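For readers without access to that path, the idea can be sketched as follows. This is a minimal illustration written for this page, not the contents of MonteCarloPI.m; the function name monteCarloPiSketch and the optional numThrows argument are assumptions.

```matlab
function piEstimate = monteCarloPiSketch(numThrows)
% MONTECARLOPISKETCH Estimate pi by simulating random dart throws.
%   Hypothetical illustration only; not the contents of MonteCarloPI.m.
%   numThrows defaults to 1e9 (1 billion) to match the text above.
if nargin < 1
    numThrows = 1e9;
end

hits = 0;
% With a matlabpool open, parfor shares the iterations across workers;
% with no pool open it behaves like an ordinary for loop.
parfor i = 1:numThrows
    x = rand;
    y = rand;
    % The dart lands inside the quarter circle when x^2 + y^2 <= 1.
    % hits is a reduction variable, so parfor can accumulate it safely.
    hits = hits + (x^2 + y^2 <= 1);
end

% Area of quarter circle / area of unit square = pi/4.
piEstimate = 4 * hits / numThrows;
end
```

Calling monteCarloPiSketch(1e6) is a quicker way to check that the function works before running the full 1 billion throws.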
Parallel Computing Toolbox
The Parallel Computing Toolbox is part of the total academic license, which entitles a user to open a matlabpool of up to 12 workers on a single node. Because the toolbox opens all of its workers on the same node, it is important to request the appropriate number of slots. For example, if you are going to open 8 workers, you should request 9 slots (1 for the master Matlab session and 8 for the workers) as follows:
Xinteractive -n 9 -R 'span[ptile=9]'
Here the -n 9 requests 9 slots, and the -R 'span[ptile=9]' ensures they are all placed on the same host. If submitting to the batch queues, your job script should appear similar to the following:
#BSUB -q medium
#BSUB -J Matlab_job
#BSUB -oo MatJob-%J.out
#BSUB -eo MatJob-%J.err
#BSUB -n 9
#BSUB -R 'span[ptile=9]'
module add matlab/2012b
matlab -nodisplay -nodesktop -nosplash -r my_matlab_m_file
A parallel session is started by using the matlabpool command, in this case opening up the default local configuration:
>> matlabpool open 4
Starting matlabpool using the 'local' configuration ... connected to 4 labs.
>> matlabpool open 8
Starting matlabpool using the 'local' configuration ... connected to 8 labs.
Once the parallel part of the work has completed, the matlabpool should be closed with the matlabpool close command:
>> matlabpool close
Sending a stop signal to all the labs ... stopped.
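For a batch job, the open and close steps would normally live inside the script that Matlab runs. The following is a sketch of what my_matlab_m_file.m might contain; the contents are hypothetical and the parfor loop is only a placeholder workload. It assumes 9 slots were requested, as in the job script above.

```matlab
% Hypothetical contents of my_matlab_m_file.m, for illustration only.

% matlabpool('size') returns 0 when no pool is open.
if matlabpool('size') == 0
    matlabpool open 8        % 8 workers: one fewer than the 9 slots requested
end

try
    % Placeholder workload: sum of squares as a parfor reduction.
    total = 0;
    parfor i = 1:1000
        total = total + i^2;
    end
    fprintf('Sum of squares: %d\n', total);
catch err
    % Report the failure but still fall through to close the pool.
    disp(getReport(err));
end

matlabpool close
```

Wrapping the work in try/catch means the pool is closed even if the parallel section fails, so the workers do not stay open for the rest of the job.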
Distributed Computing Server
There are 32 licenses for the Distributed Computing Server, which means up to 32 workers can be opened up across the cluster.
The Distributed Computing Server allows you to open workers on other nodes through the LSF queuing system by using matlabpool. Usage is essentially the same as with the Parallel Computing Toolbox, but instead of the default local configuration, matlabpool is given a specific LSF configuration which submits the workers to the queue as a job. Because the workers are managed by LSF as a separate job from the master Matlab session, you do not need to request additional slots for them, unlike with the Parallel Computing Toolbox.
Before trying to open the LSF pool for the first time, you need to import the LSF configuration.
- In the Matlab window, go to Parallel > Manage Configurations.
- In the Configurations Manager that appears (which should contain only the default local config), go to File > Import.
- In the Import Configuration window navigate to /gpfs/grace/matlab/2012b-DCS/
- Select LSF.settings and click import.
- In the Configurations Manager there should now be an LSF config.
You can test the config by highlighting the LSF config and clicking "Start Validation". Validation takes a few minutes to complete, but is useful for troubleshooting. If the test stages do not all return a green "succeeded" tick (four in total), please email firstname.lastname@example.org. Once imported, the configuration will be available to you in future sessions.
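The import can probably also be done from the Matlab prompt rather than through the menus. The sketch below assumes a release in which parallel.importProfile is available (R2012a or later) and uses the path and file name from the steps above. It has not been checked against the Grace installation, and it does not run the validation.

```matlab
% Path and file name are taken from the import steps above.
settingsFile = '/gpfs/grace/matlab/2012b-DCS/LSF.settings';

% parallel.importProfile returns the name of the imported profile.
profileName = parallel.importProfile(settingsFile);
fprintf('Imported profile: %s\n', profileName);
```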
To open the pool, pass the LSF configuration name and the number of workers to matlabpool. It can take some time to open the workers and return to the Matlab prompt, after which Matlab reports how many workers have been opened:
>> matlabpool open LSF 16
Starting matlabpool using the 'LSF' profile ... connected to 16 workers.
Running bjobs -X outside Matlab will show you that these workers have been opened as a separate LSF job:
$ bjobs -X
JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
1454749 cc RUN interactiv login00 cn134 /bin/bash Jan 23 18:02
1454755 cc RUN medium cn134 4*cn134 Job1 Jan 23 18:17
This can be confirmed from within Matlab by using the spmd command (Single Program Multiple Data) to run the system command 'hostname' on each worker lab:
>> spmd, [labindex system('hostname')];,end
Lab 1: cn134.private.dns.zone
Lab 2: cn134.private.dns.zone
Lab 3: cn134.private.dns.zone
Lab 4: cn134.private.dns.zone
Lab 8: cn133.private.dns.zone
Lab 9: cn227.private.dns.zone
Lab 10: cn227.private.dns.zone
Lab 14: cn227.private.dns.zone
Lab 5: cn136.private.dns.zone
Lab 6: cn136.private.dns.zone
Lab 7: cn136.private.dns.zone
Lab 11: cn227.private.dns.zone
Lab 12: cn227.private.dns.zone
Lab 13: cn227.private.dns.zone
Lab 15: cn227.private.dns.zone
Lab 16: cn227.private.dns.zone
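The lines above appear in whatever order the workers happen to respond. If an ordered list is preferred, one option is to capture each hostname in a variable inside the spmd block and print the results from the client. This is a sketch under the assumption that a pool is already open; the variable name host is illustrative.

```matlab
spmd
    % Runs on every worker; the second output captures the command's text.
    [status, host] = system('hostname');
    host = strtrim(host);            % drop the trailing newline
end

% On the client, host is a Composite with one entry per worker.
for k = 1:numel(host)
    fprintf('Lab %d: %s\n', k, host{k});
end
```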
By default, parallel jobs started with the LSF configuration are submitted to the medium queue.
Running tasks in parallel
An example of running the Monte Carlo task above in standard Matlab, on a local configuration with the Parallel Computing Toolbox, and on an LSF configuration with the Distributed Computing Server can be seen below.
Run time for the same experiment is reduced from approximately 92 seconds in standard Matlab, to 16 seconds in parallel locally (a speed-up of roughly 5.8x), to 8 seconds in parallel over LSF (roughly 11.5x).
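A comparison of this kind could be reproduced with tic and toc, roughly as sketched below. The function names and the worker counts (8 local, 16 over LSF) are assumptions, since the pool sizes behind the figures quoted here are not stated, and actual timings will vary with cluster load.

```matlab
function timings = comparePoolTimingsSketch(numThrows)
% COMPAREPOOLTIMINGSSKETCH Time the dart-throwing loop in three settings.
%   Hypothetical illustration. Returns [serial, local, LSF] times in seconds.
if matlabpool('size') > 0
    matlabpool close             % make sure the first run is truly serial
end
timings = zeros(1, 3);

% 1. No pool open: parfor runs as an ordinary for loop.
tic; countHits(numThrows); timings(1) = toc;

% 2. Local configuration (assumed 8 workers on the current node).
matlabpool open 8
tic; countHits(numThrows); timings(2) = toc;
matlabpool close

% 3. LSF configuration (assumed 16 workers across the cluster).
matlabpool open LSF 16
tic; countHits(numThrows); timings(3) = toc;
matlabpool close

fprintf('serial %.1fs, local %.1fs, LSF %.1fs\n', timings);
end

function hits = countHits(n)
% Count darts that land inside the quarter circle of radius 1.
hits = 0;
parfor i = 1:n
    hits = hits + (rand^2 + rand^2 <= 1);
end
end
```

The time taken to open each pool is deliberately left outside the tic/toc pairs, so the figures reflect only the loop itself.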