Developed: September 2014 - December 2014
Tools used: C, OpenMP, MPI, OpenCL
For the High Performance Computing unit in my third year at Bristol, we were tasked with parallelising a Lattice Boltzmann method code using a number of different techniques. We were given access to BlueCrystal, Bristol's own supercomputer, to run our code.
To reduce the execution time of the base code, I implemented and tested a number of serial optimisations (such as rearranging the memory layout, rewriting certain functions, and combining and reducing loops) and determined which of them produced a significant speed-up.
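As an illustration of what rearranging the memory layout can look like, here is a minimal sketch of moving from an array-of-structures to a structure-of-arrays layout. It is not the coursework code: the 9-speed (D2Q9) lattice, the type `cell_aos`, and the functions `aos_to_soa` and `cell_densities_soa` are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define NSPEEDS 9 /* assumption: a D2Q9 lattice with 9 speeds per cell */

/* Array-of-structures: the 9 speeds of one cell sit next to each other. */
typedef struct {
    float speeds[NSPEEDS];
} cell_aos;

/* Copy into a structure-of-arrays layout (hypothetical helper):
 * soa[k * ncells + i] holds speed k of cell i, so the values of one
 * speed for neighbouring cells are contiguous in memory. */
void aos_to_soa(const cell_aos *aos, float *soa, size_t ncells)
{
    assert(aos != NULL && soa != NULL);
    for (size_t i = 0; i < ncells; i++) {
        for (int k = 0; k < NSPEEDS; k++) {
            soa[(size_t)k * ncells + i] = aos[i].speeds[k];
        }
    }
}

/* Density of each cell is the sum of its speeds. With the SoA layout
 * the inner loop walks memory with stride 1, which is easier for the
 * compiler to vectorise. */
void cell_densities_soa(const float *soa, float *density, size_t ncells)
{
    for (size_t i = 0; i < ncells; i++) {
        density[i] = 0.0f;
    }
    for (int k = 0; k < NSPEEDS; k++) {
        for (size_t i = 0; i < ncells; i++) {
            density[i] += soa[(size_t)k * ncells + i];
        }
    }
}
```

Whether a change like this actually pays off depends on the access pattern of each function, which is why every optimisation had to be timed rather than assumed.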
Across the assignments in the unit, I used OpenMP to run on up to 16 cores on a single node, MPI to run on up to 64 cores across 4 nodes, and OpenCL to take advantage of GPUs. Each technology posed its own problems: understanding OpenMP's pragma directives, implementing a halo exchange on top of MPI's message passing, and designing and implementing kernels for OpenCL.
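To give a flavour of the OpenMP pragma directives, the sketch below parallelises a simple sum over the grid with a reduction. The function `total_density` and its arguments are hypothetical and chosen only for illustration; they are not taken from the coursework code.

```c
#include <stddef.h>

/* Sum the density of every cell (hypothetical helper).
 * The pragma splits the loop iterations across threads; the reduction
 * clause gives each thread a private copy of total and adds the copies
 * together at the end, which avoids a data race on the shared variable. */
double total_density(const float *density, size_t ncells)
{
    double total = 0.0;
    long n = (long)ncells; /* signed loop index for older OpenMP versions */

    #pragma omp parallel for reduction(+:total) schedule(static)
    for (long i = 0; i < n; i++) {
        total += density[i];
    }
    return total;
}
```

Compiled with an OpenMP flag such as `-fopenmp` on GCC, the loop runs in parallel; without the flag the pragma is ignored and the same code runs serially, which is handy for checking that both versions agree.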