One of the greatest challenges in running parallel applications on large numbers of processors is handling file IO. Standard Unix IO routines were not designed with parallelism in mind, and IO overheads can grow to dominate the overall runtime. Parallel file systems are optimised for high bandwidth on large volumes of data, but performance can be very poor if every process opens its own file or if all IO is funnelled through a single controller process.
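To make the second of these patterns concrete, the sketch below (in C, with an invented buffer size and file name) shows IO funnelled through a single controller process: rank 0 gathers all the data and writes it out serially, so the memory footprint and write time on rank 0 grow with the process count and none of the file system's parallelism is exploited.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define NLOCAL 1000000   /* doubles per process: an invented size for illustration */

int main(int argc, char **argv)
{
    int rank, size;
    double *local, *global = NULL;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    local = malloc(NLOCAL * sizeof(double));
    for (int i = 0; i < NLOCAL; i++) local[i] = rank;   /* dummy data */

    if (rank == 0) global = malloc((size_t)size * NLOCAL * sizeof(double));

    /* All data is funnelled to rank 0 ... */
    MPI_Gather(local, NLOCAL, MPI_DOUBLE,
               global, NLOCAL, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    /* ... which then performs one large serial write. */
    if (rank == 0) {
        FILE *fp = fopen("serial.dat", "wb");
        fwrite(global, sizeof(double), (size_t)size * NLOCAL, fp);
        fclose(fp);
        free(global);
    }

    free(local);
    MPI_Finalize();
    return 0;
}
```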
This hands-on course explores a range of issues related to parallel IO. It uses ARCHER2 and its parallel Lustre file system as a platform for the exercises; however, almost all the IO concepts and performance considerations are applicable to any parallel system.
We will give a general overview of the Lustre file system and of how parallel IO is implemented in MPI-IO, since these are the routines ultimately used by many higher-level libraries such as HDF5 and NetCDF. A good understanding of the performance characteristics of MPI-IO is therefore very useful when optimising the IO performance of most parallel applications.
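As a taster of the style of IO the course measures, here is a minimal MPI-IO sketch in C: every process writes its own contiguous block of a single shared file using a collective call. The file name and the amount of data per process are purely illustrative.

```c
#include <mpi.h>
#include <stdlib.h>

#define NLOCAL 1000000   /* doubles per process; illustrative only */

int main(int argc, char **argv)
{
    int rank;
    double *local;
    MPI_File fh;
    MPI_Offset offset;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    local = malloc(NLOCAL * sizeof(double));
    for (int i = 0; i < NLOCAL; i++) local[i] = rank;   /* dummy data */

    /* Each rank writes its block at a rank-dependent offset in one shared file */
    offset = (MPI_Offset)rank * NLOCAL * sizeof(double);

    MPI_File_open(MPI_COMM_WORLD, "shared.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Collective write: lets the MPI library aggregate IO across ranks */
    MPI_File_write_at_all(fh, offset, local, NLOCAL, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    free(local);
    MPI_Finalize();
    return 0;
}
```

Unlike the funnelled version, this produces a single shared file while still allowing the MPI library to drive the file system from many processes at once.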
The course does not teach the detailed syntax of the various parallel IO libraries, but the Fortran source code provided for the benchmarking application used in the practical sessions should be useful reference material.
All attendees will be given access to ARCHER2 for the duration of the course.
Prerequisites: The course assumes an understanding of basic MPI programming in C, C++ or Fortran. Knowledge of MPI derived datatypes would be useful but not essential.
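For those who have not met derived datatypes, the sketch below shows the pattern in which they typically appear in parallel IO: MPI_Type_create_subarray describes where a process's local block sits within the global array, and MPI_File_set_view applies that layout to the file so that a simple collective write puts everything in the right place. The 2x2 decomposition and array sizes are invented for illustration, and the code assumes exactly four processes.

```c
#include <mpi.h>

/* Illustrative sizes: a 512x512 global array split into 2x2 blocks of 256x256 */
#define NG 512
#define NL 256

int main(int argc, char **argv)
{
    int rank, size;
    int gsizes[2] = {NG, NG}, lsizes[2] = {NL, NL}, starts[2];
    static double local[NL][NL];
    MPI_Datatype filetype;
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size != 4) MPI_Abort(MPI_COMM_WORLD, 1);   /* sketch assumes 4 ranks */

    /* Where this rank's block starts in the global array (2x2 process grid) */
    starts[0] = (rank / 2) * NL;
    starts[1] = (rank % 2) * NL;

    for (int i = 0; i < NL; i++)
        for (int j = 0; j < NL; j++) local[i][j] = rank;   /* dummy data */

    /* Derived datatype describing the position of the local block in the file */
    MPI_Type_create_subarray(2, gsizes, lsizes, starts,
                             MPI_ORDER_C, MPI_DOUBLE, &filetype);
    MPI_Type_commit(&filetype);

    MPI_File_open(MPI_COMM_WORLD, "array.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_set_view(fh, 0, MPI_DOUBLE, filetype, "native", MPI_INFO_NULL);

    /* Collective write: each rank supplies its block, MPI places it correctly */
    MPI_File_write_all(fh, &local[0][0], NL * NL, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Type_free(&filetype);
    MPI_Finalize();
    return 0;
}
```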
Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) that they have administrative privileges on.
They are also required to abide by the ARCHER2 Code of Conduct.
This is still a draft course page and the material below comes from a previous run of this course. It will be updated for this run, but is made available here for reference.
Unless otherwise indicated all material is Copyright © EPCC, The University of Edinburgh, and is only made available for private study.
- 09:30 - 09:40 : ARCHER2 Training
- 09:40 - 10:15 : Challenges of Parallel IO
- 10:15 - 10:45 : Lustre file system on ARCHER2
- 10:45 - 11:00 : Practical: Basic IO performance
- 11:00 - 11:30 : Break
- 11:30 - 12:00 : Practical: Basic IO performance (cont)
- 12:00 - 12:45 : Overview of MPI-IO
- 12:45 - 13:00 : Practical: MPI-IO performance
- 13:00 - 14:00 : Lunch
- 14:00 - 14:30 : Configuring the Lustre file system (see the striping sketch after this timetable)
- 14:30 - 15:00 : Practical: MPI-IO performance (cont)
- 15:00 - 15:30 : Higher-level parallel IO libraries
- 15:30 - 16:00 : Break
- 16:00 - 16:30 : Q&A / Finish exercises
- 16:30 : CLOSE
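As a preview of the Lustre configuration session, striping is normally controlled from the command line with lfs setstripe and inspected with lfs getstripe; it can also be requested when a file is created through MPI-IO, using the reserved hints striping_factor and striping_unit as in the minimal sketch below. Whether these hints are honoured depends on the MPI implementation and the file system, and they only take effect when the file is first created.

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_File fh;
    MPI_Info info;

    MPI_Init(&argc, &argv);

    MPI_Info_create(&info);
    /* Reserved MPI-IO hints; honoured only if the implementation and file
       system support them, and only at file creation time. */
    MPI_Info_set(info, "striping_factor", "8");      /* stripe over 8 OSTs */
    MPI_Info_set(info, "striping_unit", "1048576");  /* 1 MiB stripe size  */

    MPI_File_open(MPI_COMM_WORLD, "striped.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

    MPI_File_close(&fh);
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}
```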
Here are a couple of reports on parallel IO on ARCHER2, including results from benchio. The first report was written before we knew about MPI-IO locking mode "2"; the second was written afterwards. A sketch of how this locking mode can be selected is given after the reports.
- Performance of Parallel IO on the 5860-node HPE Cray EX System ARCHER2, D. Henty, presented at CUG2022, The Cray User Group, Monterey, CA, 2-5 May 2022.
- Parallel IO on ARCHER2, EuroCC-UK Technical Report, S. Farr and D. Henty.
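For completeness, the locking mode referred to above is selected through an MPI-IO hint in HPE Cray MPICH; on ARCHER2 this is usually done via the MPICH_MPIIO_HINTS environment variable, but the sketch below instead sets it through an info object. The hint name cray_cb_write_lock_mode is our recollection of the Cray-specific key (value 2 requesting Lustre lockahead locking), so check the intro_mpi man page on your system before relying on it.

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_File fh;
    MPI_Info info;

    MPI_Init(&argc, &argv);

    MPI_Info_create(&info);
    /* Cray-specific hint (assumption: see intro_mpi on an HPE Cray system);
       value 2 requests the lockahead locking strategy on Lustre. */
    MPI_Info_set(info, "cray_cb_write_lock_mode", "2");

    MPI_File_open(MPI_COMM_WORLD, "locked.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

    MPI_File_close(&fh);
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}
```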
Slides and a video recording of the virtual tutorial containing detailed results of running benchio on ARCHER2 are available from the ARCHER2 course repository.
Here is the parallel IO exercise sheet. As explained in the sheet, source code and instructions for the IO benchmark can be found at https://github.com/davidhenty/benchio/.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.