❗ The repository was restructured and cleaned up and in this process the history was rewritten. Please make fresh forks and clones. |
---|
MD-Bench is a toolbox for the performance engineering of short-range force calculation kernels on molecular-dynamics applications. It aims at covering state-of-the-art algorithms from different community codes such as LAMMPS and GROMACS.
Clone the repository from GitHub:
git clone https://github.com/RRZE-HPC/MD-Bench.git
Edit config.mk and configure the compiler toolchain
# Compiler tool chain (GCC/CLANG/ICC/ICX/ONEAPI/NVCC)
TOOLCHAIN ?= CLANG
Best supported are ICC (deprecated legacy Intel compiler) and ICX (LLVM based Intel compiler). Choose NVCC to enable CUDA GPU kernels.
The toolchain settings are located in the ./make
directory. Review the
settings for the configured toolchain. You can configure different settings in
config.mk
, for starters on a X86 based system the defaults are fine.
To build the binary call, don't forget to load the compiler module on the
NHR@FAU clusters (e.g. module load intel
):
make
While the Makefile works with any version of GNU make, some features require GNU make > v4.
You can run MD-Bench without any arguments:
./MDBench-VL-ICX-X86-AVX2-DP
MD-Bench uses a Makefile with pattern rules and automatic dependency generation.
If you add source files you do not need to change the Makefile as long as the
sources are placed either in the ./src/verletlist/
, ./src/clusterpair/
or
./src/common
directories. If you change a file, all object files that depend
on it are rebuild.
All configuration variables can be overwritten from the command line, e.g. to
build with ICC without changing ./config.mk
build with:
make TOOLCHAIN=ICC
Multiple configurations can be build at the same time. Every configuration has a
unique binary name ./MDBENCH-<build tag>
. Intermediate build results are
located in a ./build/build-<build tag>/
directory.
All make targets act on the current configuration set in ./config.mk
, but this
can be of course overwritten on the command line.
Supported make targets:
make
: Build the binary for current configuration.make clean
: Remove intermediate build results.make distclean
: Remove intermediate build results and binary. Also removes generated tags and clangd files, more on that later.make cleanall
: Remove all generated files. Note: This target applies to all configurations.make info
Output compiler version, useful for logging in automated benchmark scripts.make asm
: Generate assembly output of all source files. The assembly files are placed in the intermediate build directory.make format
: Reformat all source files withclang-format
using the format specification in.clang-format
.
TOOLCHAIN
: Determines which toolchain makefile is includedISA
: No usage apart from tag stringsSIMD
: Controls the generation of intrinsic kernels for clusterpairOPT_SCHEME
: Algorithmic variant (verletlist or clusterpair), different source directories and main routines are usedENABLE_LIKWID
: Turn on LIKWID instrumentation, the LIKWID library has to be availableDATA_TYPE
: Switch between single precision and double precision floating point. This is controlled by defines.DATA_LAYOUT
: Switch between array-of-structure (AOS) and structure-of-array (SOA) layout for atom positions and forces. Tradeoff between better cache utilisation and easier SIMD vectorization.DEBUG
: Enable additional debug outputSORT_ATOMS
: Resort atoms to ensure that atoms that are nearby are also close to each other in the data structuresONE_ATOM_TYPE
: Simulate only one atom type and do not perform table lookup for parameters.ENABLE_OMP_SIMD
: This enforces the use of#pragma omp simd
for the verletlist half-neighbour list force kernel. Without is the Intel compiler (at least ICC) refuses to do SIMD vectorization.USE_REFERENCE_VERSION
: Enforce usage of C implementation for clusterpair algorithm for validationUSE_CUDA_HOST_MEMORY
: Enable pinned host memory for faster host-device transfersENABLE_MPI:
Turn on the MPI parallel version of the code
MD-Bench currently only supports Nvidia GPUs using CUDA kernels. To enable CUDA
kernels you need to specify NVCC
as toolchain. The CUDA source code is in the
same source directories with Cuda suffix and .cu
as file type ending. If
NVCC
is set as toolchain, all supported kernels are automatically set to their
CUDA variants at build time. This means a binary either supports CPU kernels or
GPU kernels.
MD-Bench can be executed without any arguments, in this case the full neighbor list testcase with LJ force will be computed for 200 steps and a size of 32x32x32 unit cells.
-p / --params <string>
: file to read parameters from (can be specified more than once). Default initialization sets parameters for default LJ testcase. *-f <string>
: force field (lj, eam), default lj. For anything different than lj you also need to provide spcific parameter file.-i <string>
: input file with atom positions (dump). MD-Bench supports Brookhaven protein data bank (.pdb), GROMACS GROMOS87 (.gro), and LAMMPS dump (.dmp) file formats-e <string>
: input file for EAM parameters-n / --nsteps <int>
: set number of timesteps for simulation (default 200)-nx/-ny/-nz <int>
: set linear dimension of systembox in x/y/z direction (default 32 in every dimension)-half <int>
: use half (1) or full (0) neighbor lists (default 0 - full neighbor list)-r / --radius <real>
: set cutoff radius (default 2.5)-s / --skin <real>
: set skin (verlet buffer, default 0.3)-w <file>
: write input atoms to file--freq <real>
: processor frequency (GHz), used to calculate cycle metrics (default 2.4)--vtk <string>
: VTK output file for visualization
For all variants you can switch between single precision and double precision
and between AOS versus SOA data layouts using build time options. You can use
the half neighbour list algorithm instead of the default full neighbour list by
setting -half 1
. To enforce SIMD vectorization for the half neighbour list
algorithm you can set the option ENABLE_OMP_SIMD=true
.
Just start without any command line argument, this is the default testcase. You
may change the number of timesteps using the -n
options and change the problem
size using the -nz, -ny, -nz
options.
Call MD-Bench as follows:
./MDBench-<TAG> -n 400 -f eam -e ./data/Cu_u3.eam
Two different EAM variants are available: Cu_u3.eam
and Cu_u6.eam
. The EAM
potential is currently only available for verletlist.
The melted copper testcase has only 32000 atoms in the default configuration. Call MD-Bench as follows:
./MDBench-<TAG> -n 400 -i ./data/copper_melting/input_lj_cu_one_atomtype_20x20x20.dmp
Compile MD-Bench with EXPLICIT_TYPES=true
in config.mk
.
Call MD-Bench as follows:
./MDBench-<TAG> -n 400 -i ./data/copper_melting/input_lj_cu_one_atomtype_20x20x20.dmp
This testcase currently segvaults!
Call MD-Bench as follows:
./MDBench-<TAG> -n 400 -f eam -e ./data/Cu_u3.eam -i ./data/copper_melting/input_eam_cu_one_atomtype_20x20x20.dmp
Two different EAM variants are available: Cu_u3.eam
and Cu_u6.eam
. The EAM
potential is currently only available for verletlist.t.
Call MD-Bench as follows:
./MDBench-<TAG> -i ./data/argon/input.gro -p ./data/argon/mdbench_params.conf
Rafael Ravedutti Lucio Machado, Jan Eitzinger, Jan Laukemann, Georg Hager, Harald Köstler and Gerhard Wellein: MD-Bench: A performance-focused prototyping harness for state-of-the-art short-range molecular dynamics algorithms. Future Generation Computer Systems (FGCS), Volume 149, 2023, Pages 25-38, ISSN 0167-739X, DOI: https://doi.org/10.1016/j.future.2023.06.023
Rafael Ravedutti Lucio Machado, Jan Eitzinger, Harald Köstler, and Gerhard Wellein: MD-Bench: A generic proxy-app toolbox for state-of-the-art molecular dynamics algorithms. Accepted for PPAM 2022, the 14th International Conference on Parallel Processing and Applied Mathematics, Gdansk, Poland, September 11-14, 2022. PPAM 2022 Best Paper Award. Preprint: arXiv:2207.13094, DOI: https://dl.acm.org/doi/10.1007/978-3-031-30442-2_24
MD-Bench is developed by the Erlangen National High Performance Computing Center (NHR@FAU) at the University of Erlangen-Nürnberg.