Score-P UCX data acquisition plugin
First, download, build, and install the Huawei-updated Score-P (its changes are currently not supported in the official Score-P release):
git clone https://github.com/shuki-zanyovka/scorep_6.0
Then download the UCX plugin:
git clone --recurse-submodules https://github.com/shuki-zanyovka/scorep_plugin_ucx
cd scorep_plugin_ucx
To build:
mkdir BUILD
cd BUILD
cmake ../ -DCMAKE_C_STANDARD_COMPUTED_DEFAULT=GNU -DCMAKE_CXX_STANDARD_COMPUTED_DEFAULT=GNU -DCMAKE_CXX_COMPILER=mpic++ -DCMAKE_C_COMPILER=mpicc
make
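After `make` finishes, it can be worth confirming that the plugin library actually landed in the build directory. The sketch below is only an illustration: it assumes the library is named `libscorep_plugin_ucx.so` (Score-P loads a metric plugin called NAME from `libNAME.so`), and the `find_plugin_lib` helper is hypothetical, not part of this repository.

```shell
#!/bin/sh
# Hypothetical helper: print the path of the plugin library found under a
# build directory, or return non-zero if none is found.
# Assumption: the library is called libscorep_plugin_ucx.so.
find_plugin_lib() {
    build_dir="$1"
    lib=$(find "$build_dir" -name 'libscorep_plugin_ucx.so' -type f 2>/dev/null | head -n 1)
    [ -n "$lib" ] && printf '%s\n' "$lib"
}

# Usage (run from the plugin source tree):
#   find_plugin_lib ./BUILD || echo "plugin library not found" >&2
```

The directory printed here is the one SCOREP_PLUGIN_UCX_PATH should point to, so that the dynamic loader can find the plugin through LD_LIBRARY_PATH.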
To use the plugin with OpenMPI, review and apply all patches under the ./openmpi-patches directory (they are already applied in HMPI).
Then set the following environment variables and run the application:
export LOCAL_USER_SCOREP_INSTALL_PATH=<Score-P 6.0 Install Path>/scorep-6.0
export SCOREP_PLUGIN_UCX_PATH=<Plugin Path>/scorep_plugin_ucx/BUILD
export OMPI_PATH=<OpenMPI Install Path>
export UCX_INSTALL_PATH=<UCX installation path, profiling enabled>/ucx-prof
export UCS_LIB_PATH="$UCX_INSTALL_PATH/lib:$UCX_INSTALL_PATH/lib/ucx"
export PATH=$OMPI_PATH/bin:$LOCAL_USER_SCOREP_INSTALL_PATH/bin:$PATH
export LD_LIBRARY_PATH=$SCOREP_PLUGIN_UCX_PATH:$SCOREP_PLUGIN_MPI_PATH:$OMPI_PATH:$UCS_LIB_PATH:$LOCAL_USER_SCOREP_INSTALL_PATH:$LD_LIBRARY_PATH
export SCOREP_METRIC_PLUGINS="scorep_plugin_ucx"
export SCOREP_METRIC_SCOREP_PLUGIN_UCX=UCX@20
export SCOREP_ENABLE_PROFILING=false
export SCOREP_ENABLE_TRACING=true
export SCOREP_TOTAL_MEMORY=4000M
export SCOREP_FILTERING_FILE=./filter.scorep
export UCX_STATS_FILTER="rx_am*,bytes_short,bytes_bcopy,bytes_zcopy,rx*,tx*"
export UCX_STATS_DEST="udp:localhost:37873"
# Overrides the SCOREP_TOTAL_MEMORY=4000M value set above (the last export wins).
export SCOREP_TOTAL_MEMORY=1000M
mpirun -n 2 <mpi_application>
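Because the run depends on many exported variables, a quick pre-flight check before `mpirun` can catch a forgotten export. The following is a minimal sketch; the `check_vars` helper is hypothetical, and the list of variables to check is whatever subset of the exports above matters for your setup.

```shell
#!/bin/sh
# Hypothetical helper: report every named environment variable that is unset
# or empty. Returns 0 only when all of them are set.
check_vars() {
    missing=0
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage with variables from the listing above:
#   check_vars LOCAL_USER_SCOREP_INSTALL_PATH SCOREP_PLUGIN_UCX_PATH \
#              OMPI_PATH UCX_INSTALL_PATH SCOREP_METRIC_PLUGINS \
#       && mpirun -n 2 <mpi_application>
```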
By default, the Score-P UCX plugin uses the UCX aggregate-sum counter statistics. This means that:
- All counter values of the same class/type are saved in the same counter in the trace.
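To make "aggregate-sum" concrete, the sketch below sums made-up counter samples by counter name, so that several instances of the same counter class collapse into a single value. The input format (`<instance> <counter> <value>` per line), the `aggregate_sum` function, and the numbers are purely illustrative assumptions; they are not the plugin's or UCX's actual data format.

```shell
#!/bin/sh
# Illustration only: sum values per counter name across instances.
# Input lines: "<instance> <counter> <value>" (hypothetical format).
aggregate_sum() {
    awk '{ sum[$2] += $3 } END { for (c in sum) print c, sum[c] }' | sort
}

# Two endpoints each report bytes_short; after aggregation there is a single
# bytes_short entry holding their sum.
printf '%s\n' \
    'ep0 bytes_short 100' \
    'ep1 bytes_short 50' \
    'ep0 bytes_zcopy 4096' | aggregate_sum
```

In this toy input, the two `bytes_short` samples end up as one entry, which mirrors how counters of the same class share one counter in the trace.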
Note that the following environment variables must be set to collect the UCX software counters:
# Enable UCX SW counters collection.
export SCOREP_UCX_PLUGIN_UCX_COLLECTION_ENABLE=1
# Disable NIC counters collection.
export SCOREP_UCX_PLUGIN_NIC_COLLECTION_ENABLE=0
In a separate run, the user can instead collect the aggregate-sum of the NVIDIA ConnectX-* NIC counters by setting the following environment variables:
# Interface name; it can be obtained with the 'ibdev2netdev' command, as shown in the example below.
export SCOREP_UCX_PLUGIN_NIC_DEVICE_NAME="ens2f0"
# Disable UCX SW counters collection.
export SCOREP_UCX_PLUGIN_UCX_COLLECTION_ENABLE=0
# Enable NIC counters collection.
export SCOREP_UCX_PLUGIN_NIC_COLLECTION_ENABLE=1
$ ibdev2netdev
mlx5_0 port 1 ==> ens3f0 (Up)
mlx5_1 port 1 ==> ib0 (Down)
mlx5_2 port 1 ==> ens2f0 (Up)
mlx5_3 port 1 ==> ib1 (Down)
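The interfaces that are candidates for SCOREP_UCX_PLUGIN_NIC_DEVICE_NAME can be filtered out of this output with a small script. This is a sketch that assumes the line layout shown above (`<dev> port <n> ==> <ifname> (<state>)`); the `up_interfaces` helper is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read ibdev2netdev-style output on stdin and print the
# network interface names whose link state is "(Up)".
# Assumes the layout: "<dev> port <n> ==> <ifname> (<state>)".
up_interfaces() {
    awk '$4 == "==>" && $6 == "(Up)" { print $5 }'
}

# Usage on a host where ibdev2netdev is installed:
#   ibdev2netdev | up_interfaces
```

With the example output above, this lists ens3f0 and ens2f0; choosing the one that actually carries the application's traffic is still up to the user.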
Note that the NIC counters feature is disabled by default and is only supported over HUCX: https://github.com/kunpengcompute/hucx
It can be enabled by editing src/scorep_plugin_ucx_config.h, enabling the UCX_STATS_NIC_COUNTERS_ENABLE compilation flag, and rebuilding the plugin.