This is a thread to discuss performance results from our machine.
16 x Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz
128 GB Ram
Red Hat Enterprise Linux Server release 6.5 (Santiago)
gcc (GCC) 5.3.0
All tests ran on a single node.
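For reference, whether each build has MPI_THREAD_MULTIPLE enabled can be checked with ompi_info (just a sanity check on the test setup; the configure flags used for these builds aren't listed here) :
Command : ompi_info | grep -i thread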
(OUTDATED) - Newer results are available in a post below
Single threaded performance : 5-7% degradation
We see a slight performance degradation for smaller message sizes. This is probably because the atomics macros we use still perform atomic operations even in the single-threaded case. @hjelmn 's pull request open-mpi/ompi#1911 will probably solve it (a rough sketch of the idea is below the edit notes).
Edit : The PR was just merged; newer results are in a post below.
Edit : The master tested in these graphs is at git hash open-mpi/ompi@9807a6d.
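To illustrate the atomics point above, here is a minimal sketch of the general pattern, not the actual OPAL macro from the PR : using_threads is a hypothetical stand-in for Open MPI's internal opal_using_threads() check, and the GCC __atomic builtin is used here instead of OPAL's own wrappers.
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for Open MPI's "are we running multithreaded?" check. */
extern bool using_threads;

/* Only pay for the atomic read-modify-write when threads are actually in use;
 * in the single-threaded case a plain update is enough. */
static inline int32_t thread_add32(volatile int32_t *addr, int32_t delta)
{
    if (using_threads) {
        return __atomic_add_fetch(addr, delta, __ATOMIC_SEQ_CST);
    }
    return (*addr += delta);
}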
Multithreaded Performance : Big improvement
Command line : mpirun -np 2 -mca pml ob1 -mca btl vader,self --bind-to socket ./mr_th_nb -S -t x -s size
We see some performance gain over 1.10.3. While I'm not sure why we see this gain even though the threads are bound to the socket, this does look promising.
When I increase the thread count to 8, 1.10.3 can't run to completion at all, but 2.0.0 and master still can.
@artpol84 Please look at my command-line args and tell me if I'm running this correctly.
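One extra check I can run if useful (not part of the results above) is mpirun's --report-bindings option, to confirm where the processes actually land :
Command : mpirun -np 2 --report-bindings -mca pml ob1 -mca btl vader,self --bind-to socket ./mr_th_nb -S -t x -s size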
My lousy run script :
#!/bin/bash
# Benchmark each Open MPI installation; $1 is the thread count passed to -t.
OPT_DIR=$HOME/opt
declare -a imp=("1.10.3" "2.0.0" "master")

for MPI in "${imp[@]}"
do
    # Point the "mpi" symlink at the installation under test.
    rm -f "$OPT_DIR/mpi"
    ln -s "$OPT_DIR/ompi/$MPI/fast" "$OPT_DIR/mpi"
    echo "Created new MPI symlink for $OPT_DIR/ompi/$MPI/fast"

    # Rebuild the benchmark against the selected installation.
    make clean > /dev/null
    make > /dev/null
    echo "Recompiled the benchmark, start testing ..."

    # Message sizes: powers of two from 2 up to 2^20 bytes.
    let "pow = 1"
    for i in {1..20}
    do
        let "pow *= 2"
        mpirun -np 2 -mca btl vader,self -mca pml ob1 --bind-to socket \
            ./mr_th_nb -s $pow -S -t $1 >> "$MPI.$1t.result"
    done
done
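The script takes the thread count as its only argument; e.g. running it with 4 passes -t 4 to the benchmark and appends the results to 1.10.3.4t.result, 2.0.0.4t.result, and master.4t.result.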
Single threaded case : Looks better 😄
Master and 1.10.3 seem to be on par now. Thank you @hjelmn .
Multithreaded case
I realized that I missed -b for fine binding in my first post. I added the flag and retested. It looks almost the same as the last graph.
Command : mpirun -np 2 -mca btl vader,self -mca pml ob1 --bind-to socket ./mr_th_nb -s $pow -S -t $1 -b