On-the-fly load balancing #1269
-
Dear OceanParcels team, I've recently been using Parcels and really enjoying it; congratulations and keep up the good work! I'm interested in plastic pollution and computational models, and one of the main features that piqued my interest in the framework was the MPI parallelization scheme, which I was eager to try! I created a simple benchmark that advects particles in the South Atlantic for 90 days, and I wanted to test how the parallel implementation scales up. I noticed a trend of increasing memory usage for longer advection periods, just as the docs state in the MPI section. This is most likely caused by the dispersion of the particles and the resulting need for larger chunk sizes, which in turn causes load imbalance across the MPI processes. Here are my recorded results for 10^6 particles:
I checked previous issues and PRs, and saw that the initial distribution is done through K-means, but the on-the-fly redistribution is yet to be implemented. I'm very interested in contributing to and improving this great package, so I would like to test a few spatially aware load-balancing algorithms. One of my questions is: how can I manually re-assign the group of particles that each processing unit should take in its ParticleSet? I noticed that the initial distribution is done in the ParticleCollectionSOA constructor. Thanks in advance.
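For reference, here is a minimal sketch of the kind of benchmark script described at the top of this comment, assuming the Parcels v2-style API; the file name, NetCDF variable/dimension names, and release box are hypothetical placeholders:

```python
from datetime import timedelta
import numpy as np
from parcels import AdvectionRK4, FieldSet, JITParticle, ParticleSet

# Hypothetical hydrodynamic input; variable and dimension names depend on the dataset.
fieldset = FieldSet.from_netcdf(
    "south_atlantic_uv.nc",
    variables={"U": "uo", "V": "vo"},
    dimensions={"lon": "longitude", "lat": "latitude", "time": "time"},
)

# Release 10^6 particles uniformly in an (illustrative) South Atlantic box.
npart = 1_000_000
pset = ParticleSet(
    fieldset=fieldset,
    pclass=JITParticle,
    lon=np.random.uniform(-40, 10, npart),
    lat=np.random.uniform(-40, -10, npart),
)

# Under MPI (e.g. `mpirun -np 8 python benchmark.py`), Parcels splits the
# particles over the ranks before running the advection loop.
pset.execute(AdvectionRK4, runtime=timedelta(days=90), dt=timedelta(minutes=10))
```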
-
Thanks @joaoluisro, for initiating this discussion! It would be great if you could further explore on-the-fly load balancing, which is indeed something that has been on the Parcels development wishlist for a while. I think the simplest implementation of on-the-fly load balancing would be to move the code for the KMeans distribution into its own method (e.g. ParticleCollectionSOA.loadbalance()) and then call that immediately after the code that writes out particle locations in ParticleSet.execute(), with an if-statement to test whether the particles are too unbalanced. To avoid running this test too often, you could also introduce a new balancedt argument to pset.execute() that only does the test at that interval. Does that help? Can you try something with this?
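As a rough illustration of what such a loadbalance() method could do, here is a self-contained sketch assuming mpi4py and scikit-learn. It mirrors the gather-cluster-broadcast idea but is illustrative only, not the actual Parcels code:

```python
import numpy as np
from mpi4py import MPI
from sklearn.cluster import KMeans

def loadbalance(lon, lat):
    """Return the global indices of the particles this rank should own."""
    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    # Gather all particle positions on the root process ...
    coords = np.column_stack([lon, lat])
    gathered = comm.gather(coords, root=0)

    # ... cluster them into one spatially compact group per MPI rank ...
    if rank == 0:
        allcoords = np.concatenate(gathered)
        labels = KMeans(n_clusters=size, n_init=10).fit_predict(allcoords)
    else:
        labels = None

    # ... and broadcast the labels so every rank knows which particles it owns.
    labels = comm.bcast(labels, root=0)
    return np.where(labels == rank)[0]
```

Fitting KMeans only on the root and broadcasting the labels (rather than fitting independently on every rank) avoids ranks disagreeing about the partition due to KMeans' random initialization.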
-
Hi @erikvansebille, I implemented a basic prototype following your advice: a routine in the collectionSOA object that periodically re-assigns the ParticleSet data after a pre-determined number of iterations, right after writing particle locations. It does this by gathering the whole data set in the root process and broadcasting it back to the assigned processes based on a partitioning method. For testing I reinstated KMeans and also kept the initialization procedure, which seemed to have its own condition. In some experiments I got bad terminations from MPI with exit code 9, which implies incorrect memory management in my implementation. I also got a lot of … I am still very unfamiliar with the codebase and the overall Parcels structure, so I have a couple of questions:
Thanks in advance.
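The prototype above rebalances on a fixed iteration count; the if-statement Erik suggested would instead rebalance only when an explicit imbalance test fires. A minimal, self-contained sketch of such a test (the threshold is an arbitrary choice, not anything from Parcels):

```python
import numpy as np

def too_unbalanced(particles_per_rank, tol=0.25):
    """True if the busiest rank holds more than (1 + tol) times the mean load."""
    counts = np.asarray(particles_per_rank)
    return counts.max() > (1.0 + tol) * counts.mean()

# Example: one of four ranks has accumulated far more particles than the rest.
print(too_unbalanced([250_000, 240_000, 260_000, 400_000]))  # True
```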
-
As someone who took early stabs at optimal particle positioning, I would also monitor the time the processes are waiting for disk reads to complete. This strongly interacts with how the ocean data files that Parcels is reading are chunked (especially if the data is compressed). You will find much better performance if the input files are chunked (I can send you ncks scripts to do so). The details depend on the problem, but for the Mercator 1/12th-of-a-degree global model I find that chunking x and y by 256 and z by 1 works best. When you chunk, the code only reads the part of the data file that contains the needed data. Thus, if the particles tracked in each MPI process are spatially nearby, the IO time will be reduced.

This quickly becomes a complex problem space, with the optimal partitioning depending on how the data is stored.
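(The ncks scripts mentioned above are not in the thread; as one possible alternative, here is a hypothetical rechunking sketch in xarray along the suggested lines, 256 in the horizontal and 1 in depth. File, variable, and dimension names are placeholders.)

```python
import xarray as xr

ds = xr.open_dataset("mercator_uv.nc")  # hypothetical input file
target = {"time": 1, "depth": 1, "latitude": 256, "longitude": 256}

encoding = {}
for name, var in ds.data_vars.items():
    # Clip each chunk size to the dimension length so short dimensions still work.
    chunks = tuple(min(target.get(d, ds.sizes[d]), ds.sizes[d]) for d in var.dims)
    encoding[name] = {"chunksizes": chunks, "zlib": True, "complevel": 1}

ds.to_netcdf("mercator_uv_chunked.nc", encoding=encoding)
```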
Jamie Pringle
…On Tue, Dec 20, 2022 at 9:28 AM Erik van Sebille wrote:
Thanks @etodt, for following up. Yes, peak memory use and (especially) processing time would be the most important metrics.
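A small sketch (not from the thread) of how one might record those two metrics on every MPI rank, assuming mpi4py; note that ru_maxrss units differ by platform:

```python
import resource
import time
from mpi4py import MPI

comm = MPI.COMM_WORLD
start = time.perf_counter()

# ... run pset.execute(...) here ...

elapsed = time.perf_counter() - start
# ru_maxrss is reported in KiB on Linux, but in bytes on macOS.
peak_kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
print(f"rank {comm.Get_rank()}: {elapsed:.1f} s, peak {peak_kib / 1024:.0f} MiB")
```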