
Adding benchmark functionality #1711

Closed
wants to merge 3 commits

Conversation

@CKehl (Contributor) commented Sep 21, 2024

Merging CKehl's modifications for benchmarking simulations into the main branch.

@CKehl marked this pull request as draft September 21, 2024 15:18
@VeckoTheGecko (Contributor) commented:

Hi Christian. What is the context and motivation surrounding this PR?

If you would like to discuss our approach to benchmarking in Parcels, you’re welcome to comment in #1712 which I have just converted from draft to public.

@CKehl (Contributor, Author) commented Sep 23, 2024

> Hi Christian. What is the context and motivation surrounding this PR?
>
> If you would like to discuss our approach to benchmarking in Parcels, you’re welcome to comment in #1712 which I have just converted from draft to public.

Hi Vecko,

Well, as you may be aware, I wrote several papers on benchmarking Parcels for performance optimization. For that, you need very fine-grained, long-running timing information. I had set this up before, and because my current students would like to compare their simulation runtimes with Parcels, I dug out my old benchmarking code.

I am currently not sure whether I will actually wrap up this PR - you have made a lot of changes, and I don't have the time to re-code this every couple of days something changes in main. I'd be happy to share the performance code, but my time budget is limited.

On your idea of benchmarking: it depends on what you want to achieve and who your audience is. I did all of the profiling work with Parcels back in 2020 as well. The issue is: at the level of detail I needed for the benchmark (e.g. memory I/O times), any profiler slows your code down so much through the instrumentation itself that, in the end, you benchmark the time of the profiler rather than the time of the simulation. With my audience in mind (i.e. the scientific-computing community), that was a no-go back then.
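To make the overhead argument concrete, here is a minimal, self-contained sketch (purely illustrative, not code from this PR): a call-heavy loop timed bare and then under cProfile, where the per-call tracing hooks inflate the measured time.

```python
# Illustrative sketch (not code from this PR): timing the same call-heavy
# loop bare and under cProfile. The per-call tracing hooks make the
# profiled run markedly slower, so fine-grained timings end up measuring
# the profiler rather than the simulation.
import cProfile
import time


def tiny_op(x):
    # Stand-in for a fine-grained operation, e.g. a per-particle field access.
    return x + 1


def hot_loop(n=200_000):
    total = 0
    for _ in range(n):
        total = tiny_op(total)
    return total


# Bare measurement with a monotonic, high-resolution clock.
t0 = time.perf_counter()
hot_loop()
bare = time.perf_counter() - t0

# Same workload with cProfile attached; every call/return fires a hook.
profiler = cProfile.Profile()
t0 = time.perf_counter()
profiler.enable()
hot_loop()
profiler.disable()
profiled = time.perf_counter() - t0

print(f"bare: {bare:.4f}s  under cProfile: {profiled:.4f}s")
```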

Yet again: it all depends on your use case and purpose. If you just want an aggregated runtime number for broad comparison, a plain timeit would suffice, as would the Python profiler. If your main goal is integration and code development, then pytest-benchmark may be a good fit. It's up to you. That said, for actual performance comparison at deep levels in more compute-oriented disciplines, all of this is sadly insufficient, and for that purpose I developed this ParticleSet-derived class, which splits the timing into kernel-compute time, file-I/O time, memory-I/O time, and plotting time.
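A minimal sketch of the kind of subclass being described (the class name, phase names, and the choice to time execute() wholesale are illustrative assumptions, not the actual code in this PR; only parcels.ParticleSet and its execute() method are assumed):

```python
# Minimal sketch, not the PR's actual implementation: a ParticleSet
# subclass that accumulates wall-clock time per phase with perf_counter,
# avoiding per-call profiler hooks. Phase names mirror the split
# described above.
import time
from contextlib import contextmanager

from parcels import ParticleSet


class BenchmarkParticleSet(ParticleSet):
    """ParticleSet that records per-phase wall-clock timings."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Accumulated seconds per phase.
        self.timings = {"compute": 0.0, "file_io": 0.0, "mem_io": 0.0, "plot": 0.0}

    @contextmanager
    def _timed(self, phase):
        # Two perf_counter calls per phase: negligible overhead compared
        # to tracing every function call.
        t0 = time.perf_counter()
        try:
            yield
        finally:
            self.timings[phase] += time.perf_counter() - t0

    def execute(self, *args, **kwargs):
        # Coarse version: book the whole kernel loop as compute time. A
        # finer split would hook the file-I/O, memory-I/O and plotting
        # paths inside the loop in the same way.
        with self._timed("compute"):
            return super().execute(*args, **kwargs)
```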

Therefore, this PR is basically just an optional addition to the main branch - it doesn't conflict with your roadmap.

@VeckoTheGecko (Contributor) commented:

> because my current students would like to compare their simulation runtimes with Parcels ... I don't have the time to re-code this every couple of days something changes in main ... this PR is basically just an optional addition to the main branch - it doesn't conflict with your roadmap

Sorry, I don't follow. I don't understand how this is relevant to the main Parcels codebase. Is there anything preventing you from maintaining your own fork of Parcels, or creating a package ckehl_tooling that your students can install separately from Parcels? Surely you can create a Python package, tell your students to pip install git+https://github.com/username/repo.git@main, and work from a fixed version of Parcels if you're concerned about the private API changing?
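For illustration, such a package could pin an exact Parcels release in its packaging metadata (the package name comes from the suggestion above; the version number is hypothetical):

```python
# Hypothetical setup.py for a standalone "ckehl_tooling" package that pins
# an exact Parcels release, so changes on Parcels' main branch cannot
# break it. The version number below is illustrative only.
from setuptools import find_packages, setup

setup(
    name="ckehl-tooling",
    version="0.1.0",
    packages=find_packages(),
    install_requires=[
        "parcels==3.0.0",  # exact pin; replace with the release you tested against
    ],
)
```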

From what you mention, it seems that these changes mainly benefit your students and not the Parcels community as a whole. I'll close this. I am opposed to shipping code with Parcels that isn't used by the package itself and doesn't contribute to our roadmap.

Thanks for taking the time with your response :)

@CKehl (Contributor, Author) commented Sep 23, 2024

Your choice. If benchmarking is done properly, it actually has to be fairly deeply embedded into the main classes - your idea of a supplementary package doesn't work the way you propose. What this PR would provide is benchmarking in a fine-grained and reliable manner, without the slowdown issues of a profiler, and for up-to-date Parcels versions. Still, up to you - have fun.
