Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Grid virtual casing #217

Open
wants to merge 31 commits into
base: master
Choose a base branch
from

Conversation

missing-user
Copy link
Contributor

@missing-user missing-user commented Dec 4, 2024

Addressing issue #32 with a faster virtual casing implementation.
I iteratively increase the resolution to get an estimate of how well the integral has converged, and could adaptively refine the grid until a given epsilon is reached. For now it just prints the error estimate and doesn't terminate if gBnstol isn't reached.

For the rotating ellipse test case at a given accuracy, it is approximately 10x faster than the current implementation (serial performance) and parallelizes trivially across both MPI ranks and OMP threads. Comparing to DCUHRE results at 1e-12 tolerance, the accuracy estimates of the new method seem accurate, and it converges exponentially to the correct result as expected.

Changes to Inputlist

  • New option Lvcgrid was added to choose which virtual casing routine is being used.
  • vcNt, vcNz the resolution parameters for the new integration routine

Possible improvements

  • Interpolate the precomputed surface currents when performing the integration, to have an "even finer grid" or possibly use an adaptive integration routine ontop of the interpolated values as @lazersos suggested.
  • Possibly reuse the computed surface currents across multiple vc iterations, and only recompute the outer integral with the $1/(r^2)$ term.

Copy link
Collaborator

@jonathanschilling jonathanschilling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks quite good to me, thanks for implementing this! Looking forward to seeing the CI results and replies to a few minor comments.

src/casing.f90 Outdated Show resolved Hide resolved
src/casing.f90 Show resolved Hide resolved
src/casing.f90 Outdated Show resolved Hide resolved
src/casing.f90 Show resolved Hide resolved
src/global.f90 Outdated Show resolved Hide resolved
src/sphdf5.f90 Outdated Show resolved Hide resolved
@jonathanschilling
Copy link
Collaborator

@smiet Could you please also have a look at this?

Copy link
Collaborator

@jonathanschilling jonathanschilling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the latest updates. As far as I can judge without looking at actual runs, the changes look good to me.

@missing-user
Copy link
Contributor Author

missing-user commented Dec 12, 2024

Added further performance improvements (MPI Parallelization in addition to OpenMP).

Checked correctness of datasharing using MPI_Allreduce , the communication of the gBn terms using RlBroadcast is identical to the other casing() routine and wasn't investigated in detail. Since the resutls don't change with MPI I'm quite confident in the correctness.

Logging on 2 Ranks before and after allreduce

 Entering precompute loop            1
 rank            1 : Pbxyz =    0.0000000000000000        5.0083738809939513        0.0000000000000000        5.0083108465022459     
 rank            0 : Pbxyz =    5.0083817603806757        0.0000000000000000        5.0083502429341253        0.0000000000000000     
 rank (after reduction)            0 : Pbxyz =    5.0083817603806757        5.0083738809939513        5.0083502429341253        5.0083108465022459     
 rank (after reduction)            1 : Pbxyz =    5.0083817603806757        5.0083738809939513        5.0083502429341253        5.0083108465022459     
 rank            0 : kk =            0 , jk =            1

@missing-user
Copy link
Contributor Author

Both VC implementations exploit stellerator symmetry now to save 1/2 evaluations

@missing-user
Copy link
Contributor Author

missing-user commented Dec 16, 2024

I added an example config for the axysymmetric test case to the CI @jonathanschilling , it runs in about half the time and reaches the same result as the reference implementation (within tolerance).
The expected speedup is higher the more fourier modes are used, and also with more OpenMP threads.


!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!-!
! vcintegrand( 4) = - three * vcintegrand(1) * rr(1) / distance(2) ! dBxdx; ! need to divide by distance(3) ; 14 Apr 17;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: indentation

@jons-pf
Copy link
Collaborator

jons-pf commented Dec 17, 2024

I think, as far as I can tell without much more effort, that this is ready for merging.

@jloizu Could you maybe also have a look and potentially try out this new feature on some of the free-boundary SPEC verification cases?

@missing-user
Copy link
Contributor Author

missing-user commented Dec 17, 2024

I think, as far as I can tell without much more effort, that this is ready for merging.

@jloizu Could you maybe also have a look and potentially try out this new feature on some of the free-boundary SPEC verification cases?

Trying out the Feature should only require adding
Lvcgrid=1 to numericslist. If tolerance isnt reached, increase the resolution in globallist
vcNt = 512
vcNz = 512
should be sufficient in Most cases

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants