Particle memory requirements #13849

Open
ericvmueller opened this issue Dec 5, 2024 · 6 comments

@ericvmueller

This case tests memory requirements with a large mesh (100^3). With just the mesh, I see about 1 GB of RAM usage. When I then add a large number of particles (200,000), the RAM requirement jumps to 2.6 GB. The thing I can't reconcile is that the .out file shows each particle should require 924 bytes, or 143 MB for all particles - a very small fraction of the 1.6 GB increase I see.

Printing with calls to SYSTEM_MEM_USAGE after each particle is added in part.f90 shows closer to 7.44 KB added per particle. So the bulk of the RAM increase is definitely coming from the ALLOCATE_STORAGE subroutine, but I have yet to track down which component of it is adding more memory than the .out file suggests is needed.

part_storage_1LPC.txt

@drjfloyd commented Dec 5, 2024

Is the 100^3 mesh a factor, or is the 7.44 kB seen even for, say, a 10^3 mesh?

@ericvmueller commented Dec 5, 2024

Good question, I can check that.
I know the overall memory can go up with big meshes and many LP classes because of the AVG_DROP_DEN etc. arrays, but of course those are allocated elsewhere and their contribution is comparatively small.

@mcgratta commented Dec 5, 2024

This thread sheds some light. It seems that the allocatable components of derived types like BOUNDARY_ONE_D and BOUNDARY_PROP1 require a considerable amount of memory. For example, the array BOUNDARY_ONE_D(OD_INDEX)%K_S contains NWP real numbers, where NWP is the number of wall points. I assumed there was an additional 8 bytes needed for a pointer to where in memory this array lives. But there are, some speculate, about 64 bytes devoted to storing other information about this array, like its upper and lower bounds, and so on. That is quite a tax.

@ericvmueller

Okay, this seems like the cost of doing business, but it is very noticeable for the type of particle I have, where the arrays are all small. I'll leave this open for now in case inspiration strikes (because it can be pretty taxing for big vegetation cases).

FYI @drjfloyd, yup the ~7.5 kB/particle was the same on a 10^3 case with the same number of particles.

@marcosvanella

main.f90.txt

I took the system memory function in FDS and coded this snippet, in which a 1D derived-type array ALL_ARRAY of allocatable real arrays is defined, along with an index derived type I_ARRAY plus a single allocatable real storage array B:

! Array struct of real allocatables:
TYPE :: ALLOC_ARRAY
    REAL(EB), ALLOCATABLE :: A(:)
END TYPE ALLOC_ARRAY
TYPE(ALLOC_ARRAY), ALLOCATABLE, DIMENSION(:), TARGET :: ALL_ARRAY

! Array of indices that point to data in real array B:
TYPE :: IND_ARRAY
    INTEGER :: IND(2)=0
END TYPE IND_ARRAY
TYPE(IND_ARRAY), ALLOCATABLE, DIMENSION(:), TARGET :: I_ARRAY
REAL(EB), ALLOCATABLE :: B(:)

To store the same amount of data, with the same access to it, we can allocate:

! Allocate array of allocatable arrays, allocate entries to size M:
ALLOCATE(ALL_ARRAY(N))
DO J=1,N
   ALLOCATE(ALL_ARRAY(J)%A(M)); ALL_ARRAY(J)%A(1:M)=REAL(J,EB)
ENDDO
! Allocate array of low-high bound indices, and storage real array B:
ALLOCATE(I_ARRAY(N))
DO J=1,N
   I_ARRAY(J)%IND(1:2)=J
ENDDO
ALLOCATE(B(M*N)); B(:) = 0._EB

Note that N=1000000 is the size of the derived-type arrays and M is the number of reals per entry. So M defines how granular ALL_ARRAY is, i.e., how many entries are in a given ALL_ARRAY(J)%A(:) allocatable. I compiled the code with gfortran -O0 main.f90 and got the following cost for the allocation of ALL_ARRAY (1st allocation) and I_ARRAY + B (2nd allocation) as a function of M:

- M = 1 (as granular as it gets):

Memory usage before allocation:    2.0480000000000000       MB
DELTA Memory usage after 1st allocation:    94.432000000000002       MB          ! About 6 times mem usage
DELTA Memory usage after 2nd allocation:    15.316000000000001       MB

- M = 10:

Memory usage before allocation:    2.0480000000000000       MB
DELTA Memory usage after 1st allocation:    156.89200000000000       MB         ! Twice mem usage
DELTA Memory usage after 2nd allocation:    85.664000000000001       MB

- M = 100:

Memory usage before allocation:    2.0480000000000000       MB
DELTA Memory usage after 1st allocation:    859.86000000000001       MB        ! 10% higher mem usage
DELTA Memory usage after 2nd allocation:    788.94000000000005       MB

I see a fixed cost of ~70+ MB from the allocation of the internal allocatables, in line with what Kevin is stating (about 64 bytes of metadata per ALL_ARRAY(J)%A(:) allocatable). If I just do ALLOCATE(ALL_ARRAY(N)) I get a memory cost of 63.2 MB, i.e., 63.2 bytes per entry. This cost applies to each allocatable we use in the type.
I guess this points to being careful with arrays of allocatables when the size of the allocatable entries is small. The next question is whether the system is also moving this data around while running, and how that affects performance.

@mcgratta commented Dec 6, 2024

Randy had an interesting idea on this. I will look at the feasibility of making a derived data type for thermally thin particles that would have many of the allocatable arrays hard-wired. This is similar to how droplets are handled, but we'd still need to be able to go through the thermally-thick solid phase routine to get the pyrolysis. I'll see.
