Update default value of particlefile chunks to length of pset #1788

Merged: 2 commits, Dec 5, 2024

erikvansebille
Member

This fixes a bug where zarr output file writing can be very slow when only one particle is released on the first day of the simulation. The default chunks value was set to the number of particles written at the first output, instead of the length of the ParticleSet.

Note that the new code also aligns with the information box on output chunking at https://docs.oceanparcels.org/en/latest/examples/tutorial_parcels_structure.html#4.-Execution-and-output
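
To illustrate why the chunk size matters, here is a minimal sketch that counts how many chunks are needed to tile a 2D output array. It assumes a (trajectory, obs) layout and made-up run dimensions; `count_chunks` is an illustrative helper, not part of Parcels.

```python
import math


def count_chunks(shape, chunks):
    """Number of chunks needed to tile an array of `shape` with blocks of size `chunks`."""
    return math.prod(math.ceil(n / c) for n, c in zip(shape, chunks))


# Hypothetical run: 10_000 trajectories, 365 outputs each.
shape = (10_000, 365)

# Chunk size taken from a single particle written at the first output.
chunks_from_first_write = count_chunks(shape, (1, 1))

# Chunk size taken from the full length of the ParticleSet.
chunks_from_pset_length = count_chunks(shape, (10_000, 1))

print(chunks_from_first_write, chunks_from_pset_length)
```

With these assumed numbers, the first case needs several orders of magnitude more chunks than the second, which is consistent with the slowdown described above.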

@erikvansebille erikvansebille marked this pull request as ready for review December 5, 2024 13:32
Contributor

@VeckoTheGecko VeckoTheGecko left a comment

Looks good. Only comment is that if someone uses repeatdt this fix isn't 100% going to work (from what I can tell) since a new pset is created and then combined with the first.

Parcels/parcels/particleset.py

Lines 1216 to 1232 in 7542e4c

if abs(time - next_prelease) < tol:
    pset_new = self.__class__(
        fieldset=self.fieldset,
        time=time,
        lon=self._repeatlon,
        lat=self._repeatlat,
        depth=self._repeatdepth,
        pclass=self._repeatpclass,
        lonlatdepth_dtype=self.particledata.lonlatdepth_dtype,
        partition_function=False,
        pid_orig=self._repeatpid,
        **self._repeatkwargs,
    )
    for p in pset_new:
        p.dt = dt
    self.add(pset_new)
    next_prelease += self.repeatdt * np.sign(dt)

@erikvansebille
Member Author

> Looks good. Only comment is that if someone uses repeatdt this fix isn't 100% going to work (from what I can tell) since a new pset is created and then combined with the first.

Yes, agreed, but that's exactly what the note in https://docs.oceanparcels.org/en/latest/examples/tutorial_parcels_structure.html#4.-Execution-and-output is about; it even explicitly mentions this. Users will just have to change chunks themselves (which is a good strategy anyway to improve Parcels performance).
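
As a rough sketch of what changing chunks yourself could look like with repeated releases, the helper below estimates a trajectory chunk size from the total number of particles a run will eventually hold. `suggest_chunks` and its numbers are hypothetical, not part of Parcels. It assumes a (trajectory, obs) chunk layout and counts a release at both the start and the end of the run, so it may overestimate by one release.

```python
import math
from datetime import timedelta


def suggest_chunks(n_per_release, runtime, repeatdt, obs_chunk=1):
    """Heuristic (trajectory, obs) chunks sized to all particles released during the run."""
    # Dividing one timedelta by another gives a float number of intervals.
    n_releases = math.floor(runtime / repeatdt) + 1
    return (n_per_release * n_releases, obs_chunk)


# Hypothetical setup: 100 particles released every 5 days over a 30-day run.
chunks = suggest_chunks(100, timedelta(days=30), timedelta(days=5))
print(chunks)

# Assumed usage, relying on the ParticleFile `chunks` argument this PR is about:
# output_file = pset.ParticleFile(name="out.zarr", outputdt=timedelta(hours=6), chunks=chunks)
```

The resulting tuple is only a starting point. The best values depend on how the output is read back afterwards.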

@erikvansebille erikvansebille merged commit 5ce1347 into master Dec 5, 2024
16 checks passed
@erikvansebille erikvansebille deleted the update_pfile_chunks branch December 5, 2024 15:42
@VeckoTheGecko
Contributor

> it even explicitly mentions this

ah gotcha, stopped reading 1 sentence too early 😅
