Merge pull request #1494 from OceanParcels/output_chunk_explanation_tutorial

Expanding explanation of output chunks parameter
erikvansebille committed Jan 12, 2024
2 parents 2d17966 + a6fe0a5 commit 6a14807
Showing 1 changed file with 16 additions and 1 deletion.
17 changes: 16 additions & 1 deletion docs/examples/tutorial_parcels_structure.ipynb
@@ -395,7 +395,22 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Note the `chunks` argument in the `pset.ParticleFile()` call above. It controls the 'chunking' of the output file, which determines how efficiently the output is written. See also [the advanced output in zarr format tutorial](https://docs.oceanparcels.org/en/latest/examples/documentation_advanced_zarr.html) for more information. It is worth optimising this parameter for your runs, as it can significantly speed up the writing of the output file and thus the runtime of `pset.execute()`."
"<div class=\"alert alert-info\">\n",
"\n",
"### A note on output chunking\n",
"\n",
"Note the `chunks` argument in the `pset.ParticleFile()` call above. It controls the 'chunking' of the output file, which determines how efficiently the output is written. The default chunking in Parcels is `(number of particles in initial particleset, 1)`.\n",
"This default may be inefficient if\n",
"1. you use `repeatdt` to release a relatively small number of particles _many_ times during the simulation, and/or\n",
"2. you expect to output _a lot of timesteps_ (e.g. more than 1000).\n",
"\n",
"In the first case, increase the first element of `chunks` to between 10 and 100 times the size of your initial particleset. In the second case, increase the second element of `chunks` to between 10 and 1000, depending somewhat on the size of your initial particleset.\n",
"\n",
"In either case, output is generally written much more efficiently when `chunks[0]*chunks[1]` is (much) greater than several thousand.\n",
"\n",
"See also [the advanced output in zarr format tutorial](https://docs.oceanparcels.org/en/latest/examples/documentation_advanced_zarr.html) for more information. The best values depend on the filesystem the data is written to, so it is worth optimising this parameter for your runs: it can significantly speed up the writing of the output file and thus the runtime of `pset.execute()`.\n",
"\n",
"</div>"
]
},
{
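The chunking guidance in the note above can be sketched as a small helper. This is only an illustration and not part of Parcels: the function names `suggest_chunks` and `chunk_elements`, their parameters, and the specific multipliers (10 times the initial particleset, 100 output steps per chunk) are assumptions picked from within the ranges the note gives. Suitable values for a real run will depend on the filesystem and should be tested.

```python
def suggest_chunks(n_initial, n_output_steps, uses_repeatdt=False,
                   particle_factor=10, time_chunk=100):
    """Hypothetical helper (not a Parcels function) that applies the
    rules of thumb from the note to pick a `chunks` tuple.

    particle_factor and time_chunk are illustrative assumptions; the
    note suggests 10 to 100 and 10 to 1000 respectively.
    """
    if n_initial < 1 or n_output_steps < 1:
        raise ValueError("n_initial and n_output_steps must be positive")

    # Default described in the note: (initial particleset size, 1).
    chunks = [n_initial, 1]

    if uses_repeatdt:
        # Case 1: few particles released many times, so enlarge the
        # first (particle) dimension of the chunk.
        chunks[0] = particle_factor * n_initial

    if n_output_steps > 1000:
        # Case 2: many output timesteps, so enlarge the second (time)
        # dimension of the chunk.
        chunks[1] = time_chunk

    return tuple(chunks)


def chunk_elements(chunks):
    """Number of values stored per chunk, i.e. chunks[0] * chunks[1]."""
    return chunks[0] * chunks[1]


# Example: 10 particles released repeatedly, 5000 output timesteps.
chunks = suggest_chunks(n_initial=10, n_output_steps=5000, uses_repeatdt=True)
print(chunks, chunk_elements(chunks))
```

The resulting tuple would then be passed as the `chunks` argument of `pset.ParticleFile()`. Checking `chunk_elements(chunks)` against the "several thousand" guideline is a quick sanity test before starting a long run.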
