diff --git a/CHANGELOG.md b/CHANGELOG.md index 224d80839858..07fa749a5531 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -4,6 +4,7 @@ ### Added (new features/APIs/variables/...) - [[PR 998]](https://github.com/parthenon-hpc-lab/parthenon/pull/998) tensor indices added to sparse pack +- [[PR 999]](https://github.com/parthenon-hpc-lab/parthenon/pull/999) Add a post-initialization hook - [[PR 987]](https://github.com/parthenon-hpc-lab/parthenon/pull/987) New tasking infrastructure and capabilities - [[PR 969]](https://github.com/parthenon-hpc-lab/parthenon/pull/969) New macro-based auto-naming of profiling regions and kernels - [[PR 981]](https://github.com/parthenon-hpc-lab/parthenon/pull/981) Add IndexSplit @@ -20,6 +21,7 @@ - [[PR978]](https://github.com/parthenon-hpc-lab/parthenon/pull/978) remove erroneous sparse check ### Infrastructure (changes irrelevant to downstream codes) +- [[PR 990]](https://github.com/parthenon-hpc-lab/parthenon/pull/990) Partial refactor of HDF5 I/O code for readability/extendability - [[PR 982]](https://github.com/parthenon-hpc-lab/parthenon/pull/982) add some gut check testing for parthenon-VIBE ### Removed (removing behavior/API/varaibles/...) diff --git a/doc/sphinx/src/README.rst b/doc/sphinx/src/README.rst index 521cabf3bc03..96f7258bf33a 100644 --- a/doc/sphinx/src/README.rst +++ b/doc/sphinx/src/README.rst @@ -178,6 +178,8 @@ Mesh ^^^^ - ``InitUserMeshData`` +- ``ProblemGenerator`` +- ``PostInitialization`` - ``PreStepUserWorkInLoop`` - ``PostStepUserWorkInLoop`` - ``UserWorkAfterLoop`` @@ -188,6 +190,7 @@ MeshBlock - ``InitApplicationMeshBlockData`` - ``InitMeshBlockUserData`` - ``ProblemGenerator`` +- ``PostInitialization`` - ``UserWorkBeforeOutput`` To redefine these functions, the user sets the respective function @@ -195,12 +198,12 @@ pointers in the ApplicationInput member app_input of the ParthenonManager class prior to calling ``ParthenonInit``. This is demonstrated in the ``main()`` functions in the examples. 
-Note that the ``ProblemGenerator``\ s of ``Mesh`` and ``MeshBlock`` are -mutually exclusive. Moreover, the ``Mesh`` one requires -``parthenon/mesh/pack_size=-1`` during initialization, i.e., all blocks -on a rank need to be in a single pack. This allows to use MPI reductions -inside the function, for example, to globally normalize quantities. The -``parthenon/mesh/pack_size=-1`` exists only during problem +Note that the ``ProblemGenerator``\ s (and ``PostInitialization``\ s) of +``Mesh`` and ``MeshBlock`` are mutually exclusive. Moreover, the ``Mesh`` +ones require ``parthenon/mesh/pack_size=-1`` during initialization, i.e., +all blocks on a rank need to be in a single pack. This allows the use of MPI +reductions inside the function, for example, to globally normalize quantities. +The ``parthenon/mesh/pack_size=-1`` exists only during problem inititalization, i.e., simulations can be restarted with an arbitrary ``pack_size``. For an example of the ``Mesh`` version, see the `Poisson example `__. diff --git a/doc/sphinx/src/interface/state.rst b/doc/sphinx/src/interface/state.rst index e0760d4b8ff1..6a4cfaf840e1 100644 --- a/doc/sphinx/src/interface/state.rst +++ b/doc/sphinx/src/interface/state.rst @@ -113,12 +113,19 @@ several useful features and functions. if set (defaults to ``nullptr`` an therefore a no-op) to print diagnostics after the time-integration advance - ``void UserWorkBeforeLoopMesh(Mesh *, ParameterInput *pin, SimTime - &tm)`` performs a per-package, mesh-wide calculation after the mesh - has been generated, and problem generators called, but before any - time evolution. This work is done both on first initialization and - on restart. If you would like to avoid doing the work upon restart, - you can check for the const ``is_restart`` member field of the ``Mesh`` - object. 
+ &tm)`` performs a per-package, mesh-wide calculation after (1) the mesh + has been generated, (2) the problem generators have been called, and (3) + communications have been executed, but before any time evolution. This work + is done both on first initialization and on restart. If you would like to avoid + doing the work upon restart, you can check the const ``is_restart`` member + field of the ``Mesh`` object. It is worth making a clear distinction + between ``UserWorkBeforeLoopMesh`` and ``ApplicationInput``\ 's + ``PostInitialization``. ``PostInitialization`` is tied strictly to + initialization and will not be called upon restarts. ``PostInitialization`` + is also carefully positioned after ``ProblemGenerator`` and before + ``PreCommFillDerived`` (and hence communications). In practice, when + additional granularity is required between initialization and communication, + ``PostInitialization`` may be the desired hook. The reasoning for providing ``FillDerived*`` and ``EstimateTimestep*`` function pointers appropriate for usage with both ``MeshData`` and diff --git a/doc/sphinx/src/outputs.rst b/doc/sphinx/src/outputs.rst index c01c957fdcf5..170c83ebac7e 100644 --- a/doc/sphinx/src/outputs.rst +++ b/doc/sphinx/src/outputs.rst @@ -46,6 +46,7 @@ look like file_type = hdf5 + write_xdmf = true # Determines whether xdmf annotations are output # nonexistent variables/swarms are ignored variables = density, velocity, & # comments are still ok energy # notice the & continuation character diff --git a/doc/sphinx/src/parthenon_manager.rst b/doc/sphinx/src/parthenon_manager.rst index 15211e8bccdb..e0bac60476d3 100644 --- a/doc/sphinx/src/parthenon_manager.rst +++ b/doc/sphinx/src/parthenon_manager.rst @@ -31,12 +31,16 @@ runtimes. The function Calls the ``Initialize(ParameterInput *pin)`` function of all packages to be utilized and creates the grid hierarchy, including the ``Mesh`` 
+and ``MeshBlock`` objects, and calls the ``ProblemGenerator`` (and +``PostInitialization``) routines. The reason these functions are split out is to enable decisions to be made by the application between reading the input deck and setting up -the grid. For example, a common use-case is: +the grid. For example, during problem initialization, ``ProblemGenerator`` +may serve as the user-facing API for describing initial conditions, +whereas ``PostInitialization`` could use those user-specified fields +to sync *all* fields prior to entering communication routines. A common +use-case is: .. code:: cpp @@ -53,13 +57,14 @@ the grid. For example, a common use-case is: if (manager_status == ParthenonStatus::error) { pman.ParthenonFinalize(); return 1; - } + } // Redefine parthenon defaults pman.app_input->ProcessPackages = MyProcessPackages; std::string prob = pman.pin->GetString("app", "problem"); if (prob == "problem1") { pman.app_input->ProblemGenerator = Problem1Generator; + pman.app_input->PostInitialization = Problem1PostInitialization; } else { pman.app_input->ProblemGenerator = Problem2Generator; } diff --git a/src/application_input.hpp b/src/application_input.hpp index a9ec96c1551e..57e1c1962752 100644 --- a/src/application_input.hpp +++ b/src/application_input.hpp @@ -1,5 +1,5 @@ //======================================================================================== -// (C) (or copyright) 2020-2022. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2024. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. 
Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -36,6 +36,8 @@ struct ApplicationInput { std::function<void(Mesh *, ParameterInput *)> InitUserMeshData = nullptr; std::function<void(Mesh *, ParameterInput *, MeshData<Real> *)> MeshProblemGenerator = nullptr; + std::function<void(Mesh *, ParameterInput *, MeshData<Real> *)> MeshPostInitialization = + nullptr; std::function<void(Mesh *, ParameterInput *, SimTime &)> PreStepMeshUserWorkInLoop = nullptr; @@ -57,6 +59,7 @@ struct ApplicationInput { InitApplicationMeshBlockData = nullptr; std::function<void(MeshBlock *, ParameterInput *)> InitMeshBlockUserData = nullptr; std::function<void(MeshBlock *, ParameterInput *)> ProblemGenerator = nullptr; + std::function<void(MeshBlock *, ParameterInput *)> PostInitialization = nullptr; std::function<void(MeshBlock *, ParameterInput *, SimTime const &)> MeshBlockUserWorkBeforeOutput = nullptr; }; diff --git a/src/basic_types.hpp b/src/basic_types.hpp index 4f827b1c05e0..b36383aa2827 100644 --- a/src/basic_types.hpp +++ b/src/basic_types.hpp @@ -1,5 +1,5 @@ //======================================================================================== -// (C) (or copyright) 2021-2023. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2021-2024. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -49,7 +49,7 @@ enum class TaskStatus { complete, incomplete, iterate }; enum class AmrTag : int { derefine = -1, same = 0, refine = 1 }; enum class RefinementOp_t { Prolongation, Restriction, None }; - +enum class CellLevel : int { coarse = -1, same = 0, fine = 1 }; // JMM: Not clear this is the best place for this but it minimizes // circular dependency nonsense. 
constexpr int NUM_BNDRY_TYPES = 10; @@ -152,16 +152,17 @@ inline std::vector GetTopologicalElements(TopologicalType tt if (tt == TT::Face) return {TE::F1, TE::F2, TE::F3}; return {TE::CC}; } + using TE = TopologicalElement; // Returns one if the I coordinate of el is offset from the zone center coordinates, // and zero otherwise -KOKKOS_INLINE_FUNCTION int TopologicalOffsetI(TE el) noexcept { +inline constexpr int TopologicalOffsetI(TE el) { return (el == TE::F1 || el == TE::E2 || el == TE::E3 || el == TE::NN); } -KOKKOS_INLINE_FUNCTION int TopologicalOffsetJ(TE el) noexcept { +inline constexpr int TopologicalOffsetJ(TE el) { return (el == TE::F2 || el == TE::E3 || el == TE::E1 || el == TE::NN); } -KOKKOS_INLINE_FUNCTION int TopologicalOffsetK(TE el) noexcept { +inline constexpr int TopologicalOffsetK(TE el) { return (el == TE::F3 || el == TE::E2 || el == TE::E1 || el == TE::NN); } diff --git a/src/coordinates/uniform_cartesian.hpp b/src/coordinates/uniform_cartesian.hpp index b10cbe0ad06c..51b06467fae3 100644 --- a/src/coordinates/uniform_cartesian.hpp +++ b/src/coordinates/uniform_cartesian.hpp @@ -1,5 +1,5 @@ //======================================================================================== -// (C) (or copyright) 2023. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2023-2024. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. 
Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -164,18 +164,11 @@ class UniformCartesian { template KOKKOS_FORCEINLINE_FUNCTION Real X(const int idx) const { - using TE = TopologicalElement; - [[maybe_unused]] bool constexpr X1EDGE = - el == TE::F1 || el == TE::E2 || el == TE::E3 || el == TE::NN; - [[maybe_unused]] bool constexpr X2EDGE = - el == TE::F2 || el == TE::E3 || el == TE::E1 || el == TE::NN; - [[maybe_unused]] bool constexpr X3EDGE = - el == TE::F3 || el == TE::E1 || el == TE::E2 || el == TE::NN; - if constexpr (dir == X1DIR && X1EDGE) { + if constexpr (dir == X1DIR && TopologicalOffsetI(el)) { return xmin_[dir - 1] + idx * dx_[dir - 1]; // idx - 1/2 - } else if constexpr (dir == X2DIR && X2EDGE) { + } else if constexpr (dir == X2DIR && TopologicalOffsetJ(el)) { return xmin_[dir - 1] + idx * dx_[dir - 1]; // idx - 1/2 - } else if constexpr (dir == X3DIR && X3EDGE) { + } else if constexpr (dir == X3DIR && TopologicalOffsetK(el)) { return xmin_[dir - 1] + idx * dx_[dir - 1]; // idx - 1/2 } else { return xmin_[dir - 1] + (idx + 0.5) * dx_[dir - 1]; // idx diff --git a/src/mesh/domain.hpp b/src/mesh/domain.hpp index 39fe63aea704..49ec1a37af67 100644 --- a/src/mesh/domain.hpp +++ b/src/mesh/domain.hpp @@ -3,7 +3,7 @@ // Copyright(C) 2014 James M. Stone and other code contributors // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== -// (C) (or copyright) 2020. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2024. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. 
Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -23,6 +23,7 @@ #include #include +#include "basic_types.hpp" #include "defs.hpp" namespace parthenon { diff --git a/src/mesh/mesh.cpp b/src/mesh/mesh.cpp index 7115a0cde719..85fce0f8322e 100644 --- a/src/mesh/mesh.cpp +++ b/src/mesh/mesh.cpp @@ -3,7 +3,7 @@ // Copyright(C) 2014 James M. Stone and other code contributors // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== -// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2024. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -156,6 +156,9 @@ Mesh::Mesh(ParameterInput *pin, ApplicationInput *app_in, Packages_t &packages, if (app_in->MeshProblemGenerator != nullptr) { ProblemGenerator = app_in->MeshProblemGenerator; } + if (app_in->MeshPostInitialization != nullptr) { + PostInitialization = app_in->MeshPostInitialization; + } if (app_in->PreStepMeshUserWorkInLoop != nullptr) { PreStepUserWorkInLoop = app_in->PreStepMeshUserWorkInLoop; } @@ -930,6 +933,10 @@ void Mesh::Initialize(bool init_problem, ParameterInput *pin, ApplicationInput * PARTHENON_REQUIRE_THROWS( !(ProblemGenerator != nullptr && block_list[0]->ProblemGenerator != nullptr), "Mesh and MeshBlock ProblemGenerators are defined. Please use only one."); + PARTHENON_REQUIRE_THROWS( + !(PostInitialization != nullptr && + block_list[0]->PostInitialization != nullptr), + "Mesh and MeshBlock PostInitializations are defined. 
Please use only one."); // Call Mesh ProblemGenerator if (ProblemGenerator != nullptr) { @@ -946,6 +953,23 @@ void Mesh::Initialize(bool init_problem, ParameterInput *pin, ApplicationInput * pmb->ProblemGenerator(pmb.get(), pin); } } + + // Call Mesh PostInitialization + if (PostInitialization != nullptr) { + PARTHENON_REQUIRE(num_partitions == 1, + "Mesh PostInitialization requires parthenon/mesh/pack_size=-1 " + "during first initialization."); + + auto &md = mesh_data.GetOrAdd("base", 0); + PostInitialization(this, pin, md.get()); + // Call individual MeshBlock PostInitialization + } else { + for (int i = 0; i < nmb; ++i) { + auto &pmb = block_list[i]; + pmb->PostInitialization(pmb.get(), pin); + } + } + std::for_each(block_list.begin(), block_list.end(), [](auto &sp_block) { sp_block->SetAllVariablesToInitialized(); }); } @@ -1202,6 +1226,22 @@ int Mesh::GetNumberOfMeshBlockCells() const { } const RegionSize &Mesh::GetBlockSize() const { return base_block_size; } +const IndexShape &Mesh::GetLeafBlockCellBounds(CellLevel level) const { + // TODO(JMM): Luke this is for your Metadata::fine stuff. + PARTHENON_DEBUG_REQUIRE(level != CellLevel::fine, + "Currently no access to finer cellbounds"); + MeshBlock *pmb = block_list[0].get(); + if (level == CellLevel::same) { + return pmb->cellbounds; + // TODO(JMM): + // } else if (level == CellLevel::fine) { + // return pmb->fine_cellbounds; + // } + } else { // if (level == CellLevel::coarse) { + return pmb->c_cellbounds; + } +} + // Functionality re-used in mesh constructor void Mesh::RegisterLoadBalancing_(ParameterInput *pin) { #ifdef MPI_PARALLEL // JMM: Not sure this ifdef is needed diff --git a/src/mesh/mesh.hpp b/src/mesh/mesh.hpp index bf21853d3a82..1693f019e4f9 100644 --- a/src/mesh/mesh.hpp +++ b/src/mesh/mesh.hpp @@ -3,7 +3,7 @@ // Copyright(C) 2014 James M. 
Stone and other code contributors // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== -// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2024. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -98,6 +98,7 @@ class Mesh { int GetNumberOfMeshBlockCells() const; const RegionSize &GetBlockSize() const; RegionSize GetBlockSize(const LogicalLocation &loc) const; + const IndexShape &GetLeafBlockCellBounds(CellLevel level = CellLevel::same) const; // data bool modified; @@ -160,6 +161,8 @@ // defined in either the prob file or default_pgen.cpp in ../pgen/ std::function<void(Mesh *, ParameterInput *, MeshData<Real> *)> ProblemGenerator = nullptr; + std::function<void(Mesh *, ParameterInput *, MeshData<Real> *)> PostInitialization = + nullptr; static void UserWorkAfterLoopDefault(Mesh *mesh, ParameterInput *pin, SimTime &tm); // called in main loop std::function<void(Mesh *, ParameterInput *, SimTime &)> UserWorkAfterLoop = @@ -194,6 +197,20 @@ std::vector<int> GetNbList() const noexcept { return nblist; } std::vector<LogicalLocation> GetLocList() const noexcept { return loclist; } + // TODO(JMM): Put in implementation file? 
+ auto GetLevelsAndLogicalLocationsFlat() const noexcept { + std::vector<std::int64_t> levels, logicalLocations; + levels.reserve(nbtotal); + logicalLocations.reserve(nbtotal * 3); + for (const auto &loc : loclist) { + levels.push_back(loc.level() - GetRootLevel()); + logicalLocations.push_back(loc.lx1()); + logicalLocations.push_back(loc.lx2()); + logicalLocations.push_back(loc.lx3()); + } + return std::make_pair(levels, logicalLocations); + } + void OutputMeshStructure(const int dim, const bool dump_mesh_structure = true); // Ordering here is important to prevent deallocation of pools before boundary diff --git a/src/mesh/meshblock.cpp b/src/mesh/meshblock.cpp index e2d7af7c6ac0..409f29b1cd57 100644 --- a/src/mesh/meshblock.cpp +++ b/src/mesh/meshblock.cpp @@ -3,7 +3,7 @@ // Copyright(C) 2014 James M. Stone and other code contributors // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== -// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2024. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. 
Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -115,6 +115,12 @@ void MeshBlock::Initialize(int igid, int ilid, LogicalLocation iloc, } else if (app_in->MeshProblemGenerator == nullptr) { ProblemGenerator = &ProblemGeneratorDefault; } + if (app_in->PostInitialization != nullptr) { + PostInitialization = app_in->PostInitialization; + // Only set default post-init when no mesh post-init is set + } else if (app_in->MeshPostInitialization == nullptr) { + PostInitialization = &PostInitializationDefault; + } if (app_in->MeshBlockUserWorkBeforeOutput != nullptr) { UserWorkBeforeOutput = app_in->MeshBlockUserWorkBeforeOutput; } diff --git a/src/mesh/meshblock.hpp b/src/mesh/meshblock.hpp index 3df86d7f73e7..a7b16b9955ac 100644 --- a/src/mesh/meshblock.hpp +++ b/src/mesh/meshblock.hpp @@ -3,7 +3,7 @@ // Copyright(C) 2014 James M. Stone and other code contributors // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== -// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2024. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. 
Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -440,7 +440,9 @@ class MeshBlock : public std::enable_shared_from_this<MeshBlock> { // defined in either the prob file or default_pgen.cpp in ../pgen/ static void ProblemGeneratorDefault(MeshBlock *pmb, ParameterInput *pin); + static void PostInitializationDefault(MeshBlock *pmb, ParameterInput *pin); std::function<void(MeshBlock *, ParameterInput *)> ProblemGenerator = nullptr; + std::function<void(MeshBlock *, ParameterInput *)> PostInitialization = nullptr; static pMeshBlockApplicationData_t InitApplicationMeshBlockDataDefault(MeshBlock *, ParameterInput *pin); std::function diff --git a/src/outputs/output_utils.cpp b/src/outputs/output_utils.cpp index c6d05451ea29..64c099d85617 100644 --- a/src/outputs/output_utils.cpp +++ b/src/outputs/output_utils.cpp @@ -3,7 +3,7 @@ // Copyright(C) 2023 The Parthenon collaboration // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== -// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2024. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. 
Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -100,6 +100,71 @@ AllSwarmInfo::AllSwarmInfo(BlockList_t &block_list, } } +// Tools that can be shared across Output types + +std::vector<Real> ComputeXminBlocks(Mesh *pm) { + return FlattenBlockInfo<Real>(pm, pm->ndim, + [=](MeshBlock *pmb, std::vector<Real> &data, int &i) { + auto xmin = pmb->coords.GetXmin(); + data[i++] = xmin[0]; + if (pm->ndim > 1) { + data[i++] = xmin[1]; + } + if (pm->ndim > 2) { + data[i++] = xmin[2]; + } + }); +} + +std::vector<std::int64_t> ComputeLocs(Mesh *pm) { + return FlattenBlockInfo<std::int64_t>( + pm, 3, [=](MeshBlock *pmb, std::vector<std::int64_t> &locs, int &i) { + locs[i++] = pmb->loc.lx1(); + locs[i++] = pmb->loc.lx2(); + locs[i++] = pmb->loc.lx3(); + }); +} + +std::vector<int> ComputeIDsAndFlags(Mesh *pm) { + return FlattenBlockInfo<int>(pm, 5, + [=](MeshBlock *pmb, std::vector<int> &data, int &i) { + data[i++] = pmb->loc.level(); + data[i++] = pmb->gid; + data[i++] = pmb->lid; + data[i++] = pmb->cnghost; + data[i++] = pmb->gflag; + }); +} + +// TODO(JMM): I could make this use the other loop +// functionality/high-order functions, but it was more code than this +// for, I think, little benefit. +void ComputeCoords(Mesh *pm, bool face, const IndexRange &ib, const IndexRange &jb, + const IndexRange &kb, std::vector<Real> &x, std::vector<Real> &y, + std::vector<Real> &z) { + const int nx1 = ib.e - ib.s + 1; + const int nx2 = jb.e - jb.s + 1; + const int nx3 = kb.e - kb.s + 1; + const int num_blocks = pm->block_list.size(); + x.resize((nx1 + face) * num_blocks); + y.resize((nx2 + face) * num_blocks); + z.resize((nx3 + face) * num_blocks); + std::size_t idx_x = 0, idx_y = 0, idx_z = 0; + + // note relies on casting of bool to int + for (auto &pmb : pm->block_list) { + for (int i = ib.s; i <= ib.e + face; ++i) { + x[idx_x++] = face ? pmb->coords.Xf<1>(i) : pmb->coords.Xc<1>(i); + } + for (int j = jb.s; j <= jb.e + face; ++j) { + y[idx_y++] = face ? 
pmb->coords.Xf<2>(j) : pmb->coords.Xc<2>(j); + } + for (int k = kb.s; k <= kb.e + face; ++k) { + z[idx_z++] = face ? pmb->coords.Xf<3>(k) : pmb->coords.Xc<3>(k); + } + } +} + // TODO(JMM): may need to generalize this std::size_t MPIPrefixSum(std::size_t local, std::size_t &tot_count) { std::size_t out = 0; diff --git a/src/outputs/output_utils.hpp b/src/outputs/output_utils.hpp index c42a235d6a85..db6353090cbf 100644 --- a/src/outputs/output_utils.hpp +++ b/src/outputs/output_utils.hpp @@ -3,7 +3,7 @@ // Copyright(C) 2023 The Parthenon collaboration // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== -// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2024. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -29,8 +29,10 @@ #include // Parthenon +#include "basic_types.hpp" #include "interface/metadata.hpp" #include "interface/variable.hpp" +#include "mesh/domain.hpp" #include "mesh/mesh.hpp" #include "mesh/meshblock.hpp" #include "utils/error_checking.hpp" @@ -53,6 +55,7 @@ struct VarInfo { bool is_sparse; bool is_vector; std::vector component_labels; + int Size() const { return nx6 * nx5 * nx4 * nx3 * nx2 * nx1; } VarInfo() = delete; @@ -200,6 +203,59 @@ struct AllSwarmInfo { bool is_restart); }; +template +std::vector FlattenBlockInfo(Mesh *pm, int shape, Function_t f) { + const int num_blocks_local = static_cast(pm->block_list.size()); + std::vector data(shape * num_blocks_local); + int i = 0; + for (auto &pmb : pm->block_list) { + f(pmb.get(), data, i); + } + return data; +} + +// mirror must be provided because copying done externally +template +void PackOrUnpackVar(MeshBlock *pmb, Variable *pvar, bool do_ghosts, 
idx_t &idx, + std::vector &data, Function_t f) { + const auto &Nt = pvar->GetDim(6); + const auto &Nu = pvar->GetDim(5); + const auto &Nv = pvar->GetDim(4); + const IndexDomain domain = (do_ghosts ? IndexDomain::entire : IndexDomain::interior); + IndexRange kb, jb, ib; + if (pvar->metadata().Where() == MetadataFlag(Metadata::Cell)) { + kb = pmb->cellbounds.GetBoundsK(domain); + jb = pmb->cellbounds.GetBoundsJ(domain); + ib = pmb->cellbounds.GetBoundsI(domain); + // TODO(JMM): Add topological elements here + } else { // metadata none + kb = {0, pvar->GetDim(3) - 1}; + jb = {0, pvar->GetDim(2) - 1}; + ib = {0, pvar->GetDim(1) - 1}; + } + for (int t = 0; t < Nt; ++t) { + for (int u = 0; u < Nu; ++u) { + for (int v = 0; v < Nv; ++v) { + for (int k = kb.s; k <= kb.e; ++k) { + for (int j = jb.s; j <= jb.e; ++j) { + for (int i = ib.s; i <= ib.e; ++i) { + f(idx, t, u, v, k, j, i); + idx++; + } + } + } + } + } + } +} + +void ComputeCoords(Mesh *pm, bool face, const IndexRange &ib, const IndexRange &jb, + const IndexRange &kb, std::vector &x, std::vector &y, + std::vector &z); +std::vector ComputeXminBlocks(Mesh *pm); +std::vector ComputeLocs(Mesh *pm); +std::vector ComputeIDsAndFlags(Mesh *pm); + // TODO(JMM): Potentially unsafe if MPI_UNSIGNED_LONG_LONG isn't a size_t // however I think it's probably safe to assume we'll be on systems // where this is the case? 
diff --git a/src/outputs/outputs.cpp b/src/outputs/outputs.cpp index dffabd7eb63f..e74638e15ad3 100644 --- a/src/outputs/outputs.cpp +++ b/src/outputs/outputs.cpp @@ -1,6 +1,6 @@ //======================================================================================== // Parthenon performance portable AMR framework -// Copyright(C) 2020-2023 The Parthenon collaboration +// Copyright(C) 2020-2024 The Parthenon collaboration // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== // Athena++ astrophysical MHD code @@ -263,6 +263,7 @@ Outputs::Outputs(Mesh *pm, ParameterInput *pin, SimTime *tm) { num_rst_outputs++; } #ifdef ENABLE_HDF5 + op.write_xdmf = pin->GetOrAddBoolean(op.block_name, "write_xdmf", true); pnew_type = new PHDF5Output(op, restart); #else msg << "### FATAL ERROR in Outputs constructor" << std::endl diff --git a/src/outputs/outputs.hpp b/src/outputs/outputs.hpp index 4fd64236a19c..ff268926d7e1 100644 --- a/src/outputs/outputs.hpp +++ b/src/outputs/outputs.hpp @@ -3,7 +3,7 @@ // Copyright(C) 2014 James M. Stone and other code contributors // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== -// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2024. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. 
Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -64,12 +64,13 @@ struct OutputParameters { bool single_precision_output; bool sparse_seed_nans; int hdf5_compression_level; + bool write_xdmf; // TODO(felker): some of the parameters in this class are not initialized in constructor OutputParameters() : block_number(0), next_time(0.0), dt(-1.0), file_number(0), include_ghost_zones(false), cartesian_vector(false), single_precision_output(false), sparse_seed_nans(false), - hdf5_compression_level(5) {} + hdf5_compression_level(5), write_xdmf(false) {} }; //---------------------------------------------------------------------------------------- @@ -208,14 +209,18 @@ class PHDF5Output : public OutputType { private: std::string GenerateFilename_(ParameterInput *pin, SimTime *tm, const SignalHandler::OutputSignal signal); + void WriteBlocksMetadata_(Mesh *pm, hid_t file, const HDF5::H5P &pl, hsize_t offset, + hsize_t max_blocks_global) const; + void WriteCoordinates_(Mesh *pm, const IndexDomain &domain, hid_t file, + const HDF5::H5P &pl, hsize_t offset, + hsize_t max_blocks_global) const; + void WriteLevelsAndLocs_(Mesh *pm, hid_t file, const HDF5::H5P &pl, hsize_t offset, + hsize_t max_blocks_global) const; + void WriteSparseInfo_(Mesh *pm, hbool_t *sparse_allocated, + const std::vector &sparse_names, hsize_t num_sparse, + hid_t file, const HDF5::H5P &pl, size_t offset, + hsize_t max_blocks_global) const; const bool restart_; // true if we write a restart file, false for regular output files - // TODO(JMM): these methods might want to live in the base class or in output_utils.hpp - void ComputeXminBlocks_(Mesh *pm, std::vector &data); - void ComputeLocs_(Mesh *pm, std::vector &locs); - void ComputeIDsAndFlags_(Mesh *pm, std::vector &data); - void ComputeCoords_(Mesh *pm, bool face, const IndexRange &ib, const IndexRange &jb, - const IndexRange &kb, std::vector &x, std::vector &y, - 
std::vector &z); }; //---------------------------------------------------------------------------------------- diff --git a/src/outputs/parthenon_hdf5.cpp b/src/outputs/parthenon_hdf5.cpp index fbb853a685ab..03c14cc540b9 100644 --- a/src/outputs/parthenon_hdf5.cpp +++ b/src/outputs/parthenon_hdf5.cpp @@ -3,7 +3,7 @@ // Copyright(C) 2020-2023 The Parthenon collaboration // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== -// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2024. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -63,7 +63,12 @@ void PHDF5Output::WriteOutputFileImpl(Mesh *pm, ParameterInput *pin, SimTime *tm const SignalHandler::OutputSignal signal) { using namespace HDF5; using namespace OutputUtils; - PARTHENON_INSTRUMENT + + if constexpr (WRITE_SINGLE_PRECISION) { + Kokkos::Profiling::pushRegion("PHDF5::WriteOutputFileSinglePrec"); + } else { + Kokkos::Profiling::pushRegion("PHDF5::WriteOutputFileRealPrec"); + } // writes all graphics variables to hdf file // HDF5 structures @@ -112,93 +117,95 @@ void PHDF5Output::WriteOutputFileImpl(Mesh *pm, ParameterInput *pin, SimTime *tm // -------------------------------------------------------------------------------- // // WRITING ATTRIBUTES // // -------------------------------------------------------------------------------- // - H5G info_group; - { // write attributes - PARTHENON_INSTRUMENT { // Input section - PARTHENON_INSTRUMENT - // write input key-value pairs - std::ostringstream oss; - pin->ParameterDump(oss); - - // Mesh information - const H5G input_group = MakeGroup(file, "/Input"); - - HDF5WriteAttribute("File", oss.str().c_str(), input_group); - } 
// Input section - - // we'll need this again at the end - info_group = MakeGroup(file, "/Info"); - { // write info - PARTHENON_INSTRUMENT - HDF5WriteAttribute("OutputFormatVersion", OUTPUT_VERSION_FORMAT, info_group); - - if (tm != nullptr) { - HDF5WriteAttribute("NCycle", tm->ncycle, info_group); - HDF5WriteAttribute("Time", tm->time, info_group); - HDF5WriteAttribute("dt", tm->dt, info_group); - } + Kokkos::Profiling::pushRegion("write Attributes"); + { + Kokkos::Profiling::pushRegion("write input"); + // write input key-value pairs + std::ostringstream oss; + pin->ParameterDump(oss); + + // Mesh information + const H5G input_group = MakeGroup(file, "/Input"); + + HDF5WriteAttribute("File", oss.str().c_str(), input_group); + Kokkos::Profiling::popRegion(); // write input + } // Input section + + // we'll need this again at the end + const H5G info_group = MakeGroup(file, "/Info"); + { + Kokkos::Profiling::pushRegion("write Info"); + HDF5WriteAttribute("OutputFormatVersion", OUTPUT_VERSION_FORMAT, info_group); + + if (tm != nullptr) { + HDF5WriteAttribute("NCycle", tm->ncycle, info_group); + HDF5WriteAttribute("Time", tm->time, info_group); + HDF5WriteAttribute("dt", tm->dt, info_group); + } - HDF5WriteAttribute("WallTime", Driver::elapsed_main(), info_group); - HDF5WriteAttribute("NumDims", pm->ndim, info_group); - HDF5WriteAttribute("NumMeshBlocks", pm->nbtotal, info_group); - HDF5WriteAttribute("MaxLevel", max_level, info_group); - // write whether we include ghost cells or not - HDF5WriteAttribute("IncludesGhost", output_params.include_ghost_zones ? 
1 : 0, - info_group); - // write number of ghost cells in simulation - HDF5WriteAttribute("NGhost", Globals::nghost, info_group); - HDF5WriteAttribute("Coordinates", std::string(first_block.coords.Name()).c_str(), - info_group); - - // restart info, write always - HDF5WriteAttribute("NBNew", pm->nbnew, info_group); - HDF5WriteAttribute("NBDel", pm->nbdel, info_group); - HDF5WriteAttribute("RootLevel", rootLevel, info_group); - HDF5WriteAttribute("Refine", pm->adaptive ? 1 : 0, info_group); - HDF5WriteAttribute("Multilevel", pm->multilevel ? 1 : 0, info_group); - - HDF5WriteAttribute("BlocksPerPE", nblist, info_group); - - // Mesh block size - HDF5WriteAttribute("MeshBlockSize", std::vector{nx1, nx2, nx3}, info_group); - - // RootGridDomain - float[9] array with xyz mins, maxs, rats (dx(i)/dx(i-1)) - HDF5WriteAttribute( - "RootGridDomain", - std::vector{pm->mesh_size.xmin(X1DIR), pm->mesh_size.xmax(X1DIR), - pm->mesh_size.xrat(X1DIR), pm->mesh_size.xmin(X2DIR), - pm->mesh_size.xmax(X2DIR), pm->mesh_size.xrat(X2DIR), - pm->mesh_size.xmin(X3DIR), pm->mesh_size.xmax(X3DIR), - pm->mesh_size.xrat(X3DIR)}, - info_group); - - // Root grid size (number of cells at root level) - HDF5WriteAttribute("RootGridSize", - std::vector{pm->mesh_size.nx(X1DIR), - pm->mesh_size.nx(X2DIR), - pm->mesh_size.nx(X3DIR)}, - info_group); - - // Boundary conditions - std::vector boundary_condition_str(BOUNDARY_NFACES); - for (size_t i = 0; i < boundary_condition_str.size(); i++) { - boundary_condition_str[i] = GetBoundaryString(pm->mesh_bcs[i]); - } + HDF5WriteAttribute("WallTime", Driver::elapsed_main(), info_group); + HDF5WriteAttribute("NumDims", pm->ndim, info_group); + HDF5WriteAttribute("NumMeshBlocks", pm->nbtotal, info_group); + HDF5WriteAttribute("MaxLevel", max_level, info_group); + // write whether we include ghost cells or not + HDF5WriteAttribute("IncludesGhost", output_params.include_ghost_zones ? 
1 : 0, + info_group); + // write number of ghost cells in simulation + HDF5WriteAttribute("NGhost", Globals::nghost, info_group); + HDF5WriteAttribute("Coordinates", std::string(first_block.coords.Name()).c_str(), + info_group); + + // restart info, write always + HDF5WriteAttribute("NBNew", pm->nbnew, info_group); + HDF5WriteAttribute("NBDel", pm->nbdel, info_group); + HDF5WriteAttribute("RootLevel", rootLevel, info_group); + HDF5WriteAttribute("Refine", pm->adaptive ? 1 : 0, info_group); + HDF5WriteAttribute("Multilevel", pm->multilevel ? 1 : 0, info_group); + + HDF5WriteAttribute("BlocksPerPE", nblist, info_group); + + // Mesh block size + HDF5WriteAttribute("MeshBlockSize", std::vector{nx1, nx2, nx3}, info_group); + + // RootGridDomain - float[9] array with xyz mins, maxs, rats (dx(i)/dx(i-1)) + HDF5WriteAttribute( + "RootGridDomain", + std::vector{pm->mesh_size.xmin(X1DIR), pm->mesh_size.xmax(X1DIR), + pm->mesh_size.xrat(X1DIR), pm->mesh_size.xmin(X2DIR), + pm->mesh_size.xmax(X2DIR), pm->mesh_size.xrat(X2DIR), + pm->mesh_size.xmin(X3DIR), pm->mesh_size.xmax(X3DIR), + pm->mesh_size.xrat(X3DIR)}, + info_group); + + // Root grid size (number of cells at root level) + HDF5WriteAttribute("RootGridSize", + std::vector{pm->mesh_size.nx(X1DIR), pm->mesh_size.nx(X2DIR), + pm->mesh_size.nx(X3DIR)}, + info_group); + + // Boundary conditions + std::vector boundary_condition_str(BOUNDARY_NFACES); + for (size_t i = 0; i < boundary_condition_str.size(); i++) { + boundary_condition_str[i] = GetBoundaryString(pm->mesh_bcs[i]); + } - HDF5WriteAttribute("BoundaryConditions", boundary_condition_str, info_group); - } // write info + HDF5WriteAttribute("BoundaryConditions", boundary_condition_str, info_group); + Kokkos::Profiling::popRegion(); // write Info + } // Info section - { // write params - PARTHENON_INSTRUMENT - const H5G params_group = MakeGroup(file, "/Params"); + // write Params + { + Kokkos::Profiling::pushRegion("behold: write Params"); + const H5G params_group = 
MakeGroup(file, "/Params"); - for (const auto &package : pm->packages.AllPackages()) { - const auto state = package.second; - // Write all params that can be written as HDF5 attributes - state->AllParams().WriteAllToHDF5(state->label(), params_group); - } - } // write params - } // write attributes + for (const auto &package : pm->packages.AllPackages()) { + const auto state = package.second; + // Write all params that can be written as HDF5 attributes + state->AllParams().WriteAllToHDF5(state->label(), params_group); + } + Kokkos::Profiling::popRegion(); // behold: write Params + } // Params section + Kokkos::Profiling::popRegion(); // write Attributes // -------------------------------------------------------------------------------- // // WRITING MESHBLOCK METADATA // @@ -210,19 +217,6 @@ void PHDF5Output::WriteOutputFileImpl(Mesh *pm, ParameterInput *pin, SimTime *tm my_offset += nblist[i]; } - const std::array local_offset({my_offset, 0, 0, 0, 0, 0, 0}); - - // these can vary by data set, except index 0 is always the same - std::array local_count( - {static_cast(num_blocks_local), 1, 1, 1, 1, 1, 1}); - std::array global_count( - {static_cast(max_blocks_global), 1, 1, 1, 1, 1, 1}); - - // for convenience - const hsize_t *const p_loc_offset = local_offset.data(); - const hsize_t *const p_loc_cnt = local_count.data(); - const hsize_t *const p_glob_cnt = global_count.data(); - H5P const pl_xfer = H5P::FromHIDCheck(H5Pcreate(H5P_DATASET_XFER)); H5P const pl_dcreate = H5P::FromHIDCheck(H5Pcreate(H5P_DATASET_CREATE)); @@ -233,331 +227,223 @@ void PHDF5Output::WriteOutputFileImpl(Mesh *pm, ParameterInput *pin, SimTime *tm PARTHENON_HDF5_CHECK(H5Pset_dxpl_mpio(pl_xfer, H5FD_MPIO_COLLECTIVE)); #endif - { // write Blocks metadata - PARTHENON_INSTRUMENT - const H5G gBlocks = MakeGroup(file, "/Blocks"); - - // write Xmin[ndim] for blocks - { - std::vector tmpData(num_blocks_local * 3); - ComputeXminBlocks_(pm, tmpData); - local_count[1] = global_count[1] = pm->ndim; - 
HDF5Write2D(gBlocks, "xmin", tmpData.data(), p_loc_offset, p_loc_cnt, p_glob_cnt, - pl_xfer); - } - - // write Block ID - { - // LOC.lx1,2,3 - hsize_t n = 3; - std::vector tmpLoc(num_blocks_local * n); - local_count[1] = global_count[1] = n; - ComputeLocs_(pm, tmpLoc); - HDF5Write2D(gBlocks, "loc.lx123", tmpLoc.data(), p_loc_offset, p_loc_cnt, - p_glob_cnt, pl_xfer); - - // (LOC.)level, GID, LID, cnghost, gflag - n = 5; // this is NOT H5_NDIM - std::vector tmpID(num_blocks_local * n); - local_count[1] = global_count[1] = n; - ComputeIDsAndFlags_(pm, tmpID); - HDF5Write2D(gBlocks, "loc.level-gid-lid-cnghost-gflag", tmpID.data(), p_loc_offset, - p_loc_cnt, p_glob_cnt, pl_xfer); - } - } // write Blocks metadata - - // Write mesh coordinates to file - { // write mesh coords - PARTHENON_INSTRUMENT - for (const bool face : {true, false}) { - const H5G gLocations = MakeGroup(file, face ? "/Locations" : "/VolumeLocations"); - - // write X coordinates - std::vector loc_x((nx1 + face) * num_blocks_local); - std::vector loc_y((nx2 + face) * num_blocks_local); - std::vector loc_z((nx3 + face) * num_blocks_local); - - ComputeCoords_(pm, face, out_ib, out_jb, out_kb, loc_x, loc_y, loc_z); - - local_count[1] = global_count[1] = nx1 + face; - HDF5Write2D(gLocations, "x", loc_x.data(), p_loc_offset, p_loc_cnt, p_glob_cnt, - pl_xfer); - - local_count[1] = global_count[1] = nx2 + face; - HDF5Write2D(gLocations, "y", loc_y.data(), p_loc_offset, p_loc_cnt, p_glob_cnt, - pl_xfer); - - local_count[1] = global_count[1] = nx3 + face; - HDF5Write2D(gLocations, "z", loc_z.data(), p_loc_offset, p_loc_cnt, p_glob_cnt, - pl_xfer); - } - } // write mesh coords - - // Write Levels and Logical Locations with the level for each Meshblock loclist contains - // levels and logical locations for all meshblocks on all ranks - { // write levels + log locs - PARTHENON_INSTRUMENT - const auto &loclist = pm->GetLocList(); - - std::vector levels; - levels.reserve(pm->nbtotal); - - std::vector 
logicalLocations; - logicalLocations.reserve(pm->nbtotal * 3); - - for (const auto &loc : loclist) { - levels.push_back(loc.level() - pm->GetRootLevel()); - logicalLocations.push_back(loc.lx1()); - logicalLocations.push_back(loc.lx2()); - logicalLocations.push_back(loc.lx3()); - } - - // Only write levels on rank 0 since it has data for all ranks - local_count[0] = (Globals::my_rank == 0) ? pm->nbtotal : 0; - HDF5WriteND(file, "Levels", levels.data(), 1, local_offset.data(), local_count.data(), - global_count.data(), pl_xfer, H5P_DEFAULT); - - local_count[1] = global_count[1] = 3; - HDF5Write2D(file, "LogicalLocations", logicalLocations.data(), local_offset.data(), - local_count.data(), global_count.data(), pl_xfer); - - // reset for collective output - local_count[0] = num_blocks_local; - } // write levels + log locs + WriteBlocksMetadata_(pm, file, pl_xfer, my_offset, max_blocks_global); + WriteCoordinates_(pm, theDomain, file, pl_xfer, my_offset, max_blocks_global); + WriteLevelsAndLocs_(pm, file, pl_xfer, my_offset, max_blocks_global); // -------------------------------------------------------------------------------- // // WRITING VARIABLES DATA // // -------------------------------------------------------------------------------- // - std::vector all_vars_info; - std::vector sparse_names; - hsize_t num_sparse; - std::unique_ptr sparse_allocated; - { // write all variable data - PARTHENON_INSTRUMENT - - // All blocks have the same list of variable metadata that exist in the entire - // simulation, but not all variables may be allocated on all blocks - - auto get_vars = [=](const std::shared_ptr pmb) { - auto &var_vec = pmb->meshblock_data.Get()->GetVariableVector(); - if (restart_) { - // get all vars with flag Independent OR restart - return GetAnyVariables( - var_vec, {parthenon::Metadata::Independent, parthenon::Metadata::Restart}); - } else { - return GetAnyVariables(var_vec, output_params.variables); - } - }; - - // get list of all vars, just use the 
first block since the list is the same for all - // blocks - const auto vars = get_vars(pm->block_list.front()); - for (auto &v : vars) { - all_vars_info.emplace_back(v); + Kokkos::Profiling::pushRegion("write all variable data"); + + // All blocks have the same list of variable metadata that exist in the entire + // simulation, but not all variables may be allocated on all blocks + + auto get_vars = [=](const std::shared_ptr pmb) { + auto &var_vec = pmb->meshblock_data.Get()->GetVariableVector(); + if (restart_) { + // get all vars with flag Independent OR restart + return GetAnyVariables( + var_vec, {parthenon::Metadata::Independent, parthenon::Metadata::Restart}); + } else { + return GetAnyVariables(var_vec, output_params.variables); } + }; - // sort alphabetically - std::sort(all_vars_info.begin(), all_vars_info.end(), - [](const VarInfo &a, const VarInfo &b) { return a.label < b.label; }); - - // We need to add information about the sparse variables to the HDF5 file, namely: - // 1) Which variables are sparse - // 2) Is a sparse id of a particular sparse variable allocated on a given block - // - // This information is stored in the dataset called "SparseInfo". The data set - // contains an attribute "SparseFields" that is a vector of strings with the names - // of the sparse fields (field name with sparse id, i.e. "bar_28", "bar_7", foo_1", - // "foo_145"). The field names are in alphabetical order, which is the same order - // they show up in all_unique_vars (because it's a sorted set). - // - // The dataset SparseInfo itself is a 2D array of bools. The first index is the - // global block index and the second index is the sparse field (same order as the - // SparseFields attribute). 
SparseInfo[b][v] is true if the sparse field with index - // v is allocated on the block with index b, otherwise the value is false - - std::unordered_map sparse_field_idx; - for (auto &vinfo : all_vars_info) { - if (vinfo.is_sparse) { - sparse_field_idx.insert({vinfo.label, sparse_names.size()}); - sparse_names.push_back(vinfo.label); - } - } - - num_sparse = sparse_names.size(); - // can't use std::vector here because std::vector is the same as - // std::vector and it doesn't have .data() member - sparse_allocated = std::make_unique(num_blocks_local * num_sparse); - - // allocate space for largest size variable - int varSize_max = 0; - for (auto &vinfo : all_vars_info) { - const int varSize = - vinfo.nx6 * vinfo.nx5 * vinfo.nx4 * vinfo.nx3 * vinfo.nx2 * vinfo.nx1; - varSize_max = std::max(varSize_max, varSize); - } - - using OutT = typename std::conditional::type; - std::vector tmpData(varSize_max * num_blocks_local); - - // create persistent spaces - local_count[0] = num_blocks_local; - global_count[0] = max_blocks_global; - local_count[4] = global_count[4] = nx3; - local_count[5] = global_count[5] = nx2; - local_count[6] = global_count[6] = nx1; + // get list of all vars, just use the first block since the list is the same for all + // blocks + std::vector all_vars_info; + const auto vars = get_vars(pm->block_list.front()); + for (auto &v : vars) { + all_vars_info.emplace_back(v); + } - // for each variable we write - for (auto &vinfo : all_vars_info) { - PARTHENON_INSTRUMENT - // not really necessary, but doesn't hurt - memset(tmpData.data(), 0, tmpData.size() * sizeof(OutT)); + // sort alphabetically + std::sort(all_vars_info.begin(), all_vars_info.end(), + [](const VarInfo &a, const VarInfo &b) { return a.label < b.label; }); + + // We need to add information about the sparse variables to the HDF5 file, namely: + // 1) Which variables are sparse + // 2) Is a sparse id of a particular sparse variable allocated on a given block + // + // This information is 
stored in the dataset called "SparseInfo". The data set + // contains an attribute "SparseFields" that is a vector of strings with the names + // of the sparse fields (field name with sparse id, i.e. "bar_28", "bar_7", foo_1", + // "foo_145"). The field names are in alphabetical order, which is the same order + // they show up in all_unique_vars (because it's a sorted set). + // + // The dataset SparseInfo itself is a 2D array of bools. The first index is the + // global block index and the second index is the sparse field (same order as the + // SparseFields attribute). SparseInfo[b][v] is true if the sparse field with index + // v is allocated on the block with index b, otherwise the value is false - const std::string var_name = vinfo.label; - const hsize_t nx6 = vinfo.nx6; - const hsize_t nx5 = vinfo.nx5; - const hsize_t nx4 = vinfo.nx4; + std::vector sparse_names; + std::unordered_map sparse_field_idx; + for (auto &vinfo : all_vars_info) { + if (vinfo.is_sparse) { + sparse_field_idx.insert({vinfo.label, sparse_names.size()}); + sparse_names.push_back(vinfo.label); + } + } - local_count[1] = global_count[1] = nx6; - local_count[2] = global_count[2] = nx5; - local_count[3] = global_count[3] = nx4; + hsize_t num_sparse = sparse_names.size(); + // can't use std::vector here because std::vector is the same as + // std::vector and it doesn't have .data() member + std::unique_ptr sparse_allocated(new hbool_t[num_blocks_local * num_sparse]); - std::vector alldims({nx6, nx5, nx4, static_cast(vinfo.nx3), - static_cast(vinfo.nx2), - static_cast(vinfo.nx1)}); + // allocate space for largest size variable + int varSize_max = 0; + for (auto &vinfo : all_vars_info) { + const int varSize = vinfo.Size(); + varSize_max = std::max(varSize_max, varSize); + } - int ndim = -1; + using OutT = typename std::conditional::type; + std::vector tmpData(varSize_max * num_blocks_local); + + // for each variable we write + for (auto &vinfo : all_vars_info) { + 
Kokkos::Profiling::pushRegion("write variable loop"); + // not really necessary, but doesn't hurt + memset(tmpData.data(), 0, tmpData.size() * sizeof(OutT)); + + const std::string var_name = vinfo.label; + const hsize_t nx6 = vinfo.nx6; + const hsize_t nx5 = vinfo.nx5; + const hsize_t nx4 = vinfo.nx4; + + hsize_t local_offset[H5_NDIM] = {my_offset, 0, 0, 0, 0, 0, 0}; + hsize_t local_count[H5_NDIM] = {static_cast(num_blocks_local), + static_cast(nx6), + static_cast(nx5), + static_cast(nx4), + static_cast(nx3), + static_cast(nx2), + static_cast(nx1)}; + hsize_t global_count[H5_NDIM] = {static_cast(max_blocks_global), + static_cast(nx6), + static_cast(nx5), + static_cast(nx4), + static_cast(nx3), + static_cast(nx2), + static_cast(nx1)}; + + std::vector alldims({nx6, nx5, nx4, static_cast(vinfo.nx3), + static_cast(vinfo.nx2), + static_cast(vinfo.nx1)}); + + int ndim = -1; #ifndef PARTHENON_DISABLE_HDF5_COMPRESSION - // we need chunks to enable compression - std::array chunk_size({1, 1, 1, 1, 1, 1, 1}); + // we need chunks to enable compression + std::array chunk_size({1, 1, 1, 1, 1, 1, 1}); #endif - if (vinfo.where == MetadataFlag(Metadata::Cell)) { - ndim = 3 + vinfo.tensor_rank + 1; - for (int i = 0; i < vinfo.tensor_rank; i++) { - local_count[1 + i] = global_count[1 + i] = alldims[3 - vinfo.tensor_rank + i]; - } - local_count[vinfo.tensor_rank + 1] = global_count[vinfo.tensor_rank + 1] = nx3; - local_count[vinfo.tensor_rank + 2] = global_count[vinfo.tensor_rank + 2] = nx2; - local_count[vinfo.tensor_rank + 3] = global_count[vinfo.tensor_rank + 3] = nx1; + if (vinfo.where == MetadataFlag(Metadata::Cell)) { + ndim = 3 + vinfo.tensor_rank + 1; + for (int i = 0; i < vinfo.tensor_rank; i++) { + local_count[1 + i] = global_count[1 + i] = alldims[3 - vinfo.tensor_rank + i]; + } + local_count[vinfo.tensor_rank + 1] = global_count[vinfo.tensor_rank + 1] = nx3; + local_count[vinfo.tensor_rank + 2] = global_count[vinfo.tensor_rank + 2] = nx2; + local_count[vinfo.tensor_rank + 
3] = global_count[vinfo.tensor_rank + 3] = nx1; #ifndef PARTHENON_DISABLE_HDF5_COMPRESSION - if (output_params.hdf5_compression_level > 0) { - for (int i = ndim - 3; i < ndim; i++) { - chunk_size[i] = local_count[i]; - } + if (output_params.hdf5_compression_level > 0) { + for (int i = ndim - 3; i < ndim; i++) { + chunk_size[i] = local_count[i]; } + } #endif - } else if (vinfo.where == MetadataFlag(Metadata::None)) { - ndim = vinfo.tensor_rank + 1; - for (int i = 0; i < vinfo.tensor_rank; i++) { - local_count[1 + i] = global_count[1 + i] = alldims[6 - vinfo.tensor_rank + i]; - } + } else if (vinfo.where == MetadataFlag(Metadata::None)) { + ndim = vinfo.tensor_rank + 1; + for (int i = 0; i < vinfo.tensor_rank; i++) { + local_count[1 + i] = global_count[1 + i] = alldims[6 - vinfo.tensor_rank + i]; + } #ifndef PARTHENON_DISABLE_HDF5_COMPRESSION - if (output_params.hdf5_compression_level > 0) { - int nchunk_indices = std::min(vinfo.tensor_rank, 3); - for (int i = ndim - nchunk_indices; i < ndim; i++) { - chunk_size[i] = alldims[6 - nchunk_indices + i]; - } + if (output_params.hdf5_compression_level > 0) { + int nchunk_indices = std::min(vinfo.tensor_rank, 3); + for (int i = ndim - nchunk_indices; i < ndim; i++) { + chunk_size[i] = alldims[6 - nchunk_indices + i]; } -#endif - } else { - PARTHENON_THROW("Only Cell and None locations supported!"); } +#endif + } else { + PARTHENON_THROW("Only Cell and None locations supported!"); + } #ifndef PARTHENON_DISABLE_HDF5_COMPRESSION - PARTHENON_HDF5_CHECK(H5Pset_chunk(pl_dcreate, ndim, chunk_size.data())); - // Do not run the pipeline if compression is soft disabled. - // By default data would still be passed, which may result in slower output. 
- if (output_params.hdf5_compression_level > 0) { - PARTHENON_HDF5_CHECK(H5Pset_deflate( - pl_dcreate, std::min(9, output_params.hdf5_compression_level))); - } + PARTHENON_HDF5_CHECK(H5Pset_chunk(pl_dcreate, ndim, chunk_size.data())); + // Do not run the pipeline if compression is soft disabled. + // By default data would still be passed, which may result in slower output. + if (output_params.hdf5_compression_level > 0) { + PARTHENON_HDF5_CHECK( + H5Pset_deflate(pl_dcreate, std::min(9, output_params.hdf5_compression_level))); + } #endif - // load up data - hsize_t index = 0; - - { // fill host output buffer - PARTHENON_INSTRUMENT - // for each local mesh block - for (size_t b_idx = 0; b_idx < num_blocks_local; ++b_idx) { - const auto &pmb = pm->block_list[b_idx]; - bool is_allocated = false; - - // for each variable that this local meshblock actually has - const auto vars = get_vars(pmb); - for (auto &v : vars) { - // For reference, if we update the logic here, there's also - // a similar block in parthenon_manager.cpp - if (v->IsAllocated() && (var_name == v->label())) { - auto v_h = v->data.GetHostMirrorAndCopy(); - for (int t = 0; t < nx6; ++t) { - for (int u = 0; u < nx5; ++u) { - for (int v = 0; v < nx4; ++v) { - if (vinfo.where == MetadataFlag(Metadata::Cell)) { - for (int k = out_kb.s; k <= out_kb.e; ++k) { - for (int j = out_jb.s; j <= out_jb.e; ++j) { - for (int i = out_ib.s; i <= out_ib.e; ++i) { - tmpData[index++] = static_cast(v_h(t, u, v, k, j, i)); - } - } - } - } else { - for (int k = 0; k < vinfo.nx3; ++k) { - for (int j = 0; j < vinfo.nx2; ++j) { - for (int i = 0; i < vinfo.nx1; ++i) { - tmpData[index++] = static_cast(v_h(t, u, v, k, j, i)); - } - } - } - } - } - } - } - - is_allocated = true; - break; - } - } + // load up data + hsize_t index = 0; + + Kokkos::Profiling::pushRegion("fill host output buffer"); + // for each local mesh block + for (size_t b_idx = 0; b_idx < num_blocks_local; ++b_idx) { + const auto &pmb = pm->block_list[b_idx]; + 
bool is_allocated = false; + + // for each variable that this local meshblock actually has + const auto vars = get_vars(pmb); + for (auto &v : vars) { + // For reference, if we update the logic here, there's also + // a similar block in parthenon_manager.cpp + if (v->IsAllocated() && (var_name == v->label())) { + auto v_h = v->data.GetHostMirrorAndCopy(); + OutputUtils::PackOrUnpackVar( + pmb.get(), v.get(), output_params.include_ghost_zones, index, tmpData, + [&](auto index, int t, int u, int v, int k, int j, int i) { + tmpData[index] = static_cast(v_h(t, u, v, k, j, i)); + }); + + is_allocated = true; + break; + } + } - if (vinfo.is_sparse) { - size_t sparse_idx = sparse_field_idx.at(vinfo.label); - sparse_allocated[b_idx * num_sparse + sparse_idx] = is_allocated; - } + if (vinfo.is_sparse) { + size_t sparse_idx = sparse_field_idx.at(vinfo.label); + sparse_allocated[b_idx * num_sparse + sparse_idx] = is_allocated; + } - if (!is_allocated) { - if (vinfo.is_sparse) { - hsize_t varSize{}; - if (vinfo.where == MetadataFlag(Metadata::Cell)) { - varSize = vinfo.nx6 * vinfo.nx5 * vinfo.nx4 * (out_kb.e - out_kb.s + 1) * - (out_jb.e - out_jb.s + 1) * (out_ib.e - out_ib.s + 1); - } else { - varSize = - vinfo.nx6 * vinfo.nx5 * vinfo.nx4 * vinfo.nx3 * vinfo.nx2 * vinfo.nx1; - } - auto fill_val = output_params.sparse_seed_nans - ? 
std::numeric_limits::quiet_NaN() - : 0; - std::fill(tmpData.data() + index, tmpData.data() + index + varSize, - fill_val); - index += varSize; - } else { - std::stringstream msg; - msg << "### ERROR: Unable to find dense variable " << var_name << std::endl; - PARTHENON_FAIL(msg); - } + if (!is_allocated) { + if (vinfo.is_sparse) { + hsize_t varSize{}; + if (vinfo.where == MetadataFlag(Metadata::Cell)) { + varSize = vinfo.nx6 * vinfo.nx5 * vinfo.nx4 * (out_kb.e - out_kb.s + 1) * + (out_jb.e - out_jb.s + 1) * (out_ib.e - out_ib.s + 1); + } else { + varSize = + vinfo.nx6 * vinfo.nx5 * vinfo.nx4 * vinfo.nx3 * vinfo.nx2 * vinfo.nx1; } + auto fill_val = + output_params.sparse_seed_nans ? std::numeric_limits::quiet_NaN() : 0; + std::fill(tmpData.data() + index, tmpData.data() + index + varSize, fill_val); + index += varSize; + } else { + std::stringstream msg; + msg << "### ERROR: Unable to find dense variable " << var_name << std::endl; + PARTHENON_FAIL(msg); } - } // fill host output buffer - - { // write variable data - PARTHENON_INSTRUMENT - // write data to file - HDF5WriteND(file, var_name, tmpData.data(), ndim, p_loc_offset, p_loc_cnt, - p_glob_cnt, pl_xfer, pl_dcreate); - } // write variable data - } // for each variable - } // write all variable data + } + } + Kokkos::Profiling::popRegion(); // fill host output buffer + + Kokkos::Profiling::pushRegion("write variable data"); + // write data to file + HDF5WriteND(file, var_name, tmpData.data(), ndim, &local_offset[0], &local_count[0], + &global_count[0], pl_xfer, pl_dcreate); + Kokkos::Profiling::popRegion(); // write variable data + Kokkos::Profiling::popRegion(); // write variable loop + } + Kokkos::Profiling::popRegion(); // write all variable data // names of variables std::vector var_names; @@ -590,95 +476,91 @@ void PHDF5Output::WriteOutputFileImpl(Mesh *pm, ParameterInput *pin, SimTime *tm // write SparseInfo and SparseFields (we can't write a zero-size dataset, so only write // this if we have sparse 
fields) if (num_sparse > 0) { - PARTHENON_INSTRUMENT - local_count[1] = global_count[1] = num_sparse; - - HDF5Write2D(file, "SparseInfo", sparse_allocated.get(), p_loc_offset, p_loc_cnt, - p_glob_cnt, pl_xfer); - - // write names of sparse fields as attribute, first convert to vector of const char* - std::vector names(num_sparse); - for (size_t i = 0; i < num_sparse; ++i) - names[i] = sparse_names[i].c_str(); - - const H5D dset = H5D::FromHIDCheck(H5Dopen2(file, "SparseInfo", H5P_DEFAULT)); - HDF5WriteAttribute("SparseFields", names, dset); + WriteSparseInfo_(pm, sparse_allocated.get(), sparse_names, num_sparse, file, pl_xfer, + my_offset, max_blocks_global); } // SparseInfo and SparseFields sections // -------------------------------------------------------------------------------- // // WRITING PARTICLE DATA // // -------------------------------------------------------------------------------- // + Kokkos::Profiling::pushRegion("write particle data"); AllSwarmInfo swarm_info(pm->block_list, output_params.swarms, restart_); - { // write particle data - PARTHENON_INSTRUMENT - for (auto &[swname, swinfo] : swarm_info.all_info) { - const H5G g_swm = MakeGroup(file, swname); - // offsets/counts are NOT the same here vs the grid data - hsize_t local_offset[6] = {static_cast(my_offset), 0, 0, 0, 0, 0}; - hsize_t local_count[6] = {static_cast(num_blocks_local), 0, 0, 0, 0, 0}; - hsize_t global_count[6] = {static_cast(max_blocks_global), 0, 0, 0, 0, 0}; - // These indicate particles/meshblock and location in global index - // space where each meshblock starts - HDF5Write1D(g_swm, "counts", swinfo.counts.data(), local_offset, local_count, - global_count, pl_xfer); - HDF5Write1D(g_swm, "offsets", swinfo.offsets.data(), local_offset, local_count, - global_count, pl_xfer); - - const H5G g_var = MakeGroup(g_swm, "SwarmVars"); - if (swinfo.global_count == 0) { - continue; - } - auto SetCounts = [&](const SwarmInfo &swinfo, const SwarmVarInfo &vinfo) { - const int rank = 
vinfo.tensor_rank; - for (int i = 0; i < 6; ++i) { - local_offset[i] = 0; // reset everything - local_count[i] = 0; - global_count[i] = 0; - } - for (int i = 0; i < rank; ++i) { - local_count[i] = global_count[i] = vinfo.GetN(rank + 1 - i); - } - local_offset[rank] = swinfo.global_offset; - local_count[rank] = swinfo.count_on_rank; - global_count[rank] = swinfo.global_count; - }; - auto &int_vars = std::get>(swinfo.vars); - for (auto &[vname, swmvarvec] : int_vars) { - const auto &vinfo = swinfo.var_info.at(vname); - auto host_data = swinfo.FillHostBuffer(vname, swmvarvec); - SetCounts(swinfo, vinfo); - HDF5WriteND(g_var, vname, host_data.data(), vinfo.tensor_rank + 1, local_offset, - local_count, global_count, pl_xfer, H5P_DEFAULT); - } - auto &rvars = std::get>(swinfo.vars); - for (auto &[vname, swmvarvec] : rvars) { - const auto &vinfo = swinfo.var_info.at(vname); - auto host_data = swinfo.FillHostBuffer(vname, swmvarvec); - SetCounts(swinfo, vinfo); - HDF5WriteND(g_var, vname, host_data.data(), vinfo.tensor_rank + 1, local_offset, - local_count, global_count, pl_xfer, H5P_DEFAULT); + for (auto &[swname, swinfo] : swarm_info.all_info) { + const H5G g_swm = MakeGroup(file, swname); + // offsets/counts are NOT the same here vs the grid data + hsize_t local_offset[6] = {static_cast(my_offset), 0, 0, 0, 0, 0}; + hsize_t local_count[6] = {static_cast(num_blocks_local), 0, 0, 0, 0, 0}; + hsize_t global_count[6] = {static_cast(max_blocks_global), 0, 0, 0, 0, 0}; + // These indicate particles/meshblock and location in global index + // space where each meshblock starts + HDF5Write1D(g_swm, "counts", swinfo.counts.data(), local_offset, local_count, + global_count, pl_xfer); + HDF5Write1D(g_swm, "offsets", swinfo.offsets.data(), local_offset, local_count, + global_count, pl_xfer); + + const H5G g_var = MakeGroup(g_swm, "SwarmVars"); + if (swinfo.global_count == 0) { + continue; + } + auto SetCounts = [&](const SwarmInfo &swinfo, const SwarmVarInfo &vinfo) { + const int 
rank = vinfo.tensor_rank; + for (int i = 0; i < 6; ++i) { + local_offset[i] = 0; // reset everything + local_count[i] = 0; + global_count[i] = 0; } - // If swarm does not contain an "id" object, generate a sequential - // one for vis. - if (swinfo.var_info.count("id") == 0) { - std::vector ids(swinfo.global_count); - std::iota(std::begin(ids), std::end(ids), swinfo.global_offset); - local_offset[0] = swinfo.global_offset; - local_count[0] = swinfo.count_on_rank; - global_count[0] = swinfo.global_count; - HDF5Write1D(g_var, "id", ids.data(), local_offset, local_count, global_count, - pl_xfer); + for (int i = 0; i < rank; ++i) { + local_count[i] = global_count[i] = vinfo.GetN(rank + 1 - i); } + local_offset[rank] = swinfo.global_offset; + local_count[rank] = swinfo.count_on_rank; + global_count[rank] = swinfo.global_count; + }; + auto &int_vars = std::get>(swinfo.vars); + for (auto &[vname, swmvarvec] : int_vars) { + const auto &vinfo = swinfo.var_info.at(vname); + auto host_data = swinfo.FillHostBuffer(vname, swmvarvec); + SetCounts(swinfo, vinfo); + HDF5WriteND(g_var, vname, host_data.data(), vinfo.tensor_rank + 1, local_offset, + local_count, global_count, pl_xfer, H5P_DEFAULT); + } + auto &rvars = std::get>(swinfo.vars); + for (auto &[vname, swmvarvec] : rvars) { + const auto &vinfo = swinfo.var_info.at(vname); + auto host_data = swinfo.FillHostBuffer(vname, swmvarvec); + SetCounts(swinfo, vinfo); + HDF5WriteND(g_var, vname, host_data.data(), vinfo.tensor_rank + 1, local_offset, + local_count, global_count, pl_xfer, H5P_DEFAULT); } - } // write particle data + // If swarm does not contain an "id" object, generate a sequential + // one for vis. 
+ if (swinfo.var_info.count("id") == 0) { + std::vector ids(swinfo.global_count); + std::iota(std::begin(ids), std::end(ids), swinfo.global_offset); + local_offset[0] = swinfo.global_offset; + local_count[0] = swinfo.count_on_rank; + global_count[0] = swinfo.global_count; + HDF5Write1D(g_var, "id", ids.data(), local_offset, local_count, global_count, + pl_xfer); + } + } + Kokkos::Profiling::popRegion(); // write particle data - { // generate xdmf - PARTHENON_INSTRUMENT + if (output_params.write_xdmf) { + Kokkos::Profiling::pushRegion("genXDMF"); // generate XDMF companion file XDMF::genXDMF(filename, pm, tm, nx1, nx2, nx3, all_vars_info, swarm_info); - } // generate xdmf + Kokkos::Profiling::popRegion(); // genXDMF + } + + Kokkos::Profiling::popRegion(); // WriteOutputFile???Prec } +// explicit template instantiation +template void PHDF5Output::WriteOutputFileImpl(Mesh *, ParameterInput *, SimTime *, + SignalHandler::OutputSignal); +template void PHDF5Output::WriteOutputFileImpl(Mesh *, ParameterInput *, SimTime *, + SignalHandler::OutputSignal); std::string PHDF5Output::GenerateFilename_(ParameterInput *pin, SimTime *tm, const SignalHandler::OutputSignal signal) { @@ -714,66 +596,128 @@ std::string PHDF5Output::GenerateFilename_(ParameterInput *pin, SimTime *tm, } return filename; } -// TODO(JMM): Should this live in the base class or output_utils? 
-void PHDF5Output::ComputeXminBlocks_(Mesh *pm, std::vector &data) { - int i = 0; - for (auto &pmb : pm->block_list) { - auto xmin = pmb->coords.GetXmin(); - data[i++] = xmin[0]; - if (pm->ndim > 1) { - data[i++] = xmin[1]; - } - if (pm->ndim > 2) { - data[i++] = xmin[2]; - } + +void PHDF5Output::WriteBlocksMetadata_(Mesh *pm, hid_t file, const HDF5::H5P &pl, + hsize_t offset, hsize_t max_blocks_global) const { + using namespace HDF5; + Kokkos::Profiling::pushRegion("I/O HDF5: write block metadata"); + const H5G gBlocks = MakeGroup(file, "/Blocks"); + const hsize_t num_blocks_local = pm->block_list.size(); + const hsize_t ndim = pm->ndim; + const hsize_t loc_offset[2] = {offset, 0}; + + // write Xmin[ndim] for blocks + { + // JMM: These arrays chould be shared, but I think this is clearer + // as to what's going on. + hsize_t loc_cnt[2] = {num_blocks_local, ndim}; + hsize_t glob_cnt[2] = {max_blocks_global, ndim}; + + std::vector tmpData = OutputUtils::ComputeXminBlocks(pm); + HDF5Write2D(gBlocks, "xmin", tmpData.data(), &loc_offset[0], &loc_cnt[0], + &glob_cnt[0], pl); } -} -// TODO(JMM): Should this live in the base class or output_utils? -void PHDF5Output::ComputeLocs_(Mesh *pm, std::vector &locs) { - int i = 0; - for (auto &pmb : pm->block_list) { - locs[i++] = pmb->loc.lx1(); - locs[i++] = pmb->loc.lx2(); - locs[i++] = pmb->loc.lx3(); + + { + // LOC.lx1,2,3 + hsize_t loc_cnt[2] = {num_blocks_local, 3}; + hsize_t glob_cnt[2] = {max_blocks_global, 3}; + std::vector tmpLoc = OutputUtils::ComputeLocs(pm); + HDF5Write2D(gBlocks, "loc.lx123", tmpLoc.data(), &loc_offset[0], &loc_cnt[0], + &glob_cnt[0], pl); } -} -// TODO(JMM): Should this live in the base class or output_utils? 
-void PHDF5Output::ComputeIDsAndFlags_(Mesh *pm, std::vector<int> &data) { - int i = 0; - for (auto &pmb : pm->block_list) { - data[i++] = pmb->loc.level(); - data[i++] = pmb->gid; - data[i++] = pmb->lid; - data[i++] = pmb->cnghost; - data[i++] = pmb->gflag; + + { + // (LOC.)level, GID, LID, cnghost, gflag + hsize_t loc_cnt[2] = {num_blocks_local, 5}; + hsize_t glob_cnt[2] = {max_blocks_global, 5}; + std::vector<int> tmpID = OutputUtils::ComputeIDsAndFlags(pm); + HDF5Write2D(gBlocks, "loc.level-gid-lid-cnghost-gflag", tmpID.data(), &loc_offset[0], + &loc_cnt[0], &glob_cnt[0], pl); } + Kokkos::Profiling::popRegion(); // write block metadata } -// TODO(JMM): Should this live in the base class or output_utils? -void PHDF5Output::ComputeCoords_(Mesh *pm, bool face, const IndexRange &ib, - const IndexRange &jb, const IndexRange &kb, - std::vector<Real> &x, std::vector<Real> &y, - std::vector<Real> &z) { - std::size_t idx_x = 0, idx_y = 0, idx_z = 0; - - // note relies on casting of bool to int - for (auto &pmb : pm->block_list) { - for (int i = ib.s; i <= ib.e + face; ++i) { - x[idx_x++] = face ? pmb->coords.Xf<1>(i) : pmb->coords.Xc<1>(i); - } - for (int j = jb.s; j <= jb.e + face; ++j) { - y[idx_y++] = face ? pmb->coords.Xf<2>(j) : pmb->coords.Xc<2>(j); - } - for (int k = kb.s; k <= kb.e + face; ++k) { - z[idx_z++] = face ?
pmb->coords.Xf<3>(k) : pmb->coords.Xc<3>(k); - } + +void PHDF5Output::WriteCoordinates_(Mesh *pm, const IndexDomain &domain, hid_t file, + const HDF5::H5P &pl, hsize_t offset, + hsize_t max_blocks_global) const { + using namespace HDF5; + Kokkos::Profiling::pushRegion("write mesh coords"); + const IndexShape &shape = pm->GetLeafBlockCellBounds(); + const IndexRange ib = shape.GetBoundsI(domain); + const IndexRange jb = shape.GetBoundsJ(domain); + const IndexRange kb = shape.GetBoundsK(domain); + + const hsize_t num_blocks_local = pm->block_list.size(); + const hsize_t loc_offset[2] = {offset, 0}; + hsize_t loc_cnt[2] = {num_blocks_local, 1}; + hsize_t glob_cnt[2] = {max_blocks_global, 1}; + + for (const bool face : {true, false}) { + const H5G gLocations = MakeGroup(file, face ? "/Locations" : "/VolumeLocations"); + + std::vector<Real> loc_x, loc_y, loc_z; + OutputUtils::ComputeCoords(pm, face, ib, jb, kb, loc_x, loc_y, loc_z); + + loc_cnt[1] = glob_cnt[1] = (ib.e - ib.s + 1) + face; + HDF5Write2D(gLocations, "x", loc_x.data(), &loc_offset[0], &loc_cnt[0], &glob_cnt[0], + pl); + + loc_cnt[1] = glob_cnt[1] = (jb.e - jb.s + 1) + face; + HDF5Write2D(gLocations, "y", loc_y.data(), &loc_offset[0], &loc_cnt[0], &glob_cnt[0], + pl); + + loc_cnt[1] = glob_cnt[1] = (kb.e - kb.s + 1) + face; + HDF5Write2D(gLocations, "z", loc_z.data(), &loc_offset[0], &loc_cnt[0], &glob_cnt[0], + pl); } + Kokkos::Profiling::popRegion(); // write mesh coords } -// explicit template instantiation -template void PHDF5Output::WriteOutputFileImpl<false>(Mesh *, ParameterInput *, SimTime *, - SignalHandler::OutputSignal); -template void PHDF5Output::WriteOutputFileImpl<true>(Mesh *, ParameterInput *, SimTime *, - SignalHandler::OutputSignal); +void PHDF5Output::WriteLevelsAndLocs_(Mesh *pm, hid_t file, const HDF5::H5P &pl, + hsize_t offset, hsize_t max_blocks_global) const { + using namespace HDF5; + Kokkos::Profiling::pushRegion("write levels and locations"); + auto [levels, logicalLocations] =
pm->GetLevelsAndLogicalLocationsFlat(); + + // Only write levels on rank 0 since it has data for all ranks + const hsize_t num_blocks_local = pm->block_list.size(); + const hsize_t loc_offset[2] = {offset, 0}; + const hsize_t loc_cnt[2] = {(Globals::my_rank == 0) ? max_blocks_global : 0, 3}; + const hsize_t glob_cnt[2] = {max_blocks_global, 3}; + + HDF5Write1D(file, "Levels", levels.data(), &loc_offset[0], &loc_cnt[0], &glob_cnt[0], + pl); + HDF5Write2D(file, "LogicalLocations", logicalLocations.data(), &loc_offset[0], + &loc_cnt[0], &glob_cnt[0], pl); + + Kokkos::Profiling::popRegion(); // write levels and locations +} + +void PHDF5Output::WriteSparseInfo_(Mesh *pm, hbool_t *sparse_allocated, + const std::vector<std::string> &sparse_names, + hsize_t num_sparse, hid_t file, const HDF5::H5P &pl, + size_t offset, hsize_t max_blocks_global) const { + using namespace HDF5; + Kokkos::Profiling::pushRegion("write sparse info"); + + const hsize_t num_blocks_local = pm->block_list.size(); + const hsize_t loc_offset[2] = {offset, 0}; + const hsize_t loc_cnt[2] = {num_blocks_local, num_sparse}; + const hsize_t glob_cnt[2] = {max_blocks_global, num_sparse}; + + HDF5Write2D(file, "SparseInfo", sparse_allocated, &loc_offset[0], &loc_cnt[0], + &glob_cnt[0], pl); + + // write names of sparse fields as attribute, first convert to vector of const char* + std::vector<const char *> names(num_sparse); + for (size_t i = 0; i < num_sparse; ++i) + names[i] = sparse_names[i].c_str(); + + const H5D dset = H5D::FromHIDCheck(H5Dopen2(file, "SparseInfo", H5P_DEFAULT)); + HDF5WriteAttribute("SparseFields", names, dset); + Kokkos::Profiling::popRegion(); // write sparse info +} // Utility functions implemented namespace HDF5 { diff --git a/src/parthenon_manager.cpp b/src/parthenon_manager.cpp index 2329553bfe76..6815adfbd2b5 100644 --- a/src/parthenon_manager.cpp +++ b/src/parthenon_manager.cpp @@ -3,7 +3,7 @@ // Copyright(C) 2020-2023 The Parthenon collaboration // Licensed under the 3-clause BSD License, see LICENSE
file for details //======================================================================================== -// (C) (or copyright) 2020-2023. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2024. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -337,6 +337,8 @@ void ParthenonManager::RestartPackages(Mesh &rm, RestartReader &resfile) { // Double note that this also needs to be update in case // we update the HDF5 infrastructure! if (file_output_format_ver == -1) { + PARTHENON_WARN("This file output format version is deprecated and will be " + "removed in a future release."); for (int k = out_kb.s; k <= out_kb.e; ++k) { for (int j = out_jb.s; j <= out_jb.e; ++j) { for (int i = out_ib.s; i <= out_ib.e; ++i) { @@ -348,19 +350,9 @@ void ParthenonManager::RestartPackages(Mesh &rm, RestartReader &resfile) { } } else if (file_output_format_ver == 2 || file_output_format_ver == HDF5::OUTPUT_VERSION_FORMAT) { - for (int t = 0; t < Nt; ++t) { - for (int u = 0; u < Nu; ++u) { - for (int v = 0; v < Nv; ++v) { - for (int k = out_kb.s; k <= out_kb.e; ++k) { - for (int j = out_jb.s; j <= out_jb.e; ++j) { - for (int i = out_ib.s; i <= out_ib.e; ++i) { - v_h(t, u, v, k, j, i) = tmp[index++]; - } - } - } - } - } - } + OutputUtils::PackOrUnpackVar(pmb.get(), v.get(), resfile.hasGhost, index, tmp, + [&](auto index, int t, int u, int v, int k, int j, + int i) { v_h(t, u, v, k, j, i) = tmp[index]; }); } else { PARTHENON_THROW("Unknown output format version in restart file.") } diff --git a/src/pgen/default_pgen.cpp b/src/pgen/default_pgen.cpp index fc745dd008d8..d3c0ed50d164 100644 --- a/src/pgen/default_pgen.cpp +++ b/src/pgen/default_pgen.cpp @@ -3,7 +3,7 @@ // Copyright(C) 2014 James M.
Stone and other code contributors // Licensed under the 3-clause BSD License, see LICENSE file for details //======================================================================================== -// (C) (or copyright) 2020-2022. Triad National Security, LLC. All rights reserved. +// (C) (or copyright) 2020-2024. Triad National Security, LLC. All rights reserved. // // This program was produced under U.S. Government contract 89233218CNA000001 for Los // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC @@ -110,7 +110,7 @@ void MeshBlock::InitMeshBlockUserDataDefault(MeshBlock *pmb, ParameterInput *pin } //======================================================================================== -//! \fn void MeshBlock::ProblemGeneratorDefault(ParameterInput *pin) +//! \fn void MeshBlock::ProblemGeneratorDefault(MeshBlock *pmb, ParameterInput *pin) // \brief Should be used to set initial conditions. //======================================================================================== @@ -120,6 +120,13 @@ void MeshBlock::ProblemGeneratorDefault(MeshBlock *pmb, ParameterInput *pin) { return; } +//======================================================================================== +//! \fn void MeshBlock::PostInitializationDefault(MeshBlock *pmb, ParameterInput *pin) +// \brief Should be used to perform post-initialization operations. +//======================================================================================== + +void MeshBlock::PostInitializationDefault(MeshBlock *pmb, ParameterInput *pin) { return; } + +//======================================================================================== //! \fn void MeshBlock::UserWorkBeforeOutputDefault(MeshBlock *pmb, ParameterInput *pin) // \brief Function called before generating output files