ex_close fails if file mode is EX_READ or EX_CLOBBER in parallel #467

Open
bourdin opened this issue Jul 3, 2024 · 8 comments

@bourdin
Contributor

bourdin commented Jul 3, 2024

This sequence (trivial modification of testrd_par.c):

exoid = ex_open_par("test.exo",     /* filename path */
                    EX_READ,        /* access mode = READ */
                    &CPU_word_size, /* CPU word size */
                    &IO_word_size,  /* IO word size */
                    &version,       /* ExodusII library version */
                    mpi_comm, mpi_info);
error = ex_close(exoid);

produces the following error:

Exodus Library Warning/Error: [ex_close] in file 'test.exo'
        ERROR: failed to close file id 65536
        NetCDF: Write to read only

If the file is opened with ex_open instead, the code runs fine.

@gsjaardema
Member

I am able to run this with no errors on 1, 2, 4, and 8 ranks. Can you provide more information as to what version you are using, how you are compiling/running...

@bourdin
Contributor Author

bourdin commented Jul 8, 2024

Strange. I get this error for any processor count. The exodus libraries are compiled by PETSc. My initial test was with v2022-08-01, but I get the same error with the most recent tagged release.
I am running on an ARM Mac with gcc-14 and mpich from Homebrew. The configure command for exodus is

-DCMAKE_INSTALL_PREFIX=/opt/HPC/petsc-sarah/sonoma-gcc14.1-arm64-g -DCMAKE_INSTALL_NAME_DIR:STRING="/opt/HPC/petsc-sarah/sonoma-gcc14.1-arm64-g/lib" -DCMAKE_INSTALL_LIBDIR:STRING="lib" -DCMAKE_VERBOSE_MAKEFILE=1 -DCMAKE_BUILD_TYPE=Debug -DCMAKE_AR="/usr/bin/ar" -DCMAKE_C_COMPILER="mpicc" -DMPI_C_COMPILER="/opt/homebrew/bin/mpicc" -DCMAKE_RANLIB=/usr/bin/ranlib -DCMAKE_C_FLAGS:STRING="-Wimplicit-function-declaration -Wunused -Wuninitialized -fPIC -g -O0" -DCMAKE_C_FLAGS_DEBUG:STRING="-Wimplicit-function-declaration -Wunused -Wuninitialized -fPIC -g -O0" -DCMAKE_C_FLAGS_RELEASE:STRING="-Wimplicit-function-declaration -Wunused -Wuninitialized -fPIC -g -O0" -DCMAKE_CXX_COMPILER="mpicxx" -DMPI_CXX_COMPILER="/opt/homebrew/bin/mpicxx" -DCMAKE_CXX_FLAGS:STRING="-fno-stack-check -g -O0 -fPIC" -DCMAKE_CXX_FLAGS_DEBUG:STRING="-fno-stack-check -g -O0 -fPIC" -DCMAKE_CXX_FLAGS_RELEASE:STRING="-fno-stack-check -g -O0 -fPIC" -DCMAKE_Fortran_COMPILER="mpif90" -DMPI_Fortran_COMPILER="/opt/homebrew/bin/mpif90" -DCMAKE_Fortran_FLAGS:STRING="-ffree-line-length-none -fallow-argument-mismatch -Wunused -Wuninitialized -fPIC -g -O0 -fallow-argument-mismatch" -DCMAKE_Fortran_FLAGS_DEBUG:STRING="-ffree-line-length-none -fallow-argument-mismatch -Wunused -Wuninitialized -fPIC -g -O0 -fallow-argument-mismatch" -DCMAKE_Fortran_FLAGS_RELEASE:STRING="-ffree-line-length-none -fallow-argument-mismatch -Wunused -Wuninitialized -fPIC -g -O0 -fallow-argument-mismatch" -DBUILD_SHARED_LIBS:BOOL=ON -DBUILD_STATIC_LIBS:BOOL=OFF -DPYTHON_EXECUTABLE:PATH=/opt/homebrew/opt/python@3.12/bin/python3.12 -DPythonInterp_FIND_VERSION:STRING=3.12 -DACCESSDIR:PATH=/opt/HPC/petsc-sarah/sonoma-gcc14.1-arm64-g -DCMAKE_INSTALL_RPATH:PATH=/opt/HPC/petsc-sarah/sonoma-gcc14.1-arm64-g/lib -DSeacas_ENABLE_SEACASExodus:BOOL=ON -DSeacas_ENABLE_Fortran:BOOL=ON -DSeacas_ENABLE_SEACASExoIIv2for32:BOOL=ON -DSeacas_ENABLE_SEACASExoIIv2for:BOOL=ON -DSeacas_ENABLE_SEACASExodus_for:BOOL=ON 
-DSEACASProj_SKIP_FORTRANCINTERFACE_VERIFY_TEST:BOOL=ON -DSeacas_ENABLE_SEACASExodiff:BOOL=OFF -DSeacas_ENABLE_SEACASExotxt:BOOL=OFF -DTPL_ENABLE_Matio:BOOL=OFF -DTPL_ENABLE_Netcdf:BOOL=ON -DTPL_ENABLE_Pnetcdf:BOOL=ON -DTPL_Netcdf_Enables_PNetcdf:BOOL=ON -DTPL_ENABLE_MPI:BOOL=ON -DTPL_ENABLE_Pamgen:BOOL=OFF -DTPL_ENABLE_CGNS:BOOL=OFF -DTPL_ENABLE_fmt=OFF -DNetCDF_DIR:PATH=/opt/HPC/petsc-sarah/sonoma-gcc14.1-arm64-g -DHDF5_DIR:PATH=/opt/HPC/petsc-sarah/sonoma-gcc14.1-arm64-g -DPnetcdf_LIBRARY_DIRS:PATH=/opt/HPC/petsc-sarah/sonoma-gcc14.1-arm64-g/lib -DPnetcdf_INCLUDE_DIRS:PATH=/opt/HPC/petsc-sarah/sonoma-gcc14.1-arm64-g/include -DSEACASExodus_ENABLE_SHARED:BOOL=ON -DCMAKE_SHARED_LINKER_FLAGS:STRING="-Wl,-rpath,/opt/HPC/petsc-sarah/sonoma-gcc14.1-arm64-g/lib -L/opt/HPC/petsc-sarah/sonoma-gcc14.1-arm64-g/lib -lnetcdf -Wl,-rpath,/opt/HPC/petsc-sarah/sonoma-gcc14.1-arm64-g/lib -L/opt/HPC/petsc-sarah/sonoma-gcc14.1-arm64-g/lib -lpnetcdf -Wl,-rpath,/opt/homebrew/Cellar/mpich/4.2.1/lib -Wl,-rpath,/opt/homebrew/Cellar/mpich/4.2.1/lib -L/opt/homebrew/Cellar/mpich/4.2.1/lib -lmpifort -lmpi -lpmpi -lgfortran -Wl,-rpath,/opt/homebrew/Cellar/gcc/14.1.0_1/lib/gcc/current/gcc/aarch64-apple-darwin23/14 -Wl,-rpath,/opt/homebrew/Cellar/gcc/14.1.0_1/lib/gcc/current/gcc/aarch64-apple-darwin23/14 -L/opt/homebrew/Cellar/gcc/14.1.0_1/lib/gcc/current/gcc/aarch64-apple-darwin23/14 -Wl,-rpath,/opt/homebrew/Cellar/gcc/14.1.0_1/lib/gcc/current/gcc -Wl,-rpath,/opt/homebrew/Cellar/gcc/14.1.0_1/lib/gcc/current/gcc -L/opt/homebrew/Cellar/gcc/14.1.0_1/lib/gcc/current/gcc -Wl,-rpath,/opt/homebrew/Cellar/gcc/14.1.0_1/lib/gcc/current -Wl,-rpath,/opt/homebrew/Cellar/gcc/14.1.0_1/lib/gcc/current -L/opt/homebrew/Cellar/gcc/14.1.0_1/lib/gcc/current -lemutls_w -lheapt_w -lgfortran -lquadmath -Wl,-rpath,/opt/homebrew/Cellar/gcc/14.1.0_1/lib/gcc/current/gcc/aarch64-apple-darwin23/14 -Wl,-rpath,/opt/homebrew/Cellar/gcc/14.1.0_1/lib/gcc/current/gcc -Wl,-rpath,/opt/homebrew/Cellar/gcc/14.1.0_1/lib/gcc/current 
-Wl,-rpath,/opt/HPC/petsc-sarah/sonoma-gcc14.1-arm64-g/lib -L/opt/HPC/petsc-sarah/sonoma-gcc14.1-arm64-g/lib -lhdf5_hl -lhdf5 -Wl,-rpath,/opt/HPC/petsc-sarah/sonoma-gcc14.1-arm64-g/lib -L/opt/HPC/petsc-sarah/sonoma-gcc14.1-arm64-g/lib -lz "

and the build command is

mpicc -Wl,-search_paths_first -Wl,-no_compact_unwind -Wl,-no_warn_duplicate_libraries -Wimplicit-function-declaration -Wunused -Wuninitialized -fPIC -g3 -O0  -I/opt/HPC/petsc-sarah/include -I/opt/HPC/petsc-sarah/sonoma-gcc14.1-arm64-g/include -I/opt/X11/include      test_close.c  -Wl,-rpath,/opt/HPC/petsc-sarah/sonoma-gcc14.1-arm64-g/lib -L/opt/HPC/petsc-sarah/sonoma-gcc14.1-arm64-g/lib -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -Wl,-rpath,/opt/homebrew/Cellar/mpich/4.2.1/lib -L/opt/homebrew/Cellar/mpich/4.2.1/lib -Wl,-rpath,/opt/homebrew/Cellar/gcc/14.1.0_1/lib/gcc/current/gcc/aarch64-apple-darwin23/14 -L/opt/homebrew/Cellar/gcc/14.1.0_1/lib/gcc/current/gcc/aarch64-apple-darwin23/14 -Wl,-rpath,/opt/homebrew/Cellar/gcc/14.1.0_1/lib/gcc/current/gcc -L/opt/homebrew/Cellar/gcc/14.1.0_1/lib/gcc/current/gcc -Wl,-rpath,/opt/homebrew/Cellar/gcc/14.1.0_1/lib/gcc/current -L/opt/homebrew/Cellar/gcc/14.1.0_1/lib/gcc/current -lpetsc -llapack -lblas -lexoIIv2for32 -lexodus -lnetcdf -lpnetcdf -lhdf5_hl -lhdf5 -lz -lX11 -lmpifort -lmpi -lpmpi -lgfortran -lemutls_w -lheapt_w -lgfortran -lquadmath -lc++ -o test_close

The full listing of my example is

#include "exodusII.h"
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
  MPI_Comm mpi_comm = MPI_COMM_WORLD;
  MPI_Info mpi_info = MPI_INFO_NULL;

  float  version;

  int CPU_word_size = 0; /* sizeof(float) */
  int IO_word_size  = 0; /* use what is stored in file */
  int exoid,error;

  ex_opts(EX_VERBOSE | EX_ABORT);

  /* Initialize MPI. */
  MPI_Init(&argc, &argv);



  exoid = ex_open("test.exo",                /* filename path */
                          EX_READ,           /* access mode = READ */
                          &CPU_word_size, /* CPU word size */
                          &IO_word_size,    /* IO word size */
                          &version);      /* ExodusII library version */

  error = ex_close(exoid);
  printf("\nafter ex_close, error = %3d\n", error);

  /* open EXODUS II files */
  exoid = ex_open_par("test.exo",        /* filename path */
                          EX_READ,           /* access mode = READ */
                          &CPU_word_size, /* CPU word size */
                          &IO_word_size,    /* IO word size */
                          &version,       /* ExodusII library version */
                          mpi_comm, mpi_info);
  error = ex_close(exoid);
  printf("\nafter ex_close, error = %3d\n", error);

  MPI_Finalize();
  return 0;
}

@gsjaardema
Member

Not sure what is wrong. I compiled and ran the code shown above and get this:

13:50 $ mpicc -I../include -L../lib test.c -lexodus -Wl,-rpath ../lib -o test-close

13:50 $ mpiexec -np 4 ./test-close
after ex_close, error =   0
after ex_close, error =   0
after ex_close, error =   0
after ex_close, error =   0
after ex_close, error =   0
after ex_close, error =   0
after ex_close, error =   0
after ex_close, error =   0

✔ ~/src/seacas-parallel/build [master {origin/master}|✚ ⚑ ]

@gsjaardema
Member

I don't know if any of the extra libraries are conflicting somehow... You seem to be adding in some Fortran-related exodus libraries and X11, which aren't needed...

-I/opt/X11/include 
-Wl,-rpath,/opt/X11/lib 
-L/opt/X11/lib 
-lpetsc 
-llapack 
-lblas 
-lexoIIv2for32 
-lX11
-lmpifort 
-lgfortran 
-lemutls_w 
-lheapt_w 
-lgfortran 
-lquadmath 
-lc++ 

Those might be needed for your application in general and you are just trying to get a small example to show the bug you are seeing...

@gsjaardema
Member

I'm not sure what else to suggest that you try for this. Do the seacas tests work, or are they not being built...

@bourdin
Contributor Author

bourdin commented Jul 11, 2024

I am quite lost here...

I have upgraded exodus and pnetcdf to their latest versions. I rebuilt exodusII with tests and they all pass, except for the python ones. As far as I can see, however, none of the tests cover ex_open_par.
Here is what I added to my cmake command: -DSEACASExodus_ENABLE_TESTS:BOOL=ON -DSeacas_ENABLE_TESTS:BOOL=ON

I also removed all extra libraries and compiled with
mpicc -I ${PETSC_DIR}/${PETSC_ARCH}/include/ -Wl,-rpath,${PETSC_DIR}/${PETSC_ARCH}/lib -L${PETSC_DIR}/${PETSC_ARCH}/lib -lexodus -lnetcdf -lpnetcdf -lhdf5_hl -lhdf5 -lz -o testclose testclose.c

I can reproduce this behaviour on both a macOS and a Linux box.

@gsjaardema
Member

Not ignoring this issue, but I have not yet been able to reproduce the behavior...

@bourdin
Contributor Author

bourdin commented Aug 28, 2024

I also have not had a chance to get into this.
