NVTX v3 dynamic library segfaults during initialization on Windows #107

Open
maleadt opened this issue Nov 13, 2024 · 9 comments

@maleadt

maleadt commented Nov 13, 2024

I'm using NVTX from Julia, so I compile NVTX into a shared library as documented in https://github.com/NVIDIA/NVTX?tab=readme-ov-file#other-languages. This works fine on most platforms, but on Windows (where we build our binaries using MinGW compilers) NVTX segfaults during initialization. This reproduces in C, so it doesn't look like a Julia issue. I'm using msys2 on Windows 11, with the compilers from the mingw-w64-x86_64-toolchain package.

To generate nvToolsExt.dll, I do:

$ git clone https://github.com/NVIDIA/NVTX -b dev

$ cat >nvtx.c <<EOD
#define NVTX_EXPORT_API
#include <nvtx3/nvToolsExt.h>
#include <nvtx3/nvToolsExtCuda.h>
EOD

$ cc -g -shared -I /path/to/cuda/include -I ./c/include -o nvToolsExt.dll nvtx.c

To simulate what we do in Julia, I'm using the following simple loader:

#include <stdio.h>
#include <windows.h>

typedef void (*NvtxInitializeFunc)(void*);

int main() {
    // Load the DLL
    HMODULE hLib = LoadLibraryA("nvToolsExt.dll");
    if (hLib == NULL) {
        fprintf(stderr, "Failed to load nvToolsExt.dll. Error code: %lu\n", GetLastError());
        return 1;
    }

    // Get the function pointer
    NvtxInitializeFunc nvtxInitialize = (NvtxInitializeFunc)GetProcAddress(hLib, "nvtxInitialize");
    if (nvtxInitialize == NULL) {
        fprintf(stderr, "Failed to get nvtxInitialize function. Error code: %lu\n", GetLastError());
        FreeLibrary(hLib);
        return 1;
    }

    // Call the function with NULL parameter
    nvtxInitialize(NULL);

    // Clean up
    FreeLibrary(hLib);

    return 0;
}

$ cc -g loader.c -o loader && ./loader.exe
zsh: segmentation fault  ./loader.exe

GDB reveals it indeed fails during initialization:

(gdb) bt
#0  0x00007ffcaa134a7e in ?? ()
#1  0x00007ffcaa131a0f in nvtxInitialize (reserved=0x0) at ./c/include/nvtx3/nvtxDetail/nvtxImplCore.h:317
#2  0x00007ff60c631508 in main () at loader.c:23
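
Since the backtrace shows the crash inside nvtxInitialize itself, it can be useful to first rule out a build that simply did not export the expected entry points. The following is a minimal sketch of such a check, not part of the original report: the helper names (lib_open, lib_symbol, lib_close, count_missing_exports) are hypothetical, and the non-Windows branch using dlopen is only there so the same file builds on other platforms. It resolves symbols without calling them, so it does not trigger NVTX initialization.

```c
#include <stdio.h>

/* Thin, hypothetical wrappers so the same check builds on Windows and POSIX. */
#ifdef _WIN32
#include <windows.h>
typedef HMODULE lib_handle;
static lib_handle lib_open(const char *path) { return LoadLibraryA(path); }
static void *lib_symbol(lib_handle h, const char *name) { return (void *)GetProcAddress(h, name); }
static void lib_close(lib_handle h) { FreeLibrary(h); }
#else
#include <dlfcn.h>
typedef void *lib_handle;
static lib_handle lib_open(const char *path) { return dlopen(path, RTLD_NOW | RTLD_LOCAL); }
static void *lib_symbol(lib_handle h, const char *name) { return dlsym(h, name); }
static void lib_close(lib_handle h) { dlclose(h); }
#endif

/*
 * Returns -1 if the library cannot be loaded, otherwise the number of
 * requested names that could not be resolved (0 means all were found).
 * No resolved function is ever called.
 */
static int count_missing_exports(const char *path, const char *const *names, int n)
{
    lib_handle h = lib_open(path);
    if (h == NULL) {
        fprintf(stderr, "%s: failed to load\n", path);
        return -1;
    }

    int missing = 0;
    for (int i = 0; i < n; i++) {
        if (lib_symbol(h, names[i]) == NULL) {
            fprintf(stderr, "%s: missing export %s\n", path, names[i]);
            missing++;
        }
    }

    lib_close(h);
    return missing;
}
```

For the library built above, one might call count_missing_exports("nvToolsExt.dll", names, 3) with names set to "nvtxInitialize", "nvtxRangePushA" and "nvtxRangePop", on the assumption that NVTX_EXPORT_API exports these C API functions under those names. A result of 0 would suggest the exports are in place and the fault lies in the initialization path rather than in symbol lookup.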

x-ref JuliaGPU/NVTX.jl#37, also filed as NVIDIA bug 4262590.

@evanramos-nvidia
Collaborator

nvToolsExt.dll is deprecated as of NVTXv3. Work is in progress to create a replacement DLL for cross-language interoperability. We will keep you updated.

@maleadt
Author

maleadt commented Nov 20, 2024

nvToolsExt.dll is deprecated

I'm building my own version based on the code in this repository; it only happens to have the same name:

To generate nvToolsExt.dll, I do:

$ git clone https://github.com/NVIDIA/NVTX -b dev

$ cat >nvtx.c <<EOD
#define NVTX_EXPORT_API
#include <nvtx3/nvToolsExt.h>
#include <nvtx3/nvToolsExtCuda.h>
EOD

$ cc -g -shared -I /path/to/cuda/include -I ./c/include -o nvToolsExt.dll nvtx.c

@evanramos-nvidia
Collaborator

Thanks for your report. A fix for building with NVTX_EXPORT_API is in progress (the same task I mentioned for a new DLL).

@maleadt
Author

maleadt commented Dec 5, 2024

Any branch I could test, or isn't the work public yet?

evanramos-nvidia added this to the v3.1.1 milestone on Dec 11, 2024

@evanramos-nvidia
Collaborator

@maleadt:

Any branch I could test, or isn't the work public yet?

I've updated the dev branch with the work we've done for this if you want to try it.

@maleadt
Author

maleadt commented Feb 4, 2025

Thanks. I'd love to try it out, but building for x86_64-w64-mingw32 now fails:

[11:58:16]  ---> ${CC} -std=c99 -O2 ${CFLAGS} -shared ${LIBS} -I${prefix}/cuda/include -I${WORKSPACE}/srcdir/NVTX/c/include -o ${libdir}/libnvToolsExt.${dlext} nvtx.c
[11:58:17] /tmp/cchfcKcA.o:nvtx.c:(.data$nvtxGlobals_v3[nvtxGlobals_v3]+0x10): undefined reference to `nvtxEtiGetModuleFunctionTable_v3'
[11:58:17] /tmp/cchfcKcA.o:nvtx.c:(.data$nvtxGlobals_v3[nvtxGlobals_v3]+0x28): undefined reference to `nvtxEtiSetInjectionNvtxVersion_v3'
[11:58:17] /opt/x86_64-w64-mingw32/bin/../lib/gcc/x86_64-w64-mingw32/4.8.5/../../../../x86_64-w64-mingw32/bin/ld: /tmp/cchfcKcA.o: bad reloc address 0x28 in section `.data$nvtxGlobals_v3[nvtxGlobals_v3]'
[11:58:17] collect2: error: ld returned 1 exit status

@evanramos-nvidia
Collaborator

@maleadt Thanks for the report. I was able to reproduce the issue locally, and I pushed what I believe to be the fix to the dev branch in commit b7cc77f. Please see if that allows the build to succeed.

@maleadt
Author

maleadt commented Feb 11, 2025

Thanks, that allows the build to succeed, and fixes the issue on Windows.

Can you share the timeline for this to be part of a release? Or alternatively, how stable is the dev branch to use (and which version number will it get)?

maleadt closed this as completed on Feb 11, 2025

@evanramos-nvidia
Collaborator

v3.1.1 will be released very soon. I will reopen this ticket pending its publication.
