Lift instance cause Cabal build error: unknown symbol with CFFI but Cabal repl works fine #10651

Open
yourcomrade opened this issue Dec 19, 2024 · 20 comments

Comments

@yourcomrade

Describe the bug
When building a Haskell-based Floating Point library for Clash that uses the mpfr library through the hmpfr binding, an error occurs during the cabal build process on both Windows and Linux. On Windows, the error indicates an unknown symbol (mpfr_add) from MPFR library, while on Linux, the error involves an undefined symbol (mpfr_custom_get_size_wrap) located in the chsmpfr.c file in cbits folder, preventing successful compilation. However, running cabal repl works fine without issues on both platforms.
To Reproduce
Steps to reproduce the behavior:

  1. Clone the library:
git clone https://github.com/yourcomrade/FloPoCoFloat
  2. Install the mpfr library (if you don't already have it).
    On Windows with MSYS2:
pacman -S mingw-w64-clang-x86_64-mpfr

    On Linux (Ubuntu under WSL):

sudo apt install libmpfr6
sudo apt install libmpfr-dev
  3. Switch to Error_branch:
git checkout Error_branch
  4. Build the project using cabal
  5. Observe the build log error
    Windows:
ghc-9.8.2.exe:  | D:\haskell\FloPoCoFloat2\FloPoCoFloat\dist-newstyle\build\x86_64-windows\ghc-9.8.2\FloPoCoFloat-0.1.0.0\build\Data\Number\MPFR\Arithmetic.o: unknown symbol `mpfr_add'
ghc-9.8.2.exe: Could not load Object Code D:\haskell\FloPoCoFloat2\FloPoCoFloat\dist-newstyle\build\x86_64-windows\ghc-9.8.2\FloPoCoFloat-0.1.0.0\build\Data\Number\MPFR\Arithmetic.o.

Linux

!!! systool:linker: finished in 19.90 milliseconds, allocated 10.151 megabytes
/home/minh/.ghcup/ghc/9.8.2/lib/ghc-9.8.2/bin/./ghc-9.8.2: symbol lookup error: /tmp/ghc382583_0/libghc_15.so: undefined symbol: mpfr_custom_get_size_wrap

Expected behavior
The expected behavior is for cabal build to compile the project successfully, without any unknown symbol errors for mpfr_add on Windows or mpfr_custom_get_size_wrap on Linux. If cabal build fails, I would expect cabal repl to fail as well, but that is not what happens.
System information

  • Operating system: Windows 11, Ubuntu WSL 5.15
  • cabal 3.10.3, ghc 9.8.2, gcc 11.4 in WSL

Additional context
I am building a floating point library for Clash that uses FloPoCo, a floating point core generator. The library uses the FloPoCo floating point format, defined as FoFloat, and integrates with the MPFR library via the hmpfr Haskell bindings. The goal is to provide accurate floating-point operations in Clash-generated HDL code by simulating the floating-point operations in software.
The following code causes the error during the build process:

data FoFloat (wE :: Nat) (wF :: Nat) (rndMode :: M.RoundMode) =
  FoFloat { ext :: (BitVector 2)
          , sign :: Bit
          , exponentVal :: (BitVector wE)
          , fractionalVal :: (BitVector wF)
          , rndModeVal :: (Proxy rndMode)
          }
  deriving (Generic, Typeable, Show, BitPack, Eq, NFDataX, ShowX, Lift)
deriving instance (Lift (Proxy a))
deriving instance (NFDataX (Proxy a))
deriving instance (ShowX (Proxy a))

-- Example usage that triggers the error:
ta = $(lift (1.2 :: FoFloat 4 11 M.Near))

The issue occurs during the cabal build step when compiling the FloPoCoFloat library.

  • On Windows, the error involves mpfr_add, which cannot be found in the object code during the build process.
  • On Linux, the error involves mpfr_custom_get_size_wrap, resulting in a symbol lookup failure when attempting to load shared libraries during the build.

Despite these errors, the cabal repl command works as expected on both platforms and successfully loads the proto.hs file located in src folder.
The error appears to be related to the interaction between the hmpfr Haskell bindings and the C-based MPFR library, with missing or improperly linked symbols.
The issue is consistent across both platforms but manifests with different symbols.
GitHub repository with error branch: FloPoCoFloat - Error_branch

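
As background on why a splice such as the one above can pull C symbols into the compile step: GHC runs Template Haskell splices while compiling, and `lift` has to evaluate its argument to a concrete value before it can build the expression that reconstructs it. In this issue that presumably means the FoFloat conversion of the literal 1.2, and whatever MPFR calls it makes, would run inside the compiler. The sketch below only illustrates that `lift` evaluates its argument. `Sample` is a hypothetical stand-in for FoFloat, there is no FFI involved, and `runQ` is used so the example runs as an ordinary program rather than as a splice.

```haskell
{-# LANGUAGE DeriveLift #-}

import Language.Haskell.TH (pprint, runQ)
import Language.Haskell.TH.Syntax (Lift (..))

-- Hypothetical stand-in for FoFloat: a plain type with a derived Lift instance.
data Sample = Sample Int Bool
  deriving (Show, Lift)

main :: IO ()
main = do
  -- lift inspects the evaluated value (1 + 2 is computed first) and
  -- returns an expression that rebuilds it.
  expr <- runQ (lift (Sample (1 + 2) True))
  putStrLn (pprint expr)
```

Inside a real `$( ... )` splice the same evaluation happens at compile time, so any foreign function reached while computing the value must be loadable by the compiler at that point.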
@geekosaur
Collaborator

FWIW, attempting to build here gets me

Configuring library for FloPoCoFloat-0.1.0.0..
Preprocessing library for FloPoCoFloat-0.1.0.0..
Error: cabal-3.10.3.0: can't find source for FPFloat in src,
/home/allbery/Downloads/_t/FloPoCoFloat/dist-newstyle/build/x86_64-linux/ghc-9.8.2/FloPoCoFloat-0.1.0.0/build/autogen,
/home/allbery/Downloads/_t/FloPoCoFloat/dist-newstyle/build/x86_64-linux/ghc-9.8.2/FloPoCoFloat-0.1.0.0/build/global-autogen

Error: cabal-3.10.3.0: Failed to build FloPoCoFloat-0.1.0.0 (which is required
by exe:clashi from FloPoCoFloat-0.1.0.0 and exe:clash from
FloPoCoFloat-0.1.0.0).

And indeed, I can't find FPFloat.hs in the checkout at all, nor anything prefixed FPFloat.

@yourcomrade
Author

@geekosaur have you checked out Error_branch?
[image: screenshot showing the files in Error_branch]

@geekosaur
Collaborator

I have, and I have the files you show, but that module is misnamed and will not be found by Cabal or ghc on Linux, whose filesystem is case sensitive. (It will work on Windows.)

@geekosaur
Collaborator

This also applies to proto.hs.

@yourcomrade
Author

Sorry, but what do you mean by "the module is misnamed"?

@geekosaur
Collaborator

On a case sensitive filesystem such as Linux uses, the filename must match the case of the module name. The files must be named FPFloat.hs and Proto.hs. I have renamed them locally, reproduced the issue with cabal-3.10.3.0, and am now testing with cabal 3.12.1.0.
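
The naming rule described here can be sketched in a few lines of Haskell. This is only an illustration of the convention (dots in the module name become directory separators and the case is preserved), not the lookup code Cabal or GHC actually use; `moduleToPath` and `matchesExactly` are hypothetical helper names.

```haskell
-- | Relative source path expected for a module name.
-- Dots become directory separators; letter case is kept exactly.
moduleToPath :: String -> FilePath
moduleToPath name = map dotToSlash name ++ ".hs"
  where
    dotToSlash '.' = '/'
    dotToSlash c   = c

-- | True when the name on disk matches the expected path character for
-- character, which is what a case-sensitive filesystem requires.
matchesExactly :: String -> FilePath -> Bool
matchesExactly name onDisk = moduleToPath name == onDisk

main :: IO ()
main = do
  putStrLn (moduleToPath "Data.Number.MPFR.Arithmetic")
  print (matchesExactly "Proto" "proto.hs")
  print (matchesExactly "Proto" "Proto.hs")
```

A case-insensitive filesystem would accept both spellings, which is why the mismatch only shows up on some setups.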

@yourcomrade
Author

That's interesting. I tested it in WSL Ubuntu, and it doesn't complain about that.

@geekosaur
Collaborator

geekosaur commented Dec 19, 2024

I expect it's still using the NTFS filesystem underneath so you can share files between it and Windows programs. I'm running on Linux ext4, which does care.

@yourcomrade
Author

@geekosaur are you able to reproduce the same error with cabal 3.12?

@geekosaur
Collaborator

Yes, which makes me think it's not Cabal involved at all; it's more likely something about runtime linking for TH in GHC. I'm trying with ghc-9.8.4 now.

@yourcomrade
Author

That's really strange to me. I thought that for cabal repl to work, it must build the library successfully and then load it into ghci, so if cabal build fails, cabal repl shouldn't work either.

@geekosaur
Collaborator

Looking back, actually it seems to be the system linker vs. the runtime linker, and the runtime linker (which is custom code used by ghci, unrelated to the system linker) is doing something correctly that the system linker isn't.

I failed to build with ghc-9.8.4, with an error in clash-lib pointing to a known incompatibility with earlier releases; I'd need to reinstall 9.8.3. I doubt it'll help, though. Your constraints on base are too tight to try it with 9.6.x or 9.10.x without --allow-older/--allow-newer, which could break things themselves.

@geekosaur
Collaborator

Okay, clash-lib hasn't been ported to ghc 9.10 yet, so that doesn't work. --allow-older=base allows ghc-9.6.6 to start building, at least.

@yourcomrade
Author

@geekosaur yeah, according to their GitHub repository, https://github.com/clash-lang/clash-compiler , they currently only support GHC versions up to 9.8.

@geekosaur
Collaborator

Same error with ghc-9.6.6. I think this is still clearly a GHC issue, or possibly a bug in how hmpfr tells ghc to find the C libraries it requires; Cabal just relays the information a package such as hmpfr declares to ghc.
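
To illustrate why the library information has to come from somewhere other than the Haskell source: a foreign import names a C symbol, but says nothing about which library provides it. The sketch below uses `cos` from the C math library as a stand-in for the MPFR functions, on the assumption that the math library is already available to a default GHC link on the platform in use. It is not taken from hmpfr.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign.C.Types (CDouble (..))

-- The string names a header and a C symbol. Nothing in this declaration
-- tells the linker which library contains the symbol.
foreign import ccall unsafe "math.h cos"
  c_cos :: CDouble -> CDouble

main :: IO ()
main = print (c_cos 0)
```

For a symbol such as mpfr_add, the providing library has to be declared separately in the package description, and that declaration is the information Cabal passes on to ghc.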

@yourcomrade
Author

Well, I am using hmpfr version 0.4.5, cloned directly from https://github.com/michalkonecny/hmpfr . It can build the library successfully, though. Somehow I could not install it with cabal for my library, so I included all of the files from hmpfr in my library.

@geekosaur
Collaborator

I should mention that ghci may use different glue libraries than ghc itself does, although I don't see library-for-ghci in its cabal file so that probably isn't happening here. Also the mpfr library is in a standard location, so I would expect it to be found.

I'm also not seeing hmpfr in my cabal store anywhere, which makes me wonder if it's your dependency on it that's wrong.

In any case, it's past 23:00 here and I'm heading to bed. I would pursue this with the ghc folks.

@geekosaur
Collaborator

Somehow I could not install it with cabal for my library, so I included all of the files from hmpfr in my library.

That suggests you're doing it incorrectly and it's not finding the right library as a result.

@yourcomrade
Author

yourcomrade commented Dec 19, 2024

@geekosaur, I have tested with ghc 9.10.1 on Linux WSL. It works and uses clash-prelude-1.9 and clash-lib-1.9. However, I had to relax the constraint on base by removing the ^ in base >=4.19.1.0 in the cabal file. It still gives me the same error. I then looked at this stack issue: commercialhaskell/stack#794 . So I decided to build an object file from my C code and pass it to cabal. I do something like this:

gcc -c cbits/chsmpfr.c -o cbits/chsmpfr.o
cabal build --ghc-options cbits/chsmpfr.o

Now I no longer get an undefined symbol error for mpfr_custom_get_size_wrap, which is defined in my chsmpfr.c file. Instead I get an undefined symbol error for the mpfr_set_str function, which is part of the MPFR library.

/home/minh/.ghcup/ghc/9.10.1/lib/ghc-9.10.1/bin/./ghc-9.10.1: symbol lookup error: /tmp/ghc556850_0/libghc_101.so: undefined symbol: mpfr_set_str

Interestingly, when I look in /tmp/ghc556850_0, I see two shared libraries, libghc_1.so and libghc_101.so. This doesn't happen if I don't pass the object file when building with cabal.

@geekosaur
Collaborator

Those are generated internally by ghc as part of compiling, and aren't under cabal's control. One of them is probably an FFI stub, which it probably doesn't make if it can't find the object or library being used via FFI, but you'd have to ask the GHC folks.
