Native libraries are searched in lib/polyglot project dir #11874

Merged
Akirathan merged 43 commits into develop from wip/akirathan/11483-polyglot-lib on Jan 6, 2025

@Akirathan Akirathan commented Dec 16, 2024

Fixes #11483

Pull Request Description

Native libraries used via JNI from polyglot Java code can now be located in the polyglot/lib directory of a project. The new documentation follows:

## Native libraries
Java can load native libraries using, e.g., the
[System.loadLibrary](<https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/System.html#loadLibrary(java.lang.String)>)
or
[ClassLoader.findLibrary](<https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/ClassLoader.html#findLibrary(java.lang.String)>)
methods. If a Java method loaded from the `polyglot/java` directory in project
`Proj` tries to load a native library via one of the aforementioned mechanisms,
the runtime system will look for the native library in the `polyglot/lib`
directory within the project `Proj`. The runtime system implements this by
overriding the
[ClassLoader.findLibrary](<https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/ClassLoader.html#findLibrary(java.lang.String)>)
method on the `ClassLoader` used to load the Java class.
The algorithm used to search for native libraries within the `polyglot/lib`
directory hierarchy conforms to the
[NetBeans JNI specification](https://bits.netbeans.org/23/javadoc/org-openide-modules/org/openide/modules/doc-files/api.html#jni).
Lookup of a library named `native` works roughly as follows:
- Add platform-specific prefix and/or suffix to the library name, e.g.,
`libnative.so` on Linux.
- Search for the library in the `polyglot/lib` directory.
- Search for the library in the `polyglot/lib/<arch>` directory, where `<arch>`
is the name of the architecture.
- Search for the library in the `polyglot/lib/<arch>/<os>` directory, where
`<os>` is the name of the operating system.
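
The lookup steps above can be sketched in Java as follows. This is a minimal illustration, not the actual `HostClassLoader` code: the class names `PolyglotLibLookup` and `ProjectLibClassLoader` are hypothetical, and using the raw `os.arch` and `os.name` system properties as `<arch>` and `<os>` is an assumption (a real implementation may normalize them).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;

/** Illustrative only: hypothetical names, not the actual Enso HostClassLoader code. */
final class PolyglotLibLookup {
  private PolyglotLibLookup() {}

  /** Directories to search, in the order of the steps above. */
  static List<Path> candidateDirs(Path polyglotLib, String arch, String os) {
    return List.of(
        polyglotLib,
        polyglotLib.resolve(arch),
        polyglotLib.resolve(arch).resolve(os));
  }

  /** Returns the first existing library file, or empty if none is found. */
  static Optional<Path> find(Path polyglotLib, String libName, String arch, String os) {
    // Adds the platform-specific prefix/suffix, e.g. "native" -> "libnative.so" on Linux.
    String fileName = System.mapLibraryName(libName);
    for (Path dir : candidateDirs(polyglotLib, arch, os)) {
      Path candidate = dir.resolve(fileName);
      if (Files.isRegularFile(candidate)) {
        return Optional.of(candidate.toAbsolutePath());
      }
    }
    return Optional.empty();
  }
}

/** A class loader that answers System.loadLibrary calls made by classes it loaded. */
final class ProjectLibClassLoader extends ClassLoader {
  private final Path polyglotLib;

  ProjectLibClassLoader(Path polyglotLib, ClassLoader parent) {
    super(parent);
    this.polyglotLib = polyglotLib;
  }

  @Override
  protected String findLibrary(String libName) {
    // Assumption: raw system property values serve as <arch> and <os>.
    String arch = System.getProperty("os.arch");
    String os = System.getProperty("os.name");
    // Returning null lets the JVM fall back to java.library.path.
    return PolyglotLibLookup.find(polyglotLib, libName, arch, os)
        .map(Path::toString)
        .orElse(null);
  }
}
```

`findLibrary` must return an absolute path or `null`, which is why the sketch converts the match with `toAbsolutePath()` before returning it.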

Important Notes

Standard.Image depends on opencv.jar which contains all native libraries for all platforms:

> tree opencv-4.7.0-0/nu/pattern/opencv
opencv-4.7.0-0/nu/pattern/opencv/
├── linux
│   ├── ARMv7
│   │   ├── libopencv_java470.so
│   │   └── README.md
│   ├── ARMv8
│   │   ├── libopencv_java470.so
│   │   └── README.md
│   ├── x86_32
│   │   └── README.md
│   └── x86_64
│       ├── libopencv_java470.so
│       └── README.md
├── osx
│   ├── ARMv8
│   │   ├── libopencv_java470.dylib
│   │   └── README.md
│   └── x86_64
│       ├── libopencv_java470.dylib
│       └── README.md
└── windows
    ├── x86_32
    │   ├── opencv_java470.dll
    │   └── README.md
    └── x86_64
        ├── opencv_java470.dll
        └── README.md

12 directories, 15 files

Together, the native libraries take up 352 MB, but we only need a single one. For example:

> du -sh opencv-4.7.0-0/nu/pattern/opencv
352M	opencv-4.7.0-0/nu/pattern/opencv
> du -sh opencv-4.7.0-0/nu/pattern/opencv/linux/x86_64/
62M	opencv-4.7.0-0/nu/pattern/opencv/linux/x86_64/

In this PR, sbt extracts all the native libraries from opencv.jar and puts them into the Standard/Image/.../polyglot/lib directory. In subsequent PRs, we may want to drop all the native libraries for different platforms.

Native image building modification

EnsoLibraryFeature, which is called during the native image build, scans all the native libraries inside the standard libraries and copies them next to the generated executable. That way, NI will find the library without HostClassLoader and without changing the java.library.path system property.
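
A minimal sketch of the "scan and copy" step, written as a plain Java helper. This is not the actual EnsoLibraryFeature code and does not use the GraalVM Feature API; the class name `NativeLibCopier` and the extension-based detection of native libraries are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Illustrative only: hypothetical helper, not the actual EnsoLibraryFeature code. */
final class NativeLibCopier {
  private NativeLibCopier() {}

  /** Assumption: a native library is recognized by its file extension. */
  static boolean isNativeLib(Path file) {
    String name = file.getFileName().toString();
    return name.endsWith(".so") || name.endsWith(".dylib") || name.endsWith(".dll");
  }

  /** Copies every native library found under libRoot directly into targetDir. */
  static List<Path> copyNativeLibs(Path libRoot, Path targetDir) {
    try (Stream<Path> files = Files.walk(libRoot)) {
      List<Path> libs =
          files
              .filter(Files::isRegularFile)
              .filter(NativeLibCopier::isNativeLib)
              .collect(Collectors.toList());
      Files.createDirectories(targetDir);
      List<Path> copied = new ArrayList<>();
      for (Path lib : libs) {
        // The target is flat: only the file name is kept, not the source subdirectory.
        Path target = targetDir.resolve(lib.getFileName());
        Files.copy(lib, target, StandardCopyOption.REPLACE_EXISTING);
        copied.add(target);
      }
      return copied;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Because the target directory is flat, libraries with the same file name for different architectures would overwrite each other in this sketch; a real implementation would need to select only the libraries for the platform being built.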

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • macOS engine tests succeed on Image_Tests on native image
  • The documentation has been updated, if necessary.
  • Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
  • All code follows the Scala, Java, TypeScript, and Rust style guides. In case you are using a language not listed above, follow the Rust style guide.
  • Unit tests have been written where possible.
  • If meaningful changes were made to logic or tests affecting Enso Cloud integration in the libraries,
    or the Snowflake database integration, a run of the Extra Tests has been scheduled.
    • If applicable, it is suggested to paste a link to a successful run of the Extra Tests.

@Akirathan Akirathan self-assigned this Dec 16, 2024
@Akirathan Akirathan added the CI: Clean build required CI runners will be cleaned before and after this PR is built. label Dec 16, 2024
@Akirathan Akirathan changed the title Native libraries are search in lib/polyglot project dir Native libraries are searched in lib/polyglot project dir Dec 17, 2024
Akirathan commented Dec 17, 2024

It was a mistake to try to create a different HostClassLoader for each project. The underlying exception of https://github.com/enso-org/enso/actions/runs/12372797379/job/34531804458?pr=11874#step:7:5854 is NoClassDefFoundError: org/enso/base/polyglot/Polyglot_Utils, thrown from

Object converted = Polyglot_Utils.convertPolyglotValue(v);

The Column class was loaded by a HostClassLoader for the Standard.Table pkg, whereas Polyglot_Utils was loaded by a HostClassLoader for the Base pkg. Let's just stick with a single HostClassLoader that overrides findLibrary.

Reverting in 1649ddf and 05851f5


Akirathan commented Dec 23, 2024

The native image build of engine-runner correctly registers native libraries even on Windows - https://github.com/enso-org/enso/actions/runs/12470087920/job/34804596736?pr=11874#step:7:1963. The Image_Tests for Windows on native image succeed.


@Akirathan

What is the size of the bin/enso native executable on your branch? Does it go down to ~200MB? I.e., has the effect of

* [Including opencv native libraries as resources in enso runner binary #11807](https://github.com/enso-org/enso/pull/11807)

been reverted?

Size of the enso binary on develop: 266 MB; size on this PR: 204 MB. So yes, libopencv.so is not included in the binary.


@JaroslavTulach JaroslavTulach left a comment


Ideally we should not be doing anything special in EnsoLibraryFeature.

assert dir.exists() && dir.isDirectory();
nativeLibPaths.add(dir.getAbsolutePath());
var current = System.getProperty("java.library.path");
RuntimeSystemProperties.register(
Member

I don't think this is necessary, nor is it a good idea.

Please remember that the NI build runs at build time, while the .so and .dll files are to be located at runtime. The paths are going to be different.

Can you tell me: What was failing and why you think this change is helpful? We don't need anything like this to load .so for enso_parser...

Member Author

For enso_parser, we have full control over explicit search for the library in Parser$Worker$initializeLibraries - in that method, we try to explicitly load the enso_parser native library first with System.loadLibrary, and then with System.load.
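
A minimal sketch of this "loadLibrary first, then load" pattern. It is not the actual Parser code; the class name `NativeLoader`, the method name, and the `explicitDir` parameter are hypothetical.

```java
import java.nio.file.Path;

/** Illustrative only: hypothetical helper, not the actual enso_parser loading code. */
final class NativeLoader {
  private NativeLoader() {}

  /** Tries System.loadLibrary first, then System.load from an explicit directory. */
  static boolean loadWithFallback(String libName, Path explicitDir) {
    try {
      // Resolved via the caller's class loader (findLibrary) or java.library.path.
      System.loadLibrary(libName);
      return true;
    } catch (UnsatisfiedLinkError e) {
      // Fall through to the explicit path below.
    }
    // System.load requires an absolute path to the platform-specific file name.
    Path file = explicitDir.resolve(System.mapLibraryName(libName)).toAbsolutePath();
    try {
      System.load(file.toString());
      return true;
    } catch (UnsatisfiedLinkError e) {
      return false;
    }
  }
}
```

This only works when the calling code controls the load site, which is the difference from the opencv case.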

For opencv, we don't have full control. Its native library opencv_java470 is loaded in the static initializer of nu.pattern.OpenCV with System.loadLibrary("opencv_java470"). The OpenCV class is used by classes in our std-bits/image. In JVM mode, all classes from std-bits/image (classes in package org.enso.image) are loaded by HostClassLoader. Therefore, the OpenCV class is also loaded by HostClassLoader, and System.loadLibrary("opencv_java470") delegates to HostClassLoader.findLibrary. In native image, there is no HostClassLoader, so we have to handle the System.loadLibrary call inside OpenCV differently.

Including the path to the native library in RuntimeSystemProperties at NI build time was the only solution I could think of. It points to a path like distribution/lib/Standard/Image/0.0.0-dev/polyglot/lib/... where the native libraries are located. All the tests currently pass because we test NI only via the command line and don't try to package the NI inside the AppImage. It is definitely not the most robust solution. I am open to any alternatives.

TL;DR: Is there any way to hook into System.loadLibrary calls at runtime in NI other than setting java.library.path via RuntimeSystemProperties during the NI build?

@JaroslavTulach JaroslavTulach Dec 27, 2024

In native image, there is no HostClassLoader,

Really? How come there is no HostClassLoader? Of course it cannot load any new classes, but the class loader can/should still be there...

Member

This is an example of the use of HostClassLoader. Still some problems there, but this way we should be able to get HostClassLoader "into the game".

Member Author

I have tried many times to get HostClassLoader "into the game", without success. See #11874 (comment). The alternative solution mentioned in #11874 (comment) is much easier and seems to work. It does not modify the NI configuration at all.

Member

Copying the libraries seems like a reasonable and portable fix. Let's use it for now and keep the findLibrary approach for later (for example #7082).


@JaroslavTulach JaroslavTulach left a comment


The build time vs. runtime classpath issue has to be solved.

Akirathan commented Dec 31, 2024

After many failed attempts to enforce loading of native libraries via HostClassLoader.findLibrary at runtime in NI mode, my latest attempt is at 7bd6ed6 on branch https://github.com/enso-org/enso/tree/wip/akirathan/11483-polyglot-lib-2. When NI is built on that branch, enso --run test/Image_Tests fails with:

$ enso --run test/Image_Tests
[enso.py] Running /home/pavel/dev/enso/built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/bin/enso --run test/Image_Tests
[WARN] [2024-12-31T17:55:34+01:00] [enso.org.enso.interpreter.runtime.EnsoContext] Initializing the context in a different working directory than the one containing the project root. This may lead to relative paths not behaving as advertised by `File.new`. Please run the engine inside of `/home/pavel/dev/enso/test` directory.
[HostClassLoader] loadClass: org.enso.base.Environment_Utils
[HostClassLoader] loadClass: org.enso.image.data.Matrix
[HostClassLoader] {3} All loaded classes: []
[HostClassLoader] Class 'org.enso.image.data.Matrix' loaded by class loader null
[HostClassLoader] ret.getProtectionDomain().getCodeSource().getLocation() = file:/home/pavel/dev/enso/built-distribution/enso-engine-0.0.0-dev-linux-amd64/enso-0.0.0-dev/bin/enso
Execution finished with an error: org.opencv.core.Mat.n_zeros(III)J [symbol: Java_org_opencv_core_Mat_n_1zeros or Java_org_opencv_core_Mat_n_1zeros__III]
        at <java> org.graalvm.nativeimage.builder/com.oracle.svm.core.jni.access.JNINativeLinkage.getOrFindEntryPoint(JNINativeLinkage.java:152)
        at <java> org.graalvm.nativeimage.builder/com.oracle.svm.core.jni.JNIGeneratedMethodSupport.nativeCallAddress(JNIGeneratedMethodSupport.java:54)
        at <java> org.opencv.core.Mat.n_zeros(Native Method)
        at <java> org.opencv.core.Mat.zeros(Mat.java:733)
        at <java> org.enso.image.data.Matrix.zeros(Matrix.java:22)
        at <enso> Matrix.type.zeros<arg-1>(Matrix.enso:42:23-61)
        at <enso> Matrix.type.zeros(Matrix.enso:42:9-62)
        at <enso> Matrix_Spec.add_specs.Matrix_Spec.add_specs(src/Data/Matrix_Spec.enso:17:17-32)
        at <enso> case_branch(Suite.enso:37:17-32)
        at <enso> Suite_Builder.group(Suite.enso:35-42)
        at <enso> Matrix_Spec.add_specs(src/Data/Matrix_Spec.enso:16-115)
        at <enso> Main.main.suite(src/Main.enso:12:9-43)
        at <enso> Test.build(Test.enso:24:9-33)
        at <enso> Main.main(src/Main.enso:10-13)

Akirathan commented Dec 31, 2024

A different idea to try: EnsoLibraryFeature will discover all native libs from distribution/lib/Standard/... and copy them next to the binary generated by NI. The native libraries will then sit next to the other native libraries that NI produces by default, like libawt.so. That way, we don't need to change the NI configuration at all; it should be enough to change EnsoLibraryFeature.

Done in e6956db


@JaroslavTulach JaroslavTulach left a comment


I am glad to see the copying of native libs next to enso binary working. It avoids hard coding the paths of the build machine and seems like a good solution for now.

@Akirathan Akirathan merged commit 7658faf into develop Jan 6, 2025
48 checks passed
@Akirathan Akirathan deleted the wip/akirathan/11483-polyglot-lib branch January 6, 2025 10:30
Labels
CI: Clean build required CI runners will be cleaned before and after this PR is built.
Development

Successfully merging this pull request may close these issues.

Wasting 100MB by Standard.Image: Implement polyglot/lib
3 participants