[VL] Need Clarification on Java/Scala to Native C++ Handover and Profiling Native C++ Files #8419

Open
ajeyabsfujitsu opened this issue Jan 3, 2025 · 11 comments
Labels
enhancement New feature or request

Comments

@ajeyabsfujitsu

Description

I am exploring the incubator-gluten repository and trying to understand how Java/Scala code interacts with the native C++ code. However, I am having difficulty identifying where and how this handover happens within the codebase.

An explanation of this process would be extremely helpful.

Java/Scala to C++ Handover - I want to know the specific parts of the codebase where Java/Scala calls are handed over to native C++ code. I’m unable to locate the key files or classes where this interaction is defined. If there are specific entry points or patterns used for this handover, please provide details.

Profiling C++ Code - I’m attempting to use perf to profile the execution and see which C++ functions or files are being called. However, the perf output only shows functions from libjvm.so or libc.so, and no Gluten- or Velox-specific C++ functions are visible. Are there additional configurations or profiling tools recommended for this?

@ajeyabsfujitsu ajeyabsfujitsu added the enhancement New feature or request label Jan 3, 2025
@FelixYBW
Contributor

FelixYBW commented Jan 3, 2025

https://github.com/apache/incubator-gluten/tree/main/cpp/core/jni is where Java calls into C++.

If Gluten is actually running, perf should show the Velox functions as hotspots.

@ajeyabsfujitsu
Author

ajeyabsfujitsu commented Jan 4, 2025

Thanks for the response.

In the perf call graph, everything is just libjvm and libc. Is there something I am missing?
[Screenshot: perf_velox]

In async-profiler, I can see the Java/Scala functions being called, but I want to know at what point native functions take over.

In the files in the cpp/core/jni folder, should I mainly look for "JNI_EXPORT" implementations?

@FelixYBW
Contributor

FelixYBW commented Jan 4, 2025

You may go through all the files in the jni folder. Any JNI call can trigger a native function.

This folder is for the Velox backend:
https://github.com/apache/incubator-gluten/tree/main/cpp/velox/jni
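
One way to survey the entry points in those folders is to scan the C++ sources for symbols that start with Java_, since JNI resolves native methods by that naming convention. The following is a minimal Python sketch and not part of Gluten; the helper name find_jni_exports and the sample source text (org.example.DemoJniWrapper) are hypothetical and only illustrate the pattern.

```python
import re

# JNI export names start with "Java_" followed by the mangled class and method.
JNI_EXPORT_RE = re.compile(r"\bJava_\w+")


def find_jni_exports(cpp_source: str) -> list:
    """Return the distinct Java_* symbol names in a C++ source, in first-seen order."""
    seen = []
    for name in JNI_EXPORT_RE.findall(cpp_source):
        if name not in seen:
            seen.append(name)
    return seen


# Hypothetical excerpt shaped like a JNI wrapper file (not copied from Gluten).
sample = """
JNIEXPORT jlong JNICALL Java_org_example_DemoJniWrapper_nativeOpen(JNIEnv* env, jobject obj) {
  return 0;
}
JNIEXPORT void JNICALL Java_org_example_DemoJniWrapper_nativeClose(JNIEnv* env, jobject obj, jlong h) {}
"""

for symbol in find_jni_exports(sample):
    print(symbol)
```

To run it over a real file, pass the file contents, for example pathlib.Path(path).read_text(). Native methods that are bound at runtime through RegisterNatives do not follow this naming convention and would not show up in such a scan.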

@zhztheplayer
Member

Java/Scala to C++ Handover

The code for C++/Java communication usually lives in files whose names are, or end with, "JniWrapper": VeloxJniWrapper.cc, ...JniWrapper.scala, etc.

"In perf call graph everything is just libjvm and libc. Is there something I am missing?"

Not sure, but this can happen when the profiler cannot find the debug symbols. Note that Gluten deletes the native library files it used from tmp/ when the JVM exits. If that is the cause, you could set spark.gluten.sql.debug.keepJniWorkspace=true to avoid the deletion.
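
To check whether the native libraries are loaded in a running executor at all, and whether their files have already been removed from disk, one option on Linux is to read /proc/<pid>/maps, where a mapping whose backing file no longer exists is marked with a trailing " (deleted)". The sketch below is an illustration under assumptions: the library name fragments "libgluten" and "libvelox", the helper name native_libs_in_maps, and the sample text are not taken from this thread.

```python
def native_libs_in_maps(maps_text, needles=("libgluten", "libvelox")):
    """Map each matching library path to True if its file is marked deleted.

    maps_text is the content of /proc/<pid>/maps; needles are assumed
    fragments of the native library file names.
    """
    found = {}
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, pathname (optional).
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a backing file
        path = fields[5].strip()
        deleted = path.endswith(" (deleted)")
        if deleted:
            path = path[: -len(" (deleted)")]
        if any(needle in path for needle in needles):
            found[path] = found.get(path, False) or deleted
    return found


# Made-up sample in the /proc/<pid>/maps layout.
sample = """\
7f10a0000000-7f10a0200000 r-xp 00000000 08:01 1311 /usr/lib/jvm/java-8/lib/server/libjvm.so
7f10b0000000-7f10b0800000 r-xp 00000000 08:01 2001 /tmp/gluten-demo/libvelox.so (deleted)
7f10c0000000-7f10c0100000 rw-p 00000000 00:00 0
"""

print(native_libs_in_maps(sample))
```

Against a live process, the input would be the text of /proc/<pid>/maps for the executor's JVM. An empty result suggests the native library was never loaded; an entry flagged as deleted suggests the profiler may have no file left on disk to read symbols from.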

@ajeyabsfujitsu
Author

@FelixYBW Thanks. I will go through that folder to try to find the calls.

@ajeyabsfujitsu
Author

@zhztheplayer Thanks for the response.

Debug symbols were installed with this command:

apt install -yq libc6-dbg libstdc++6-12-dbg openjdk-${JDK_VERSION}-dbg

Is there any other command that you are aware of?

I am trying to run TPC-H workloads.

spark.gluten.sql.debug.keepJniWorkspace=true

I tried setting this flag in tpch_parquet.sh, but I get an error:

java.util.NoSuchElementException: spark.gluten.sql.debug.keepJniWorkspaceDir

Should this flag be added somewhere else?

@VaibhavFRI

@ajeyabsfujitsu For spark.gluten.sql.debug.keepJniWorkspace=true to work, you should also set spark.gluten.sql.debug.keepJniWorkspaceDir=<path_to_dir>.
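
As an illustration of passing both settings together, the sketch below builds the corresponding --conf arguments for a spark-submit style command line. The two configuration keys are the ones named in this thread; the helper name gluten_debug_conf_args and the directory /tmp/gluten-jni-workspace are hypothetical placeholders.

```python
def gluten_debug_conf_args(workspace_dir):
    """Build --conf arguments that keep the JNI workspace in workspace_dir."""
    confs = {
        # Keep the extracted native libraries after the JVM exits.
        "spark.gluten.sql.debug.keepJniWorkspace": "true",
        # Directory to keep them in; the thread reports an error without it.
        "spark.gluten.sql.debug.keepJniWorkspaceDir": workspace_dir,
    }
    args = []
    for key, value in confs.items():
        args += ["--conf", f"{key}={value}"]
    return args


# Hypothetical directory; replace with a path that exists on the cluster nodes.
print(" ".join(gluten_debug_conf_args("/tmp/gluten-jni-workspace")))
```

The printed arguments can be appended to the spark-submit or spark-shell invocation inside a script such as tpch_parquet.sh.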

@ajeyabsfujitsu
Author

Thanks Vaibhav.

@VaibhavFRI

VaibhavFRI commented Jan 6, 2025

@zhztheplayer I tried setting the spark.gluten.sql.debug.keepJniWorkspace=true conf, but I still get only libjvm.so and no Gluten function calls. Does anything else need to be set in perf?
[Screenshot 2025-01-06 122718]

@rajatma1993

@zhztheplayer @FelixYBW

Based on your suggestions, I analyzed the "JniWrapper" file to trace the Java calls that call into C++. However, I noticed that most of the Gluten functions point to Scala files rather than Java files, which should ultimately lead to the JniWrapper.

I would like to understand the flow of Gluten functions as they transition from Java to C++. Could you please tell me which folders and files I should look at?

@brijrajk

@rajatma1993 You should check $GLUTEN_HOME/cpp/core/jni/JniWrapper.cc to understand where the handover to native calls takes place.

For example, take the function name:

Java_org_apache_gluten_vectorized_PlanEvaluatorJniWrapper_nativeCreateKernelWithIterator

The name itself tells you where the function is called from. If you split it on "_", you get:
Java, org, apache, gluten, vectorized, PlanEvaluatorJniWrapper, nativeCreateKernelWithIterator.

So the calling class is org.apache.gluten.vectorized.PlanEvaluatorJniWrapper (file PlanEvaluatorJniWrapper.java) and the native method is nativeCreateKernelWithIterator.

You can now search for that class in the Gluten code. The file is located at: $GLUTEN_HOME/gluten-data/src/main/java/org/apache/gluten/vectorized/PlanEvaluatorJniWrapper.java

Hope this helps
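
The splitting rule described above can be sketched as a small Python helper. It follows the standard JNI naming convention, in which a literal underscore inside a name is escaped as "_1" and overloaded natives carry a "__" signature suffix; the helper name demangle_jni is hypothetical, and Unicode escapes ("_0xxxx") are deliberately not handled.

```python
import re
from typing import Tuple


def demangle_jni(symbol: str) -> Tuple[str, str]:
    """Split a JNI export name into (fully qualified class name, method name)."""
    prefix = "Java_"
    if not symbol.startswith(prefix):
        raise ValueError(f"not a JNI export name: {symbol!r}")
    body = symbol[len(prefix):]
    # Overloaded natives append "__" plus a mangled signature; drop it.
    body = body.split("__", 1)[0]
    # A separator is an underscore that is not followed by a digit,
    # because "_1" stands for a literal underscore inside a name.
    parts = [p.replace("_1", "_") for p in re.split(r"_(?!\d)", body)]
    return ".".join(parts[:-1]), parts[-1]


cls, method = demangle_jni(
    "Java_org_apache_gluten_vectorized_PlanEvaluatorJniWrapper_nativeCreateKernelWithIterator"
)
print(cls)
print(method)
```

The class name returned for the example maps directly to the source path given above, with dots replaced by directory separators and ".java" appended.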
