Provide convenience method for loading language library #23
Comments
The plan is to eventually integrate those bindings into the parsers (see tree-sitter/tree-sitter-java#182).
But that is specifically for tree-sitter-java, right? That would certainly be useful, but I was thinking of a more general solution for all parsers, e.g. Python, JSON, ... since they all have a
The CLI will generate bindings for all parsers like it does for other languages.
Ah, I think I misunderstood you. Is the plan to generate Java bindings for all parsers, and the tree-sitter-java one was just an example? That would be great then! But would it make sense nonetheless to add a generic

I was thinking of something like this:

    public final class Language {
        /**
         * Loads a language using the given symbol lookup for the native library.
         * For example:
         * {@snippet lang=java :
         * Path pathToLibrary = Path.of("libtree-sitter-python.so");
         * SymbolLookup libraryLookup = SymbolLookup.libraryLookup(pathToLibrary, Arena.ofAuto());
         * Language language = Language.loadLanguage(libraryLookup, "python");
         * }
         *
         * @throws IllegalArgumentException If the Tree-sitter language function cannot be found using the symbol lookup
         */
        public static Language loadLanguage(SymbolLookup symbolLookup, String languageName) throws IllegalArgumentException {
            String functionName = "tree_sitter_" + languageName;
            MemorySegment functionAddress = symbolLookup.find(functionName)
                .orElseThrow(() -> new IllegalArgumentException("Language function '%s' not found".formatted(functionName)));
            var voidPtr = ValueLayout.ADDRESS.withTargetLayout(MemoryLayout.sequenceLayout(Long.MAX_VALUE, ValueLayout.JAVA_BYTE));
            var funcDesc = FunctionDescriptor.of(voidPtr);
            var function = Linker.nativeLinker().downcallHandle(functionAddress, funcDesc);
            MemorySegment languagePointer;
            try {
                languagePointer = ((MemorySegment) function.invokeExact()).asReadOnly();
            } catch (Throwable t) {
                throw new RuntimeException("Failed to call language function", t);
            }
            return new Language(languagePointer);
        }

        /**
         * Creates a new instance from the given language pointer.
         *
         * <p>Normally you don't have to obtain the language pointer yourself. Instead, you can either use the
         * generated Java bindings for a parser, for example:
         * {@snippet lang=java :
         * var pointer = TreeSitterPython.language();
         * Language language = new Language(pointer);
         * }
         * Or you can use {@link #loadLanguage(SymbolLookup, String)} to obtain a {@code Language} instance.
         *
         * @implNote It is up to the caller to ensure that the pointer is valid.
         *
         * @throws IllegalArgumentException If the language version is incompatible.
         */
        public Language(MemorySegment address) {
            // ...
        }

        // ...
    }

The Javadoc here intentionally refers to tree-sitter-python to reduce confusion and to indicate that it works with any parser; otherwise a user might confuse tree-sitter-java with java-tree-sitter / jtreesitter, or think this jtreesitter only works with the Java parser.
Only until the bindings are autogenerated, at which point it'll be deprecated.
(Please correct me if anything of the following is wrong)

If I understand it correctly, for all parser implementations there is always a `tree_sitter_<lang>` function, and it always has the same signature.

Currently jtreesitter only provides a `Language(MemorySegment)` constructor, so you have to generate boilerplate code which looks up the `tree_sitter_<lang>` function and invokes it (as done in the test code). This can be an obstacle for new users of jtreesitter because they either have to be a bit familiar with `java.lang.foreign`, or blindly copy code they don't understand.

It would be useful if `Language` provided a convenience method for this, for example:

The user could then easily use `SymbolLookup#libraryLookup` to load the library and then use that `Language#loadLanguage` method.

If you want I can try to create a proof-of-concept PR for this.
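
For readers unfamiliar with `java.lang.foreign`, the lookup-and-invoke boilerplate described above can be tried out without any Tree-sitter library. The following is only a minimal sketch under stated assumptions: it needs Java 22 or newer, the class `PointerFunctionCaller` and both of its methods are hypothetical names that are not part of jtreesitter, and the C standard library function `localeconv` is used purely as a stand-in because, like a `tree_sitter_<lang>` function, it takes no arguments and returns a pointer. It also assumes `localeconv` is visible through the linker's default lookup, which should hold on common platforms.

```java
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SymbolLookup;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

/** Hypothetical helper for illustration only; not part of jtreesitter. Requires Java 22+. */
final class PointerFunctionCaller {
    private PointerFunctionCaller() {}

    /** Builds the entry point name used in this thread, e.g. "python" becomes "tree_sitter_python". */
    static String languageFunctionName(String languageName) {
        return "tree_sitter_" + languageName;
    }

    /**
     * Looks up a native function that takes no arguments and returns a pointer,
     * invokes it, and returns the raw pointer as a zero-length MemorySegment.
     */
    static MemorySegment callPointerFunction(SymbolLookup lookup, String functionName) {
        MemorySegment address = lookup.find(functionName)
                .orElseThrow(() -> new IllegalArgumentException(
                        "Function '%s' not found".formatted(functionName)));
        // No parameters, one pointer-sized return value.
        FunctionDescriptor descriptor = FunctionDescriptor.of(ValueLayout.ADDRESS);
        MethodHandle handle = Linker.nativeLinker().downcallHandle(address, descriptor);
        try {
            // invokeExact needs the call-site type to match exactly, hence the cast.
            return (MemorySegment) handle.invokeExact();
        } catch (Throwable t) {
            throw new RuntimeException("Failed to call '%s'".formatted(functionName), t);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a parser library: the C standard library via the default lookup.
        SymbolLookup standardLibrary = Linker.nativeLinker().defaultLookup();
        MemorySegment pointer = callPointerFunction(standardLibrary, "localeconv");
        System.out.println("Got a non-null pointer: " + !pointer.equals(MemorySegment.NULL));
        System.out.println("Entry point for python: " + languageFunctionName("python"));
    }
}
```

With a real parser, the lookup would instead come from `SymbolLookup.libraryLookup` on the parser's shared library, and the returned pointer would be passed to the `Language(MemorySegment)` constructor. That is exactly the part a convenience method would hide from the user.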