To resolve the "Could not get parser for language python" error and the TypeError: __init__() takes exactly 1 argument (2 given) when using the CodeSplitter from llama_index.core.node_parser, follow these steps:
Ensure tree_sitter_languages is Installed and Up-to-Date:
Make sure the tree_sitter_languages package is installed in your environment. You can install it using pip:
pip install tree_sitter_languages
Then confirm which versions are actually installed. tree_sitter_languages is built on top of the tree_sitter package, so the versions of both matter for compatibility.
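One way to see what is installed is to query package metadata from Python. This is a minimal sketch using only the standard library. The distribution names in `PACKAGES` are assumptions about how these projects are published, and `installed_versions` is a hypothetical helper, not part of any of these libraries.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed distribution names; adjust them to match your environment.
PACKAGES = ("tree-sitter", "tree-sitter-languages", "llama-index-core")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Running this in the same environment (or notebook kernel) that raises the error matters, since a different interpreter may see different packages.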
Check Language Support:
Confirm that the tree_sitter_languages module supports Python. You can check the list of valid languages on the py-tree-sitter-languages GitHub page.
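The check can also be run directly, which helps separate "this language is not supported" from "the grammar failed to load for another reason". The sketch below calls `tree_sitter_languages.get_language`, the same function that appears in the traceback. The helper name `can_load_language` is hypothetical, and the broad `except` is a deliberate choice for diagnosis only.

```python
def can_load_language(name="python"):
    """Return (ok, detail) describing whether tree_sitter_languages loads `name`."""
    try:
        import tree_sitter_languages
    except ImportError:
        return False, "tree_sitter_languages is not installed"
    try:
        tree_sitter_languages.get_language(name)
    except Exception as exc:  # broad on purpose: we only want to report the failure
        return False, f"{type(exc).__name__}: {exc}"
    return True, f"loaded grammar for {name!r}"


if __name__ == "__main__":
    ok, detail = can_load_language("python")
    print("OK" if ok else "FAILED", "-", detail)
```

If the detail reports a TypeError rather than a missing-language error, the language name is probably fine and the problem lies in how the grammar is being loaded.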
Verify Parser Initialization:
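The error TypeError: __init__() takes exactly 1 argument (2 given) is raised inside tree_sitter_languages.core.get_language(), which get_parser() calls before CodeSplitter ever receives a parser. That points to how the language object is constructed inside tree_sitter_languages rather than to the arguments passed to CodeSplitter. Check that tree_sitter_languages.get_parser("python") succeeds on its own and returns a valid tree_sitter.Parser object.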
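A possible explanation, which the traceback itself does not confirm, is a version mismatch: users have reported this same TypeError when tree_sitter_languages is installed next to tree_sitter 0.22 or later, with pinning tree_sitter to 0.21.3 as a workaround. The sketch below encodes that assumption as a simple check. The threshold `FIRST_INCOMPATIBLE`, the distribution names, and the helper names are all assumptions for illustration.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Assumption based on user reports: tree_sitter 0.22 is where the
# incompatibility with tree_sitter_languages begins.
FIRST_INCOMPATIBLE = (0, 22)


def parse_major_minor(ver):
    """Turn a version string such as '0.21.3' into a (major, minor) tuple."""
    match = re.match(r"(\d+)(?:\.(\d+))?", ver)
    if match is None:
        raise ValueError(f"unrecognised version string: {ver!r}")
    return int(match.group(1)), int(match.group(2) or 0)


def mismatch_suspected(tree_sitter_version, languages_installed):
    """True when tree_sitter_languages sits next to a tree_sitter at or past the threshold."""
    if not languages_installed or tree_sitter_version is None:
        return False
    return parse_major_minor(tree_sitter_version) >= FIRST_INCOMPATIBLE


def _installed(name):
    """Installed version of a distribution, or None if it is absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    ts = _installed("tree-sitter")
    tsl = _installed("tree-sitter-languages")
    if mismatch_suspected(ts, tsl is not None):
        print(f"tree-sitter {ts} may be too new; consider: pip install 'tree-sitter==0.21.3'")
    else:
        print("No version mismatch suspected.")
```

If the check fires, downgrading tree_sitter and restarting the interpreter or notebook runtime is worth trying before anything else.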
Custom Parser Object:
If automatic parser retrieval does not work, you can create a tree_sitter.Parser yourself and pass it to CodeSplitter through its parser argument, which appears in the constructor signature in the traceback.
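A minimal sketch of that approach follows. It rests on several assumptions: the standalone `tree_sitter_python` grammar package is installed, and the installed tree_sitter release accepts `Language(ptr)` and `Parser(language)`, as recent releases do. The function `build_python_splitter` is a hypothetical helper, and it returns None when any dependency is missing instead of failing.

```python
def build_python_splitter():
    """Return a CodeSplitter backed by a hand-built parser, or None if deps are missing."""
    try:
        import tree_sitter_python  # assumption: standalone Python grammar package
        from tree_sitter import Language, Parser
        from llama_index.core.node_parser import CodeSplitter
    except ImportError:
        return None

    # Assumption: a recent tree_sitter, where Language wraps the grammar
    # pointer and Parser accepts a Language directly.
    parser = Parser(Language(tree_sitter_python.language()))

    # Same settings as in the bug report, plus the explicit parser.
    return CodeSplitter(
        language="python",
        chunk_lines=40,
        chunk_lines_overlap=15,
        max_chars=1500,
        parser=parser,
    )


if __name__ == "__main__":
    splitter = build_python_splitter()
    print("splitter ready" if splitter is not None else "missing dependencies")
```

Because the parser is supplied explicitly, CodeSplitter should not need to call tree_sitter_languages.get_parser, which is where the traceback shows the failure.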
Bug Description
from llama_index.core.node_parser import CodeSplitter

splitter = CodeSplitter(
    language="python",
    chunk_lines=40,  # lines per chunk
    chunk_lines_overlap=15,  # lines overlap between chunks
    max_chars=1500,  # max chars per chunk
)
nodes = splitter.get_nodes_from_documents(md_nodes)
Error
Could not get parser for language python. Check https://github.com/grantjenks/py-tree-sitter-languages#license for a list of valid languages.
TypeError Traceback (most recent call last)
in <cell line: 0>()
1 from llama_index.core.node_parser import CodeSplitter
2
----> 3 splitter = CodeSplitter(
4 language="python",
5 chunk_lines=40, # lines per chunk
/usr/local/lib/python3.11/dist-packages/llama_index/core/node_parser/text/code.py in __init__(self, language, chunk_lines, chunk_lines_overlap, max_chars, parser, callback_manager, include_metadata, include_prev_next_rel, id_func)
75 import tree_sitter_languages # pants: no-infer-dep
76
---> 77 parser = tree_sitter_languages.get_parser(language)
78 except ImportError:
79 raise ImportError(
tree_sitter_languages/core.pyx in tree_sitter_languages.core.get_parser()
tree_sitter_languages/core.pyx in tree_sitter_languages.core.get_language()
TypeError: __init__() takes exactly 1 argument (2 given)
Version
latest
Steps to Reproduce
from llama_index.core.node_parser import CodeSplitter

splitter = CodeSplitter(
    language="python",
    chunk_lines=40,  # lines per chunk
    chunk_lines_overlap=15,  # lines overlap between chunks
    max_chars=1500,  # max chars per chunk
)
nodes = splitter.get_nodes_from_documents(md_nodes)
Relevant Logs/Tracebacks