feat(llm): added the process of text2gql in graphrag V1.0 #105

Merged
merged 44 commits into from
Dec 9, 2024

vichayturen
Contributor

@vichayturen vichayturen commented Nov 5, 2024

address #10

  1. added the process of retrieval via intelligently generated Gremlin queries
  2. added a text2gremlin block to the RAG app

@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Nov 5, 2024
@github-actions github-actions bot added the llm label Nov 5, 2024
@dosubot dosubot bot added the enhancement New feature or request label Nov 5, 2024
@simon824
Member

After implementing text2gremlin here, can gremlin_generate_web_demo.py be removed?

@imbajin imbajin self-assigned this Nov 12, 2024
# Conflicts:
#	hugegraph-llm/src/hugegraph_llm/operators/hugegraph_op/graph_rag_query.py
@vichayturen
Contributor Author

> After implementing text2gremlin here, can gremlin_generate_web_demo.py be removed?

Yes, it has been removed in a new commit.

@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Nov 17, 2024
@imbajin
Member

imbajin commented Nov 21, 2024

We should add a flag value to the graph query interface rag/graph:

  • 1 represents text2gql accurate matching success
  • 0 represents (k-neighbor) generalization matching success
  • -1 represents no relevant graph info
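
A minimal sketch of how such a flag could be represented, assuming a Python enum and a small helper. The names `GraphMatchFlag` and `resolve_flag` are hypothetical and do not come from this PR; only the three values and their meanings are taken from the comment above.

```python
from enum import IntEnum
from typing import Optional, Sequence


class GraphMatchFlag(IntEnum):
    """Hypothetical flag values mirroring the proposal above."""
    NO_RESULT = -1    # no relevant graph info
    GENERALIZED = 0   # (k-neighbor) generalization matching succeeded
    EXACT = 1         # text2gql accurate matching succeeded


def resolve_flag(gql_result: Optional[Sequence],
                 subgraph_result: Optional[Sequence]) -> GraphMatchFlag:
    """Pick a flag from the two retrieval outcomes (illustrative only)."""
    if gql_result:          # the generated query returned something
        return GraphMatchFlag.EXACT
    if subgraph_result:     # fall back to the k-neighbor subgraph result
        return GraphMatchFlag.GENERALIZED
    return GraphMatchFlag.NO_RESULT
```

Using an `IntEnum` keeps the values comparable to the plain integers 1, 0 and -1, so a caller that only sees the raw number in a response can still interpret it.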

vichayturen and others added 5 commits November 21, 2024 17:14
    query_embedding = context["query_embedding"]
else:
    query_embedding = self.embedding.get_text_embedding(query)
context["match_result"] = self.vector_index.search(query_embedding, self.num_examples, dis_threshold=2)
Member

Why do we set dis_threshold to 2? (Does that mean we always match the top K?)

That does not seem reasonable.
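
To illustrate the concern, here is a small sketch, assuming `dis_threshold` acts as an upper bound on the distance of returned matches. That semantics is an assumption, and `top_k_within_threshold` is a hypothetical stand-in, not the project's `vector_index.search`.

```python
from typing import List, Tuple


def top_k_within_threshold(distances: List[float], k: int,
                           dis_threshold: float) -> List[Tuple[int, float]]:
    """Return up to k (index, distance) pairs, nearest first, with distance < dis_threshold."""
    ranked = sorted(enumerate(distances), key=lambda pair: pair[1])
    return [(i, d) for i, d in ranked[:k] if d < dis_threshold]


distances = [0.3, 1.4, 0.9, 1.8]  # made-up distances for illustration
loose = top_k_within_threshold(distances, k=3, dis_threshold=2)     # nothing is filtered out
strict = top_k_within_threshold(distances, k=3, dis_threshold=1.0)  # the 1.4 match is dropped
```

When every candidate distance falls below the threshold, the filter never triggers and the call behaves like a plain top-K search, which is the behavior the question above is asking about.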

Member

@imbajin imbajin left a comment

Merge it and split it into separate PRs for dev.

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Dec 9, 2024
@imbajin imbajin merged commit cbfca3c into apache:main Dec 9, 2024
11 checks passed
@imbajin imbajin changed the title feat(llm): added the process of intelligent generated gremlin for retrieval before subgraph retrieval feat(llm): added the process of text2gql in graphrag V1.0 Dec 9, 2024
Labels
enhancement New feature or request lgtm This PR has been approved by a maintainer llm python-client size:XL This PR changes 500-999 lines, ignoring generated files.
4 participants