fix: resolve JSON parsing issues and implement RRF fusion #24
arybhatt4533 wants to merge 1 commit into INCF:main
Conversation
hi arybhatt !! Do you know who is mentoring this INCF repo, the Knowledge Space agent? Who is the mentor, how can they be reached, etc.?

Hi @Areeba-Tahir-18

Thank you !! Yes, I have been trying to reach out for a month. I have explored almost all the websites and tried to contact them via email, but still no response. Let's see.

Totally understand. Reaching out can be slow sometimes. INCF and similar orgs often take time to respond, especially outside official GSoC timelines.

Yes !! I am exploring and doing all this for GSoC 2026, like you. Will open my pull request soon. Thanks @arybhatt4533
Summary of Changes
I have implemented several fixes to improve the backend's stability and search result ranking.
Key Updates:
Robust JSON Parsing: Added a clean_and_parse_json utility to handle cases where the LLM returns markdown formatting or extra text. This prevents the system from crashing during keyword and intent extraction.
Search Ranking (RRF): Integrated Reciprocal Rank Fusion (RRF) logic to merge results from Knowledge Space and Vector searches. This ensures that the most relevant datasets from both sources are prioritized in the final response.
Lightweight Testing (Torch Bypass): Added an is_enabled flag and commented out the Retriever import in VectorSearchAgent. This allows for local testing and API validation without needing to download heavy torch dependencies.
Intent Handling: Refined the logic to better distinguish between general greetings and actual data discovery queries, preventing unnecessary search triggers.
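For readers who want a concrete picture of the Robust JSON Parsing item above, here is a minimal sketch of what a helper like clean_and_parse_json might do. It is an illustration only, not the code in this PR: the signature, the fence-stripping regex, and the fallback scan are all assumptions. It strips an optional markdown code fence, tries a direct parse, and otherwise returns the first decodable JSON object or array embedded in the surrounding text.

```python
import json
import re
from typing import Any, Optional

# Build the fence marker programmatically so this snippet stays easy to embed.
FENCE = "`" * 3
# Hypothetical pattern: an optional "json" language tag, then the fenced body.
FENCE_RE = re.compile(
    re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE),
    flags=re.DOTALL | re.IGNORECASE,
)


def clean_and_parse_json(raw: str) -> Optional[Any]:
    """Best-effort extraction of a JSON value from LLM output (sketch).

    Returns the parsed value, or None if nothing decodable is found,
    so callers can fall back gracefully instead of crashing.
    """
    text = raw.strip()

    # 1. If the model wrapped its answer in a markdown fence, keep only the body.
    match = FENCE_RE.search(text)
    if match:
        text = match.group(1).strip()

    # 2. Happy path: the remaining text is valid JSON on its own.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    # 3. Fallback: scan for the first object or array that decodes cleanly,
    #    ignoring any chatter before or after it.
    decoder = json.JSONDecoder()
    for index, char in enumerate(text):
        if char in "{[":
            try:
                value, _end = decoder.raw_decode(text, index)
                return value
            except json.JSONDecodeError:
                continue
    return None
```

Returning None rather than raising lets the keyword and intent extraction steps choose a default when the model's reply cannot be parsed.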
Verification: The backend has been verified using the FastAPI Swagger UI. The API returns a 200 OK status, and the fusion logic correctly processes search results even with the vector bypass active.
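To illustrate the fusion step, the following is a minimal sketch of Reciprocal Rank Fusion under stated assumptions. The function name rrf_fuse, the use of plain dataset IDs, and the constant k = 60 (a commonly used default for RRF) are illustrative choices, not taken from this PR. Each result contributes 1 / (k + rank) for every list it appears in, and the fused order sorts by the summed score. Passing an empty list for the vector results mimics the bypass case.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[Hashable]], k: int = 60) -> List[Hashable]:
    """Merge several ranked lists of IDs with Reciprocal Rank Fusion (sketch).

    rankings: one ranked list per source, best result first.
    k: smoothing constant; 60 is a common default, assumed here.
    """
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        seen = set()
        for rank, doc_id in enumerate(ranking, start=1):
            if doc_id in seen:
                # Count each ID once per source, at its best rank.
                continue
            seen.add(doc_id)
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; sorted() is stable, so ties keep first-seen order.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical IDs standing in for Knowledge Space and vector search hits.
knowledge_space_hits = ["ds_a", "ds_b", "ds_c"]
vector_hits = ["ds_b", "ds_d"]

fused = rrf_fuse([knowledge_space_hits, vector_hits])
fused_with_bypass = rrf_fuse([knowledge_space_hits, []])
```

In this example ds_b ranks first because it appears in both lists, and with the vector list empty the fused order is simply the Knowledge Space order.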
Closes #8