DeepGit is an advanced, LangGraph-based agentic workflow designed to perform deep research across GitHub repositories. It searches, analyzes, and ranks repositories based on user intent, and can surface lesser-known but highly relevant tools. DeepGit combines hybrid dense retrieval, cross-encoder re-ranking, and comprehensive activity analysis in a unified, open-source platform for intelligent repository discovery.
DeepGit-lite is a lightweight version of DeepGit running on zero GPU in a Hugging Face Space.
It may not perform as well as the full version, but it's great for a quick first-hand preview.
When a user submits a query, the DeepGit Orchestrator Agent takes over, passing the query through a series of specialized tools:
- Query Expansion Tool: Enhances vague user queries using language models to add specificity and context, enabling more accurate downstream retrieval.
- Semantic Retrieval Tool: Leverages cutting-edge embedding models to semantically match the enhanced query against a wide array of GitHub repositories.
- Documentation Intelligence Tool: Scrapes and interprets repository documentation (e.g., README files and additional markdowns) to understand the purpose, setup, and key features.
- Codebase Mapping Tool: Analyzes the project’s file structure and technology stack to assess complexity, modularity, and suitability for the user’s needs.
- Community Insight Tool: Aggregates social signals such as stars, forks, issues, and pull request activity to gauge real-world engagement and maturity.
- Relevance Synthesis Tool: Combines insights from all modules to compute a final relevance score tailored to the user query.
- Insight Delivery Module: Presents a ranked list of repositories with concise summaries and justifications, enabling smart discovery.
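
The flow above can be illustrated with a small, dependency-free sketch. It is not DeepGit's actual implementation: the `Repo` class, the toy two-dimensional embeddings, the log-scaled activity signal, and the 0.7/0.3 weights are all assumptions chosen for illustration. It covers only three of the stages (semantic retrieval, community insight, relevance synthesis) and shows how a relevant but little-starred repository can outrank a popular, off-topic one.

```python
from dataclasses import dataclass, field
from math import log10, sqrt


@dataclass
class Repo:
    """Hypothetical container for one candidate repository."""
    name: str
    embedding: list[float]  # toy stand-in for a dense embedding of the docs
    stars: int
    forks: int
    scores: dict[str, float] = field(default_factory=dict)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_retrieval(query_vec: list[float], repos: list[Repo], top_k: int) -> list[Repo]:
    """Score every repo against the query and keep the top_k closest."""
    for repo in repos:
        repo.scores["semantic"] = cosine(query_vec, repo.embedding)
    ranked = sorted(repos, key=lambda r: r.scores["semantic"], reverse=True)
    return ranked[:top_k]


def community_insight(repos: list[Repo]) -> list[Repo]:
    """Turn stars and forks into an activity score in [0, 1].

    Log scaling is an assumption: it keeps very popular repositories
    from drowning out smaller ones.
    """
    raw = {r.name: log10(1 + r.stars) + log10(1 + r.forks) for r in repos}
    highest = max(raw.values(), default=0.0) or 1.0
    for repo in repos:
        repo.scores["activity"] = raw[repo.name] / highest
    return repos


def relevance_synthesis(repos: list[Repo], weights: dict[str, float]) -> list[Repo]:
    """Combine the per-stage scores into one weighted final score and rank."""
    for repo in repos:
        repo.scores["final"] = sum(w * repo.scores[key] for key, w in weights.items())
    return sorted(repos, key=lambda r: r.scores["final"], reverse=True)


def run_pipeline(query_vec: list[float], repos: list[Repo], top_k: int = 2) -> list[Repo]:
    """Chain the stages in order, as an orchestrator would."""
    candidates = semantic_retrieval(query_vec, repos, top_k)
    candidates = community_insight(candidates)
    # Illustrative weights only; the real weighting is not documented here.
    return relevance_synthesis(candidates, {"semantic": 0.7, "activity": 0.3})


if __name__ == "__main__":
    repos = [
        Repo("popular-but-off-topic", [0.0, 1.0], stars=50_000, forks=5_000),
        Repo("hidden-gem", [1.0, 0.1], stars=120, forks=10),
        Repo("unrelated", [-1.0, 0.0], stars=10, forks=1),
    ]
    for repo in run_pipeline([1.0, 0.0], repos):
        print(f"{repo.name}: {repo.scores['final']:.3f}")
```

Weighting semantic similarity above activity is what lets the "hidden gem" rank first in this toy data; shifting weight toward activity would favor established projects instead.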

- Uncover Hidden Gems: Surface powerful but under-the-radar open-source tools.
- Empower Research: Build an intelligent discovery layer over GitHub tailored for research-focused developers.
- Promote Open Innovation: Open-source the entire workflow to foster transparency and collaboration in research.
DeepGit provides an intuitive interface for exploring repository recommendations. On the main page, users enter a raw natural-language query; this is the primary interaction point for initiating deep semantic searches.
Output: The results are shown in a table with clickable links and the individual threshold scores, making it easy to compare repositories and understand the ranking criteria.
- Python: 3.11+ (The repo has been tested on Python 3.11.x)
- pip: 24.0+ (Ensure you have an up-to-date pip version)

    git clone https://github.com/zamalali/DeepGit.git
    cd DeepGit
    python3 -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    pip install --upgrade pip
    pip install -r requirements.txt

To run DeepGit locally, simply execute:

    python app.py

- Python Version: Use Python 3.11 or higher, as the repo has been tested on Python 3.11.x.
- pip Version: Make sure you’re running pip 24.0 or later.
- Dependency Issues: If you encounter any, try reinstalling in a new virtual environment.
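
The version requirements above can be checked before installing dependencies. The following is a minimal sketch, not a script shipped with DeepGit; the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical. It relies only on the standard library: `sys.version_info` for the interpreter and `importlib.metadata.version` for the installed pip.

```python
import sys
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '24.0' or '24.1rc1' into a tuple of ints, stopping at the
    first component that does not start with a digit."""
    parts = []
    for piece in text.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(found: str, minimum: tuple[int, ...]) -> bool:
    """Compare versions after padding both to the same length with zeros,
    so that '24' counts as 24.0."""
    have = parse_version(found)
    width = max(len(have), len(minimum))
    have_padded = have + (0,) * (width - len(have))
    min_padded = minimum + (0,) * (width - len(minimum))
    return have_padded >= min_padded


def check_environment() -> list[str]:
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    if sys.version_info < (3, 11):
        problems.append(f"Python 3.11+ required, found {sys.version.split()[0]}")
    try:
        pip_version = version("pip")
    except PackageNotFoundError:
        problems.append("pip is not installed in this environment")
    else:
        if not meets_minimum(pip_version, (24, 0)):
            problems.append(f"pip 24.0+ required, found {pip_version}")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Environment looks compatible.")
```

Running this inside the activated virtual environment reports on that environment's interpreter and pip, which is usually what matters when diagnosing dependency issues.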
For detailed documentation on using DeepGit, check out the project's documentation.
DeepGit leverages LangGraph for orchestration. To launch the LangSmith dashboard and start the workflow, simply run:

    langgraph dev

This command opens the LangSmith dashboard, where you can enter your raw queries in a JSON snippet and monitor the entire agentic workflow.
For instructions on using Docker with DeepGit, please refer to our Docker Documentation.


