This project implements a sophisticated system for analyzing Request for Proposal (RFP) documents. It leverages Retrieval Augmented Generation (RAG) to allow users to upload RFP files (PDFs, text) and ask natural language questions about their content. The system retrieves relevant sections from the document and uses a Large Language Model (LLM) to generate concise, context-aware answers. This core Q&A functionality is designed to help users quickly understand complex RFP requirements and find specific information efficiently.
Beyond general Q&A, the system features a dedicated module for structured data extraction from RFPs. This module uses an LLM, guided by Pydantic schemas, to identify and pull out key pieces of information such as submission deadlines, formatting requirements (page limits, font specifications), and crucial eligibility criteria. This extracted data is presented in a clear, organized format on the frontend, providing users with an immediate overview of critical RFP parameters, complementing the more free-form RAG interaction.
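To give a sense of how Pydantic guides the extraction, here is a minimal sketch of schemas for the three categories mentioned above. The field names are illustrative assumptions, not the project's actual models:

```python
# Illustrative sketch only -- the project's real schema fields may differ.
from typing import List, Optional
from pydantic import BaseModel, Field

class SubmissionDetails(BaseModel):
    deadline: Optional[str] = Field(None, description="Submission deadline as stated in the RFP")
    method: Optional[str] = Field(None, description="How proposals must be submitted (portal, email, mail)")

class FormattingRequirements(BaseModel):
    page_limit: Optional[int] = Field(None, description="Maximum page count, if specified")
    font: Optional[str] = Field(None, description="Required font family and size")

class EligibilityCriteria(BaseModel):
    criteria: List[str] = Field(default_factory=list, description="Conditions an offeror must meet")
```

Schemas like these let the LLM's output be parsed and validated into typed objects, so malformed responses are caught before they reach the frontend.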
The application follows a dual-path workflow for document interaction: RAG-based Q&A and Structured Data Extraction.
```mermaid
graph TD
    A["Frontend: User Uploads RFP"] --> B{"Backend: Receive & Store RFP"};
    B --> C{"Backend: Process Full Text"};

    subgraph "RAG Q&A Path"
        D["Frontend: User Submits Query"] --> E{"Backend: Receive Query"};
        C --> F["Backend: Hybrid Search Engine"];
        F --> G["1. BM25 Keyword Search"];
        F --> H["2. Semantic Vector Search"];
        G --> I["Combine Results"];
        H --> I;
        I --> J["Backend: Cohere Re-ranker"];
        J --> K["Backend: Prepare Context"];
        E --> L["Backend: Ollama LLM - RAG Model"];
        K --> L;
        L --> M["Frontend: Display RAG Answer"];
    end

    subgraph "Structured Extraction Path"
        N["Frontend: User Clicks 'Analyze RFP'"] --> O{"Backend: Trigger Analysis"};
        C --> P["Backend: Split Text into Chunks"];
        P --> Q{"Loop Per Chunk"};
        Q -- "RFP Chunk" --> R["Backend: Ollama LLM - Analysis Model"];
        R -- "Targeted Prompts (Submission, Formatting, Eligibility)" --> S["Parse & Validate (Pydantic)"];
        S -- "Structured Data per Category" --> T["Aggregate Category Data"];
        Q -- "Next Chunk" --> R;
        T -- "After All Chunks" --> U["Backend: Merge Aggregated Data"];
        U --> V["Frontend: Display Structured RFP Details"];
    end

    A ~~~ N
    B ~~~ O
```
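On the backend, the two paths naturally map onto separate API routes. The skeleton below is a hypothetical layout of those endpoints; the actual paths, handler names, and request models in `rag.py` may differ:

```python
# Hypothetical route layout mirroring the diagram above -- the real handlers
# in rag.py may use different paths, names, and request/response models.
from fastapi import FastAPI, UploadFile
from pydantic import BaseModel

app = FastAPI()

class QueryRequest(BaseModel):
    question: str

@app.post("/upload")
async def upload_rfp(file: UploadFile):
    """Receive and store the RFP, then process its full text (shared step)."""
    ...

@app.post("/query")
async def query_rfp(req: QueryRequest):
    """RAG path: hybrid search -> Cohere re-rank -> Ollama LLM answer."""
    ...

@app.post("/analyze")
async def analyze_rfp():
    """Structured path: chunk the text, run targeted prompts per chunk,
    validate with Pydantic, and merge the aggregated results."""
    ...
```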
To ensure accurate and relevant retrieval for the Q&A module, this project employs a hybrid search strategy combined with a re-ranking step. When a user poses a question, the system first splits the uploaded RFP document into manageable text chunks. These chunks are then indexed using two distinct retrieval methods:
- BM25 Retriever: This is a keyword-based sparse retriever that excels at finding documents containing the exact terms present in the user's query. It's highly effective for matching specific jargon, names, or codes.
- Semantic Retriever: This utilizes dense vector embeddings (e.g., from Sentence Transformers) to find chunks that are semantically similar to the user's query, even if they don't use the exact same keywords. This helps capture conceptual matches and answer questions phrased differently from the document text.
The results from both BM25 and semantic search are combined by an Ensemble Retriever, which weighs their outputs to leverage the strengths of both keyword and meaning-based matching. This initial set of retrieved documents is then passed to a Cohere Rerank model. The reranker takes these candidates and re-evaluates their relevance to the original query, providing a more refined and accurately ordered list of the most pertinent document chunks. This multi-stage process, culminating in re-ranking, significantly enhances the quality of context provided to the LLM for generating the final answer, leading to more precise and reliable responses.
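A condensed sketch of this pipeline using LangChain primitives is shown below. The import paths, model names, retriever weights, and `k` values are assumptions that depend on the installed LangChain version, and `chunks` stands in for the document chunks produced earlier:

```python
# Sketch of the hybrid retrieval + re-ranking pipeline; values are illustrative.
from langchain_community.retrievers import BM25Retriever
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain.retrievers import ContextualCompressionRetriever, EnsembleRetriever
from langchain_cohere import CohereRerank

# `chunks` is the list of Document objects from splitting the RFP text.
bm25 = BM25Retriever.from_documents(chunks)  # sparse keyword matching
bm25.k = 5

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
vectorstore = Chroma.from_documents(chunks, embeddings)
semantic = vectorstore.as_retriever(search_kwargs={"k": 5})  # dense semantic matching

# Blend both retrievers; the weights control the keyword/semantic balance.
ensemble = EnsembleRetriever(retrievers=[bm25, semantic], weights=[0.5, 0.5])

# Re-rank the combined candidates against the original query
# (reads COHERE_API_KEY from the environment).
reranker = CohereRerank(model="rerank-english-v3.0", top_n=4)
retriever = ContextualCompressionRetriever(
    base_compressor=reranker, base_retriever=ensemble
)

docs = retriever.invoke("What is the proposal submission deadline?")
```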
| Category | Technology/Library | Description |
|---|---|---|
| Frontend | React.js | Core JavaScript library for building the user interface. |
| | JavaScript (JSX) | Language for React components. |
| | Tailwind CSS | Utility-first CSS framework for styling. |
| | `react-icons` | Library for incorporating icons. |
| Backend | Python | Core language for backend logic. |
| | FastAPI | Modern, fast web framework for building APIs. |
| | LangChain | Framework for developing applications powered by LLMs. |
| | Ollama | Platform for running LLMs locally (e.g., Mistral, Phi3:mini). |
| | Pydantic | Data validation and settings management using Python type annotations. |
| | ChromaDB | AI-native open-source embedding database for RAG. |
| | Hugging Face Transformers | For sentence embeddings (e.g., `all-mpnet-base-v2`). |
| | Cohere Rerank | For re-ranking retrieved documents to improve relevance (via LangChain). |
| | Uvicorn | ASGI server for running FastAPI applications. |
| | `python-dotenv` | For managing environment variables. |
- Prerequisites:
  - Python 3.9+
  - Node.js & npm/yarn
  - Ollama installed and running ([Ollama Download](https://ollama.com/download))
- Backend Setup:
  ```bash
  # Clone the repository, then:
  cd backend

  # Create and activate a virtual environment
  python -m venv venv
  source venv/bin/activate   # Windows: venv\Scripts\activate

  # Install dependencies
  pip install -r requirements.txt

  # Pull the LLMs used by the app
  ollama pull mistral      # or your primary RAG model
  ollama pull phi3:mini    # or your chosen analysis model

  # Create a .env file and add your COHERE_API_KEY if using re-ranking
  ```
- Frontend Setup:
  ```bash
  cd frontend
  npm install   # or: yarn install
  ```
- Start the Backend Server:
  ```bash
  # From the backend directory:
  uvicorn rag:app --reload --port 8000
  ```
- Start the Frontend Development Server:
  ```bash
  # From the frontend directory:
  npm start   # or: yarn start
  ```
- Open your browser and navigate to `http://localhost:3000`.
- Upload an RFP document.
- Use the "Query Interface" to ask questions about the document.
- Use the "Analyze RFP" button to extract structured information.