---
title: AskBookie API
emoji: 🔥
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
app_port: 7860
---

AskBookie API

Production-grade retrieval-augmented generation service for document-based question answering. The system operates on pre-vectorized document clusters stored in Qdrant, with semantic retrieval feeding into instruction-tuned language model inference.

Base URL: https://pmmdot-askbookie.hf.space
Interactive Documentation: /docs (Swagger UI) | /redoc (ReDoc)


Table of Contents

  1. Technical Stack
  2. Available Models
  3. Authentication
  4. Rate Limits
  5. Core Endpoints
  6. Frontend Integration Guide
  7. System Endpoints
  8. Admin Endpoints
  9. Error Handling

Technical Stack

| Component | Technology |
|---|---|
| Web Framework | FastAPI |
| Vector Database | Qdrant Cloud |
| Embedding Model | HuggingFace gte-modernbert-base |
| Language Models | Gemini 3 Flash/Pro, GPT-4o-mini, Claude-3-Haiku |
| RAG Orchestration | LangChain |
| Metadata Storage | MongoDB Atlas |
| PDF Processing | PyPDFLoader |

Available Models

| Model ID | Name | Description |
|---|---|---|
| 1 | Gemini-3-flash | Gemini Primary API Key |
| 2 | Gemini-3-flash (Back-up) | Gemini Secondary API Key |
| 3 | Gemini-3-Pro | Gemini Primary API Key |
| 4 | GPT-4o-mini | DuckDuckGo (Free) |
| 5 | Claude-3-Haiku | DuckDuckGo (Free) |

Authentication

All endpoints except /health and / require HMAC-SHA256 request signing.

Required Headers

| Header | Description |
|---|---|
| X-API-Key-Id | Unique identifier for your API key |
| X-API-Timestamp | Current Unix timestamp (seconds) |
| X-API-Signature | HMAC-SHA256 signature of the request |

Signature Construction

The signature message follows the format:

{timestamp}\n{HTTP_METHOD}\n{path}

JavaScript Implementation:

async function generateAuthHeaders(method, path, keyId, secret) {
    const timestamp = Math.floor(Date.now() / 1000).toString();
    // Message format: {timestamp}\n{HTTP_METHOD}\n{path}
    const message = `${timestamp}\n${method.toUpperCase()}\n${path}`;
    
    // Sign the message with HMAC-SHA256 via the Web Crypto API
    const encoder = new TextEncoder();
    const key = await crypto.subtle.importKey(
        'raw', encoder.encode(secret),
        { name: 'HMAC', hash: 'SHA-256' }, false, ['sign']
    );
    const sig = await crypto.subtle.sign('HMAC', key, encoder.encode(message));
    // Hex-encode the raw signature bytes
    const signature = Array.from(new Uint8Array(sig))
        .map(b => b.toString(16).padStart(2, '0')).join('');
    
    return {
        'X-API-Key-Id': keyId,
        'X-API-Timestamp': timestamp,
        'X-API-Signature': signature
    };
}
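
For example, signing a request to POST /ask (the key ID and secret below are placeholders for your own credentials):

const headers = await generateAuthHeaders('POST', '/ask', 'your-key-id', 'your-secret');
// → { 'X-API-Key-Id': ..., 'X-API-Timestamp': ..., 'X-API-Signature': ... }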

Security Constraints

  • Timestamp tolerance: 300 seconds (5 minutes)
  • Failed auth lockout: 5 attempts per IP (5-minute window)
  • Constant-time signature comparison (timing attack prevention)

Rate Limits

| Endpoint | Limit | Window |
|---|---|---|
| /ask | 30 requests | 60 seconds |
| /upload | 2 requests | 60 seconds |
| All other endpoints | 50 requests | 60 seconds |

When rate limited, responses include a Retry-After: 60 header.
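
On the client, one way to respect this is a thin wrapper that waits for the Retry-After delay before retrying once; the helper below is a sketch, not part of the API:

// Sketch: retry a rate-limited request once after the server-indicated delay.
async function fetchWithRetry(input: RequestInfo, init?: RequestInit): Promise<Response> {
    const response = await fetch(input, init);
    if (response.status !== 429) return response;

    // Fall back to 60 seconds if the header is missing or unparsable.
    const retryAfter = parseInt(response.headers.get('Retry-After') ?? '60', 10) || 60;
    await new Promise(resolve => setTimeout(resolve, retryAfter * 1000));
    return fetch(input, init);
}

A 60-second wait keeps a signed timestamp well within the 300-second tolerance, so the same headers can be reused on the retry.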


Core Endpoints

POST /ask

Query documents using semantic retrieval + LLM synthesis.

Important

Two query modes exist:

  1. Standard Mode: Query pre-indexed university materials using subject + unit
  2. Custom Upload Mode: Query user-uploaded PDFs using cluster (returned from /upload)

Request Schema

| Field | Type | Required | Constraints | Description |
|---|---|---|---|---|
| query | string | Yes | 1-1000 chars | Natural language question |
| subject | string | Conditional | 1-100 chars, alphanumeric + _- | Subject collection (e.g., evs, physics) |
| unit | integer | Conditional | 1-4 | Unit number within the subject |
| cluster | string | Conditional | max 100 chars | Temp cluster from /upload response |
| context_limit | integer | No | 1-20, default 5 | Number of context chunks |

Warning

Mutual Exclusivity: Either provide cluster OR provide BOTH subject AND unit. Never mix them.
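
One way to enforce this on the client is to build the /ask request body through a single helper that rejects mixed input; the function below is a sketch, not part of the API:

// Sketch: build an /ask request body, refusing mixed modes (helper name is illustrative).
function buildAskPayload(query: string, opts: { subject?: string; unit?: number; cluster?: string }) {
    const hasCluster = Boolean(opts.cluster);
    const hasSubjectUnit = Boolean(opts.subject) && Number.isInteger(opts.unit);

    if (hasCluster && (opts.subject !== undefined || opts.unit !== undefined)) {
        throw new Error('Provide either cluster OR subject + unit, never both');
    }
    if (!hasCluster && !hasSubjectUnit) {
        throw new Error('Provide cluster, or both subject and unit');
    }
    return hasCluster
        ? { query, cluster: opts.cluster }
        : { query, subject: opts.subject, unit: opts.unit };
}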

Example 1: Standard Mode (Pre-indexed Materials)

POST /ask HTTP/1.1
Content-Type: application/json

{
    "query": "What are the different types of ecosystems?",
    "subject": "evs",
    "unit": 2,
    "context_limit": 5
}

Example 2: Custom Upload Mode (User PDF)

POST /ask HTTP/1.1
Content-Type: application/json

{
    "query": "Summarize the main findings",
    "cluster": "temp_a1b2c3d4e5f6g7h8i9j0k1l2"
}

Response (200 OK)

{
    "answer": "Ecosystems are classified into terrestrial and aquatic...",
    "sources": [
        "evs_chapter3.pdf: Slide 12",
        "evs_chapter3.pdf: Slide 15"
    ],
    "collection": "askbookie_evs_unit-2",
    "request_id": "a1b2c3d4e5f6g7h8"
}

| Field | Description |
|---|---|
| answer | LLM-generated response (Markdown formatted, LaTeX supported) |
| sources | List of source references: "filename: Slide N" |
| collection | The Qdrant collection queried |
| request_id | Unique identifier for debugging |

POST /upload

Upload a PDF document for custom RAG queries. Processing is asynchronous.

Request

POST /upload HTTP/1.1
Content-Type: multipart/form-data

file: [binary PDF data]

| Field | Type | Required | Constraints |
|---|---|---|---|
| file | binary | Yes | PDF only, max 10MB, must start with %PDF magic bytes |

Response (200 OK)

{
    "job_id": "a1b2c3d4e5f6g7h8i9j0k1l2",
    "status": "queued",
    "filename": "my_notes.pdf",
    "size": 2457600,
    "temp_cluster": "temp_a1b2c3d4e5f6g7h8i9j0k1l2"
}

Important

Critical fields for frontend:

  • job_id: Use this to poll /jobs/{job_id} for processing status
  • temp_cluster: SAVE THIS! Use it in /ask requests to query this PDF

Processing Pipeline

  1. Validation: MIME type, magic bytes, size check
  2. Chunking: Split by page boundaries with context overlap
  3. Embedding: Vectorize using gte-modernbert-base
  4. Storage: Upsert to Qdrant under temp_cluster collection

Job Status Values

| Status | Description |
|---|---|
| queued | Accepted, awaiting processing |
| processing | Currently being chunked/embedded |
| done | Ready for queries |
| failed | Check error field for details |

GET /jobs/{job_id}

Poll the status of a PDF processing job.

{
    "job_id": "a1b2c3d4e5f6g7h8i9j0k1l2",
    "status": "done",
    "temp_cluster": "temp_a1b2c3d4e5f6g7h8i9j0k1l2",
    "filename": "my_notes.pdf",
    "error": null
}

GET /jobs

List all jobs for the authenticated API key.


Frontend Integration Guide

Important

This section provides implementation guidance for frontend developers.

Chat Session State Model

interface ChatSession {
    // User selection (standard mode)
    subject: string | null;      // e.g., "evs", "physics"
    unit: number | null;         // 1-4
    
    // Custom upload (custom mode)  
    tempCluster: string | null;  // From /upload response
    uploadJobId: string | null;  // For status polling
    
    // Mode lock
    isCustomMode: boolean;       // Once PDF uploaded, lock to custom mode
}
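
A new session starts with everything unset and custom mode off; the factory below is a sketch of that initial state (the function name is not part of any API):

// Sketch: initial state for a fresh chat session (also usable to reset after "New chat").
function newChatSession(): ChatSession {
    return {
        subject: null,
        unit: null,
        tempCluster: null,
        uploadJobId: null,
        isCustomMode: false
    };
}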

Flow 1: Standard Query (Pre-indexed Materials)

┌─────────────────────────────────────────────────────────┐
│  User selects Subject: [EVS ▼] and Unit: [2 ▼]          │
│  ─────────────────────────────────────────────────────  │
│  User types: "What are ecosystem types?"                │
│                                                         │
│  → POST /ask { query, subject: "evs", unit: 2 }         │
│  ← Response with answer + sources                       │
└─────────────────────────────────────────────────────────┘

Flow 2: Custom PDF Upload

┌─────────────────────────────────────────────────────────┐
│  Step 1: User uploads PDF                               │
│  ─────────────────────────────────────────────────────  │
│  → POST /upload (multipart/form-data)                   │
│  ← { job_id, temp_cluster, status: "queued" }           │
│                                                         │
│  ⚠️  SAVE: temp_cluster = "temp_abc123..."              │
│  ⚠️  LOCK: subject/unit dropdowns (disable them)        │
└─────────────────────────────────────────────────────────┘
                        ↓
┌─────────────────────────────────────────────────────────┐
│  Step 2: Poll for completion                            │
│  ─────────────────────────────────────────────────────  │
│  Loop every 2-3 seconds:                                │
│  → GET /jobs/{job_id}                                   │
│  ← { status: "processing" | "done" | "failed" }         │
│                                                         │
│  When status === "done": Enable chat input              │
│  When status === "failed": Show error, unlock dropdowns │
└─────────────────────────────────────────────────────────┘
                        ↓
┌─────────────────────────────────────────────────────────┐
│  Step 3: Query the uploaded PDF                         │
│  ─────────────────────────────────────────────────────  │
│  User types: "Summarize the main points"                │
│                                                         │
│  → POST /ask { query, cluster: "temp_abc123..." }       │
│  ← Response with answer + sources from their PDF        │
│                                                         │
│  ⚠️  Keep using the same temp_cluster for all queries   │
│      in this chat session                               │
└─────────────────────────────────────────────────────────┘

UI State Logic

// When user uploads a PDF
async function handlePdfUpload(file: File) {
    const formData = new FormData();
    formData.append('file', file);
    
    // KEY_ID and SECRET are placeholder names for your API key configuration
    const response = await fetch('/upload', {
        method: 'POST',
        headers: await generateAuthHeaders('POST', '/upload', KEY_ID, SECRET),
        body: formData
    });
    const data = await response.json();
    
    // CRITICAL: Store these values in session state
    session.tempCluster = data.temp_cluster;  // ← SAVE THIS
    session.uploadJobId = data.job_id;
    session.isCustomMode = true;              // ← LOCK MODE
    
    // Disable subject/unit dropdowns in UI
    disableSubjectUnitSelectors();
    
    // Start polling
    pollJobStatus(data.job_id);
}

// When sending a query
async function sendQuery(query: string) {
    let payload;
    
    if (session.isCustomMode && session.tempCluster) {
        // Custom mode: use cluster
        payload = {
            query: query,
            cluster: session.tempCluster  // ← USE STORED VALUE
        };
    } else {
        // Standard mode: use subject + unit
        payload = {
            query: query,
            subject: session.subject,
            unit: session.unit
        };
    }
    
    const response = await fetch('/ask', {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json',
            // Spread the resolved signed headers (KEY_ID / SECRET as above)
            ...(await generateAuthHeaders('POST', '/ask', KEY_ID, SECRET))
        },
        body: JSON.stringify(payload)
    });
    
    return await response.json();
}
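
The pollJobStatus helper referenced above is not defined by the API; a minimal sketch, assuming the signing helper from the Authentication section, a 2-3 second interval, and placeholder UI functions (enableChatInput, showError, enableSubjectUnitSelectors), could look like this:

// Sketch: poll GET /jobs/{job_id} until the uploaded PDF is ready or has failed.
async function pollJobStatus(jobId: string) {
    while (true) {
        const response = await fetch(`/jobs/${jobId}`, {
            headers: await generateAuthHeaders('GET', `/jobs/${jobId}`, KEY_ID, SECRET)
        });
        const job = await response.json();

        if (job.status === 'done') {
            enableChatInput();               // ready to query via cluster
            return;
        }
        if (job.status === 'failed') {
            showError(job.error);            // surface the error field
            enableSubjectUnitSelectors();    // unlock standard mode again
            session.isCustomMode = false;
            return;
        }
        // Still "queued" or "processing": wait and try again.
        await new Promise(resolve => setTimeout(resolve, 2500));
    }
}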

Subject/Unit Locking Rules

| Scenario | Subject Dropdown | Unit Dropdown | Upload Button |
|---|---|---|---|
| Fresh chat session | ✅ Enabled | ✅ Enabled | ✅ Enabled |
| After selecting subject/unit | ✅ Enabled (can change) | ✅ Enabled | ✅ Enabled |
| After uploading PDF | ❌ Disabled | ❌ Disabled | ❌ Disabled |
| After upload fails | ✅ Re-enabled | ✅ Re-enabled | ✅ Re-enabled |
| New chat started | ✅ Enabled (reset) | ✅ Enabled | ✅ Enabled |

Caution

Once a user uploads a PDF in a chat session, ALL subsequent queries in that session MUST use the cluster parameter, not subject/unit. The temp_cluster is tied to their uploaded document.

Available Subjects & Units

| Subject | Units Available | Collection Pattern |
|---|---|---|
| evs | 1, 2, 3, 4 | askbookie_evs_unit-{N} |
| physics | 1, 2, 3, 4 | askbookie_physics_unit-{N} |
| other subjects | 1-4 | askbookie_{subject}_unit-{N} |

Answer Formatting

Answers are returned in Markdown with LaTeX support:

  • Inline math: $E = mc^2$
  • Block math: $$\int_0^1 x^2 dx$$

Use a Markdown renderer with KaTeX/MathJax integration.
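
For example, in a React frontend the answer field can be rendered with react-markdown plus remark-math and rehype-katex; this is one common setup, not a requirement of the API:

import ReactMarkdown from 'react-markdown';
import remarkMath from 'remark-math';
import rehypeKatex from 'rehype-katex';
import 'katex/dist/katex.min.css';

// Renders the `answer` field from /ask, including $...$ and $$...$$ math.
function AnswerView({ answer }: { answer: string }) {
    return (
        <ReactMarkdown remarkPlugins={[remarkMath]} rehypePlugins={[rehypeKatex]}>
            {answer}
        </ReactMarkdown>
    );
}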


System Endpoints

GET /health

Service health check. No authentication required.

{
    "status": "healthy",
    "uptime_hours": 48.5,
    "current_model": {
        "model_id": 1,
        "name": "Gemini-3-flash",
        "description": "Gemini Primary API Key"
    }
}
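
Because /health is unauthenticated, a plain fetch is enough, e.g. to display the active model in a status badge:

// No auth headers required for /health.
const health = await (await fetch('https://pmmdot-askbookie.hf.space/health')).json();
console.log(health.status, health.current_model.name);  // e.g. "healthy" "Gemini-3-flash"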

GET /

Returns dashboard HTML or service metadata.


Admin Endpoints

Note

All admin endpoints require the admin API key.

GET /history

Paginated query history across all users.

GET /admin/keys

List all API keys with status.

POST /admin/keys/{key_id}/enable

Re-enable a disabled key.

POST /admin/keys/{key_id}/disable

Disable an API key (cannot disable admin).

GET /admin/models/current

Get current active model.

POST /admin/models/switch

{ "model_id": 2 }

Switch to a different LLM backend (1-5).
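
For example, switching to model 2 with the admin credentials (ADMIN_KEY_ID and ADMIN_SECRET are placeholders):

// Admin-only: switch the active LLM backend (model_id 1-5).
const response = await fetch('/admin/models/switch', {
    method: 'POST',
    headers: {
        'Content-Type': 'application/json',
        ...(await generateAuthHeaders('POST', '/admin/models/switch', ADMIN_KEY_ID, ADMIN_SECRET))
    },
    body: JSON.stringify({ model_id: 2 })
});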


Error Handling

All errors return:

{ "detail": "Error description" }

| Code | Meaning |
|---|---|
| 400 | Bad Request - Missing/invalid parameters |
| 401 | Unauthorized - Invalid signature or expired key |
| 403 | Forbidden - Admin endpoint accessed with non-admin key |
| 404 | Not Found - Job doesn't exist or wrong owner |
| 413 | Payload Too Large - PDF > 10MB or JSON > 16KB |
| 429 | Rate Limited - See Retry-After header |
| 500 | Internal Error - RAG pipeline failure |

Special 429 Cases

| Detail Message | Cause | Frontend Action |
|---|---|---|
| "Rate limit exceeded" | Too many requests | Wait 60s, show countdown |
| "Too many concurrent uploads" | 3+ uploads in progress | Wait for pending jobs |
| "LLM quota exhausted" | Model API limit hit | Retry in 1hr or notify user |
| "Too many failed attempts" | Auth lockout | Wait 5 minutes |

Quick Reference: /ask Request Bodies

Standard Mode:

{
    "query": "Your question here",
    "subject": "evs",
    "unit": 2
}

Custom Upload Mode:

{
    "query": "Your question here", 
    "cluster": "temp_a1b2c3d4e5f6g7h8i9j0k1l2"
}

❌ Invalid (mixing modes):

{
    "query": "question",
    "subject": "evs",
    "cluster": "temp_..."
}
