2 changes: 1 addition & 1 deletion LICENSE
@@ -186,7 +186,7 @@
same "printed page" as the copyright notice for easier
identification within third-party archives.

-Copyright 2025 Stacklok, Inc.
+Copyright 2023 Stacklok, Inc.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
73 changes: 73 additions & 0 deletions Makefile
@@ -0,0 +1,73 @@
.PHONY: all setup test lint format check clean

# Default target
all: setup lint test

# Setup development environment
setup:
	python -m pip install --upgrade pip
	pip install -e ".[dev]"

# Run tests
test:
	pytest tests/ -v

# Run all linting and type checking
lint: format-check lint-check type-check

# Format code
format:
	black .
	isort .

# Check formatting
format-check:
	black --check .
	isort --check .

# Run linting
lint-check:
	ruff check .

# Run type checking
type-check:
	mypy src/

# Clean up
clean:
	rm -rf build/
	rm -rf dist/
	rm -rf *.egg-info
	rm -rf .pytest_cache
	rm -rf .mypy_cache
	rm -rf .ruff_cache
	find . -type d -name __pycache__ -exec rm -rf {} +
	find . -type f -name "*.pyc" -delete

# Build package
build: clean
	python -m build

# Install package locally
install:
	pip install -e .

# Install development dependencies
install-dev:
	pip install -e ".[dev]"

# Help target
help:
	@echo "Available targets:"
	@echo "  all          : Run setup, lint, and test"
	@echo "  setup        : Set up development environment"
	@echo "  test         : Run tests"
	@echo "  lint         : Run all code quality checks"
	@echo "  format       : Format code with black and isort"
	@echo "  format-check : Check code formatting"
	@echo "  lint-check   : Run ruff linter"
	@echo "  type-check   : Run mypy type checker"
	@echo "  clean        : Clean up build artifacts"
	@echo "  build        : Build package"
	@echo "  install      : Install package locally"
	@echo "  install-dev  : Install package with development dependencies"
157 changes: 34 additions & 123 deletions README.md
@@ -1,35 +1,16 @@
# Mock LLM Server

-[![CI](https://github.com/lukehinds/mockllm/actions/workflows/ci.yml/badge.svg)](https://github.com/lukehinds/mockllm/actions/workflows/ci.yml)
+[![CI](https://github.com/stacklok/mockllm/actions/workflows/ci.yml/badge.svg)](https://github.com/stacklok/mockllm/actions/workflows/ci.yml)
[![PyPI version](https://badge.fury.io/py/mockllm.svg)](https://badge.fury.io/py/mockllm)
-[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)


A FastAPI-based mock LLM server that mimics OpenAI and Anthropic API formats. Instead of calling actual language models,
it uses predefined responses from a YAML configuration file.

This is made for when you want a deterministic response for testing or development purposes.
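> Reviewer note: since the server speaks the OpenAI wire format, any plain HTTP client can exercise it. A minimal smoke-test sketch — the `/v1/chat/completions` path is an assumption (verify against `src/mockllm/server.py`); the port and the `mock-llm` model name come from `main.py` and the README examples in this diff:

```python
import requests

# Hypothetical smoke test against a locally running mock server.
# Endpoint path is assumed; model name and prompt are from the README examples.
resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "mock-llm",
        "messages": [{"role": "user", "content": "what colour is the sky?"}],
        "stream": False,
    },
)
resp.raise_for_status()
# With the example configuration, this prints the canned Rayleigh-scattering answer.
print(resp.json()["choices"][0]["message"]["content"])
```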

-Check out the [CodeGate](https://github.com/stacklok/codegate) when you're done here!
-
-## Project Structure
-
-```
-mockllm/
-├── src/
-│   └── mockllm/
-│       ├── __init__.py
-│       ├── config.py      # Response configuration handling
-│       ├── models.py      # Pydantic models for API
-│       └── server.py      # FastAPI server implementation
-├── tests/
-│   └── test_server.py     # Test suite
-├── example.responses.yml  # Example response configuration
-├── LICENSE                # MIT License
-├── MANIFEST.in            # Package manifest
-├── README.md              # This file
-├── pyproject.toml         # Project configuration
-└── requirements.txt       # Dependencies
-```
+Check out the [CodeGate](https://github.com/stacklok/codegate) project when you're done here!

## Features

@@ -53,7 +34,7 @@ pip install mockllm

1. Clone the repository:
```bash
-git clone https://github.com/lukehinds/mockllm.git
+git clone https://github.com/stacklok/mockllm.git
cd mockllm
```

@@ -168,114 +149,49 @@ defaults:

The server automatically detects changes to `responses.yml` and reloads the configuration without requiring a restart.

-## API Format
-
-### OpenAI Format
-
-#### Request Format
-
-```json
-{
-  "model": "mock-llm",
-  "messages": [
-    {"role": "user", "content": "what colour is the sky?"}
-  ],
-  "temperature": 0.7,
-  "max_tokens": 150,
-  "stream": false
-}
-```
-
-#### Response Format
-
-Regular response:
-```json
-{
-  "id": "mock-123",
-  "object": "chat.completion",
-  "created": 1700000000,
-  "model": "mock-llm",
-  "choices": [
-    {
-      "message": {
-        "role": "assistant",
-        "content": "The sky is blue during a clear day due to a phenomenon called Rayleigh scattering."
-      },
-      "finish_reason": "stop"
-    }
-  ],
-  "usage": {
-    "prompt_tokens": 10,
-    "completion_tokens": 5,
-    "total_tokens": 15
-  }
-}
-```
-
-Streaming response (Server-Sent Events format):
-```
-data: {"id":"mock-123","object":"chat.completion.chunk","created":1700000000,"model":"mock-llm","choices":[{"delta":{"role":"assistant"},"index":0}]}
-
-data: {"id":"mock-124","object":"chat.completion.chunk","created":1700000000,"model":"mock-llm","choices":[{"delta":{"content":"T"},"index":0}]}
-
-data: {"id":"mock-125","object":"chat.completion.chunk","created":1700000000,"model":"mock-llm","choices":[{"delta":{"content":"h"},"index":0}]}
-
-... (character by character)
-
-data: {"id":"mock-999","object":"chat.completion.chunk","created":1700000000,"model":"mock-llm","choices":[{"delta":{},"index":0,"finish_reason":"stop"}]}
-
-data: [DONE]
-```
-
-### Anthropic Format
-
-#### Request Format
-
-```json
-{
-  "model": "claude-3-sonnet-20240229",
-  "messages": [
-    {"role": "user", "content": "what colour is the sky?"}
-  ],
-  "max_tokens": 1024,
-  "stream": false
-}
-```
-
-#### Response Format
-
-Regular response:
-```json
-{
-  "id": "mock-123",
-  "type": "message",
-  "role": "assistant",
-  "model": "claude-3-sonnet-20240229",
-  "content": [
-    {
-      "type": "text",
-      "text": "The sky is blue during a clear day due to a phenomenon called Rayleigh scattering."
-    }
-  ],
-  "usage": {
-    "input_tokens": 10,
-    "output_tokens": 5,
-    "total_tokens": 15
-  }
-}
-```
-
-Streaming response (Server-Sent Events format):
-```
-data: {"type":"message_delta","id":"mock-123","delta":{"type":"content_block_delta","index":0,"delta":{"text":"T"}}}
-
-data: {"type":"message_delta","id":"mock-123","delta":{"type":"content_block_delta","index":0,"delta":{"text":"h"}}}
-
-... (character by character)
-
-data: [DONE]
-```
+## Development
+
+The project includes a Makefile to help with common development tasks:
+
+```bash
+# Set up development environment
+make setup
+
+# Run all checks (setup, lint, test)
+make all
+
+# Run tests
+make test
+
+# Format code
+make format
+
+# Run all linting and type checking
+make lint
+
+# Clean up build artifacts
+make clean
+
+# See all available commands
+make help
+```
+
+### Development Commands
+
+- `make setup`: Install all development dependencies
+- `make test`: Run the test suite
+- `make format`: Format code with black and isort
+- `make lint`: Run all code quality checks (format, lint, type)
+- `make build`: Build the package
+- `make clean`: Remove build artifacts and cache files
+- `make install-dev`: Install package with development dependencies
+
+For more details on available commands, run `make help`.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
=======
## Development

### Running Tests
@@ -301,8 +217,6 @@ ruff check .

## Error Handling

-The server includes comprehensive error handling:
-
- Invalid requests return 400 status codes with descriptive messages
- Server errors return 500 status codes with error details
- All errors are logged using JSON format
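> Reviewer note: the bullets above are easy to probe by hand. A hedged sketch, reusing the endpoint path assumed earlier — note that FastAPI's built-in validators may answer malformed bodies with 422 rather than 400, so the exact status code is worth confirming rather than hard-coding:

```python
import requests

# Send a deliberately malformed body and inspect the documented error behavior.
resp = requests.post(
    "http://localhost:8000/v1/chat/completions",  # path assumed, see note above
    json={"messages": "not-a-list"},  # malformed on purpose
)
print(resp.status_code)
print(resp.json())  # FastAPI errors typically arrive as {"detail": ...}
```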
@@ -319,6 +233,3 @@ The server uses JSON-formatted logging for:

Contributions are welcome! Please feel free to submit a Pull Request.

-## License
-
-This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
2 changes: 1 addition & 1 deletion main.py
@@ -3,4 +3,4 @@
from src.mockllm.server import app

if __name__ == "__main__":
-    uvicorn.run(app, host="0.0.0.0", port=8000, reload=True)
+    uvicorn.run(app, host="0.0.0.0", port=8000, reload=True)
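> Reviewer note, worth flagging for a follow-up: recent uvicorn releases refuse to enable `reload=True` when handed an app object rather than an import string (they warn and exit). A sketch of the import-string form — module path taken from the import above, assuming the server is launched from the repository root:

```python
import uvicorn

if __name__ == "__main__":
    # Passing an import string instead of the app object lets uvicorn's
    # reloader re-import the application on code changes.
    uvicorn.run("src.mockllm.server:app", host="0.0.0.0", port=8000, reload=True)
```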
5 changes: 3 additions & 2 deletions pyproject.toml
@@ -8,10 +8,10 @@ dynamic = ["version"]
description = "A mock server that mimics OpenAI and Anthropic API formats for testing"
readme = "README.md"
requires-python = ">=3.8"
license = {text = "Apache License (2.0)"}
license = {text = "Apache-2.0"}
keywords = ["mock", "llm", "openai", "anthropic", "testing"]
authors = [
{name = "Luke Hinds", email = "lhinds@redhat.com"}
{name = "Luke Hinds", email = "luke@stacklok.com"}
]
classifiers = [
"Development Status :: 4 - Beta",
@@ -77,3 +77,4 @@ line-length = 88
target-version = "py38"
select = ["E", "F", "B", "I"]
ignore = []
+
2 changes: 1 addition & 1 deletion src/mockllm/__init__.py
@@ -2,4 +2,4 @@
Mock LLM Server - You will do what I tell you!
"""

-__version__ = "0.1.0"
+__version__ = "0.1.0"
17 changes: 17 additions & 0 deletions src/mockllm/_version.py
@@ -0,0 +1,17 @@
# file generated by setuptools_scm
# don't change, don't track in version control
TYPE_CHECKING = False
if TYPE_CHECKING:
    from typing import Tuple, Union

    VERSION_TUPLE = Tuple[Union[int, str], ...]
else:
    VERSION_TUPLE = object

version: str
__version__: str
__version_tuple__: VERSION_TUPLE
version_tuple: VERSION_TUPLE

__version__ = version = "0.1.dev13+gb4dbfaf"
__version_tuple__ = version_tuple = (0, 1, "dev13", "gb4dbfaf")
24 changes: 14 additions & 10 deletions src/mockllm/config.py
@@ -11,6 +11,7 @@
logging.basicConfig(level=logging.INFO, handlers=[log_handler])
logger = logging.getLogger(__name__)

+
class ResponseConfig:
    """Handles loading and managing response configurations from YAML."""

@@ -26,34 +27,37 @@ def load_responses(self) -> None:
        try:
            current_mtime = Path(self.yaml_path).stat().st_mtime
            if current_mtime > self.last_modified:
-                with open(self.yaml_path, 'r') as f:
+                with open(self.yaml_path, "r") as f:
                    data = yaml.safe_load(f)
-                    self.responses = data.get('responses', {})
-                    self.default_response = data.get('defaults', {}).get(
-                        'unknown_response', self.default_response
+                    self.responses = data.get("responses", {})
+                    self.default_response = data.get("defaults", {}).get(
+                        "unknown_response", self.default_response
                    )
                self.last_modified = current_mtime
-                logger.info(f"Loaded {len(self.responses)} responses from {self.yaml_path}")
+                logger.info(
+                    f"Loaded {len(self.responses)} responses from {self.yaml_path}"
+                )
        except Exception as e:
            logger.error(f"Error loading responses: {str(e)}")
            raise HTTPException(
-                status_code=500,
-                detail="Failed to load response configuration"
+                status_code=500, detail="Failed to load response configuration"
            )

    def get_response(self, prompt: str) -> str:
        """Get response for a given prompt."""
        self.load_responses()  # Check for updates
        return self.responses.get(prompt.lower().strip(), self.default_response)

-    def get_streaming_response(self, prompt: str, chunk_size: Optional[int] = None) -> str:
+    def get_streaming_response(
+        self, prompt: str, chunk_size: Optional[int] = None
+    ) -> str:
        """Generator that yields response content character by character or in chunks."""
        response = self.get_response(prompt)
        if chunk_size:
            # Yield response in chunks
            for i in range(0, len(response), chunk_size):
-                yield response[i:i + chunk_size]
+                yield response[i : i + chunk_size]
        else:
            # Yield response character by character
            for char in response:
-                yield char
+                yield char
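> Reviewer note: to make the streaming contract concrete, a usage sketch. The constructor argument is an assumption inferred from the `self.yaml_path` attribute above, and the prompt/YAML file names are taken from the README and project tree shown in this diff:

```python
from mockllm.config import ResponseConfig

config = ResponseConfig("example.responses.yml")  # constructor signature assumed

# Whole response at once:
print(config.get_response("what colour is the sky?"))

# Streamed in 8-character chunks; omit chunk_size to stream per character.
for chunk in config.get_streaming_response("what colour is the sky?", chunk_size=8):
    print(chunk, end="", flush=True)
```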