| title | emoji | colorFrom | colorTo | sdk | pinned | short_description | tags |
|---|---|---|---|---|---|---|---|
| Chef Reachy | 👨‍🍳 | red | blue | static | false | Voice-activated food inventory assistant with Claude Agent SDK | |
A voice-activated food inventory assistant for Reachy Mini using Claude Agent SDK, local Whisper, and Claude Vision API.
- Voice activation - Say "Claude" to start a conversation
- Natural conversation - Multi-turn dialogue with stateful context
- Multi-angle capture - Captures 3 different angles of food packaging
- Claude Vision analysis - Extracts product name and expiration date from images
- Inventory management - Tracks food items with expiration dates
- Local speech-to-text - Whisper running on-device (no cloud STT costs)
- Text-to-speech - Kokoro-82M for natural voice responses
- WebSocket streaming - Real-time updates to web interface
- Persistent storage - Saves inventory to ~/.chef_reachy/inventory.json
Audio (Reachy mic) → Local Whisper → Wake word "Claude" → Claude Agent SDK
↓
[Tools: scan_food, get_inventory, remove_item]
↓
Camera captures → Claude Vision processes
↓
Inventory DB → Response → Kokoro-82M TTS → Audio out
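The wake-word step in the flow above can be sketched as a check on each Whisper transcript. This is a minimal illustration rather than the project's actual code: the helper name `extract_command` and the punctuation handling are assumptions.

```python
import re
from typing import Optional

WAKE_WORD = "claude"  # wake word from this README; the matching rules below are assumed


def extract_command(transcript: str, wake_word: str = WAKE_WORD) -> Optional[str]:
    """Return the text spoken after the wake word, or None if it is absent.

    Hypothetical helper: the real app may detect the wake word differently.
    """
    # Match the wake word as a whole word, ignoring case.
    match = re.search(rf"\b{re.escape(wake_word)}\b", transcript, flags=re.IGNORECASE)
    if match is None:
        return None
    # Drop the wake word plus any punctuation or spaces that follow it.
    return transcript[match.end():].lstrip(" ,.!?:;").strip()
```

A transcript without the wake word returns `None`, so the caller can skip the Claude API call entirely and keep listening.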
- Whisper STT (chef_reachy/audio/whisper.py) - Local speech-to-text using faster-whisper
- Claude Agent (chef_reachy/agent/) - Claude Agent SDK with custom tools for inventory management
- Inventory (chef_reachy/inventory/) - Persistent food item tracking
- Kokoro TTS (chef_reachy/audio/tts.py) - Text-to-speech for voice responses
- Python 3.12+
- Reachy Mini robot
- Anthropic API key (for Claude)
- Clone the repository:
git clone https://github.com/yourusername/chef_reachy.git
cd chef_reachy
- Install dependencies with uv:
uv sync
- Set up environment variables:
cp .env.example .env
# Edit .env and add your Anthropic API key
- Run the app:
uv run reachy-mini-apps run
- Say "Claude" to activate the assistant
- Ask questions or give commands naturally:
- "What's in my inventory?"
- "Add this item" (then show the food packaging to the camera)
- "Remove the milk from inventory"
- "Clear the inventory"
You: "Claude, what do I have in my fridge?"
Reachy: "You currently have 3 items: Organic Milk expiring on February 15th,
Greek Yogurt expiring on February 20th, and Cheddar Cheese expiring
on March 1st."
You: "Add this item"
Reachy: "I'll scan this item for you. Please hold it steady while I take
pictures from different angles."
[Captures 3 images]
"Added Organic Eggs expiring on February 25th to your inventory."
Edit chef_reachy/main.py to change Whisper model size:
whisper_config = WhisperConfig(
    model_size="base",  # Options: tiny, base, small, medium, large
    device="cpu",
    compute_type="int8",  # Options: int8, float16, float32
)
Smaller models are faster but less accurate. Recommended:
- tiny - Fastest, good for simple speech (~1GB RAM)
- base - Balanced speed/accuracy (~1.5GB RAM) [Default]
- small - Better accuracy (~2GB RAM)
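To show how these settings might reach faster-whisper, here is a hedged sketch. The `WhisperConfig` below is an illustrative stand-in whose fields are assumed from the snippet above, and `load_model` and `transcribe_file` are hypothetical helpers; only `WhisperModel(...)` and `model.transcribe(...)` are faster-whisper calls.

```python
from dataclasses import dataclass

# Values taken from the option comments in this README.
MODEL_SIZES = ("tiny", "base", "small", "medium", "large")
COMPUTE_TYPES = ("int8", "float16", "float32")


@dataclass(frozen=True)
class WhisperConfig:
    """Illustrative stand-in for the project's WhisperConfig (fields assumed)."""

    model_size: str = "base"
    device: str = "cpu"
    compute_type: str = "int8"

    def __post_init__(self) -> None:
        # Fail early on a typo instead of at model download time.
        if self.model_size not in MODEL_SIZES:
            raise ValueError(f"unknown model_size: {self.model_size!r}")
        if self.compute_type not in COMPUTE_TYPES:
            raise ValueError(f"unknown compute_type: {self.compute_type!r}")


def load_model(config: WhisperConfig):
    """Hypothetical helper: build a faster-whisper model from the config."""
    # Imported lazily so the config can be validated without the dependency.
    from faster_whisper import WhisperModel

    return WhisperModel(
        config.model_size, device=config.device, compute_type=config.compute_type
    )


def transcribe_file(model, audio_path: str) -> str:
    """Hypothetical helper: join the segment texts into one transcript."""
    segments, _info = model.transcribe(audio_path)
    return " ".join(segment.text.strip() for segment in segments)
```

Validating in `__post_init__` is a design choice of this sketch, not something the project is known to do.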
Edit chef_reachy/agent/config.py:
@dataclass
class AgentConfig:
    model: str = "claude-3-5-sonnet-20241022"  # Claude model
    max_tokens: int = 1024
    temperature: float = 0.7
The assistant has these tools:
- scan_food_item - Capture and analyze food packaging
  - Takes 3 photos at different angles (3 seconds apart)
  - Sends images to Claude Vision API
  - Extracts product name and expiration date
  - Adds item to inventory
- get_inventory - Retrieve all items
  - Returns product names, expiration dates, and expired status
- remove_item - Remove item by name
  - Removes first matching item from inventory
- clear_inventory - Clear all items
  - Empties the entire inventory
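The inventory-side behavior of these tools can be sketched as follows. This is a simplified illustration, not the code in chef_reachy/inventory/: the field names, the ISO date format, the case-insensitive substring match, and every method signature are assumptions.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date
from pathlib import Path


@dataclass
class FoodItem:
    name: str
    expiration_date: str  # ISO date such as "2025-02-15" (format assumed)

    def is_expired(self, today: date) -> bool:
        return date.fromisoformat(self.expiration_date) < today


class InventoryManager:
    """Hypothetical sketch of a JSON-backed inventory."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.items: list[FoodItem] = []
        if path.exists():
            self.items = [FoodItem(**raw) for raw in json.loads(path.read_text())]

    def _save(self) -> None:
        # Persist after every change so a restart keeps the inventory.
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps([asdict(i) for i in self.items], indent=2))

    def add_item(self, item: FoodItem) -> None:
        self.items.append(item)
        self._save()

    def get_inventory(self, today: date) -> list[dict]:
        # Each entry carries name, expiration date, and expired status.
        return [{**asdict(i), "expired": i.is_expired(today)} for i in self.items]

    def remove_item(self, name: str) -> bool:
        # Remove only the first match; report whether anything was removed.
        for index, item in enumerate(self.items):
            if name.lower() in item.name.lower():
                del self.items[index]
                self._save()
                return True
        return False

    def clear_inventory(self) -> None:
        self.items.clear()
        self._save()
```

In the app the path would presumably be `Path.home() / ".chef_reachy" / "inventory.json"`, matching the storage location listed in the features.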
- Latency: ~3-5 seconds per interaction
  - Whisper transcription: ~1-2s
  - Claude API: ~2-3s
  - TTS: ~500ms
- Memory: ~2-3GB total
  - Whisper base: ~1.5GB
  - Kokoro TTS: ~100MB
  - Application: ~500MB
- Cost: ~$5-10/month for typical use
  - Claude API: ~$0.003 per request (text)
  - Claude Vision: ~$0.005 per image analysis
  - No STT costs (local Whisper)
  - No TTS costs (local Kokoro)
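The monthly figure can be sanity-checked with a little arithmetic. The sketch below uses the per-request estimates listed above; the usage numbers, the 30-day month, and the reading of the Vision price as per image (rather than per 3-image scan) are assumptions.

```python
TEXT_REQUEST_USD = 0.003    # estimate from this README, per text request
IMAGE_ANALYSIS_USD = 0.005  # estimate from this README, assumed to apply per image
IMAGES_PER_SCAN = 3         # scan_food_item takes 3 photos


def estimate_monthly_cost(
    text_requests_per_day: int, scans_per_day: int, days: int = 30
) -> float:
    """Rough monthly API cost in USD under the assumptions above."""
    daily = (
        text_requests_per_day * TEXT_REQUEST_USD
        + scans_per_day * IMAGES_PER_SCAN * IMAGE_ANALYSIS_USD
    )
    return round(daily * days, 2)
```

With 50 text requests and 2 scans a day, `estimate_monthly_cost(50, 2)` comes to about $5.40, at the low end of the quoted range.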
# Install dev dependencies
uv sync --group dev
# Run type checking
uv run pyright
# Run linting
uv run ruff check
uv run ruff format
chef_reachy/
├── agent/ # Claude Agent SDK integration
│   ├── config.py # Agent configuration
│   └── tools.py # Custom tools (scan, inventory, etc.)
├── audio/ # Speech processing
│   ├── whisper.py # Whisper STT
│   └── tts.py # Kokoro TTS
├── inventory/ # Inventory management
│   ├── models.py # FoodItem model
│   └── manager.py # InventoryManager
├── static/ # Web UI assets
└── main.py # Main application
Create a .env file with your API key:
ANTHROPIC_API_KEY=sk-ant-your-key-here
The first run downloads the Whisper model (~300MB for base). Ensure you have:
- Internet connection
- Sufficient disk space (~1GB)
- Write access to ~/.cache/huggingface/
Check that:
- Reachy's microphone is working
- media_backend="default" is set in main.py
- No other app is using the microphone
This project uses third-party models with their own licenses:
- Whisper - MIT License (OpenAI)
- Kokoro-82M - Apache 2.0 License
- Claude API - Anthropic Terms of Service
- faster-whisper - MIT License
Built with: