MemU is an agentic memory framework for LLM and AI agent backends. It receives multi-modal inputs, extracts them into memory items, and then organizes and summarizes these items into structured memory files.
Unlike traditional RAG systems that rely solely on embedding-based search, MemU supports non-embedding retrieval through direct file reading. The LLM comprehends natural language memory files directly, enabling deep search by progressively tracking from categories → items → original resources.
MemU is commonly used in scenarios where memory matters:
- Capture user behavior and context to build user profiles
- Track agent successes and failures for self-improvement
- Integrate multi-source data for knowledge management
- Enable personalized recommendations with deep memory
MemU offers several convenient ways to get started right away:
- Full backend for local deployment → memU-server: https://github.com/NevaMind-AI/memU-server
- Local visual interface → memU-ui: https://github.com/NevaMind-AI/memU-ui
- One call = response + memory → memU Response API: https://memu.pro/docs#responseapi
- Try MemU cloud version instantly → https://app.memu.so/quick-start
Star MemU to get notified about new releases and join our growing community of AI developers building intelligent agents with persistent memory capabilities.
💬 Join our Discord community: https://discord.gg/memu
The current version initializes the memorize and retrieve workflows with the new 3-layer architecture. More features are coming soon as we continue to expand MemU's capabilities.
- Multi-modal enhancements: Support for images, audio, and video
- User model and user context store
- Intention: Higher-level decision-making and goal management
- Multi-client support: Switch between OpenAI, Deepseek, Gemini, etc.
- Data persistence expansion: Support for Postgres, S3, DynamoDB
- Benchmark tools: Test agent performance and memory efficiency
- …
- memU-ui: The web frontend for MemU, providing developers with an intuitive and visual interface
- memU-server: Powers memU-ui with reliable data support, ensuring efficient reading, writing, and maintenance of agent memories
Most memory systems in current LLM pipelines rely heavily on explicit modeling, requiring manual definition and annotation of memory categories. This limits AI's ability to truly understand memory and makes it difficult to support diverse usage scenarios.
MemU offers a flexible and robust alternative, inspired by the hierarchical storage architecture of computer systems. It progressively transforms heterogeneous input data into queryable, interpretable textual memory.
Its core architecture consists of three layers: Resource Layer → Memory Item Layer → MemoryCategory Layer.
- Resource Layer: Multimodal raw data warehouse
- Memory Item Layer: Discrete extracted memory units
- MemoryCategory Layer: Aggregated textual memory units
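As a purely illustrative sketch (these dataclasses are not MemU's actual types), the three layers and the back-pointers that make memories traceable could be modeled like this:

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Resource Layer: one piece of raw multimodal input."""
    resource_id: str
    url: str
    modality: str  # e.g. "conversation", "image", "document"


@dataclass
class MemoryItem:
    """Memory Item Layer: one discrete fact extracted from a resource."""
    item_id: str
    summary: str
    resource_id: str  # back-pointer enabling full traceability


@dataclass
class MemoryCategory:
    """MemoryCategory Layer: a readable document aggregating related items."""
    name: str
    summary: str
    item_ids: list[str] = field(default_factory=list)  # back-pointers to items
```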
Key Features:
- Full Traceability: Track from raw data → items → documents and back
- Memory Lifecycle: Memorization → Retrieval → Self-evolution
- Two Retrieval Methods:
  - RAG-based: Fast embedding vector search
  - LLM-based: Direct file reading with deep semantic understanding (see the sketch after this list)
- Self-Evolving: Adapts memory structure based on usage patterns
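To make the LLM-based progressive search concrete, here is a conceptual sketch (not MemU internals), reusing the illustrative dataclasses above; `llm_select` stands in for a hypothetical LLM call that keeps only the entries relevant to the query:

```python
def progressive_search(query, categories, items_by_id, llm_select):
    # Step 1: read category summaries and keep the relevant categories.
    relevant_categories = llm_select(query, categories)
    # Step 2: drill down into the memory items of those categories.
    candidates = [items_by_id[i] for c in relevant_categories for i in c.item_ids]
    hits = llm_select(query, candidates)
    # Step 3: follow back-pointers to the original resources when
    # deeper context is needed.
    return hits, {h.resource_id for h in hits}
```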
| | memU | memU-server | memU-ui |
|---|---|---|---|
| Positioning | Core algorithm engine | Memory data backend service | Front-end dashboard |
| Key Features | Core algorithms<br>Memory extraction<br>Multi-strategy retrieval<br>… | Memory CRUD<br>Retrieval record tracking<br>Token usage & billing tracking<br>User system<br>RBAC permission system<br>Security boundary controls<br>… | Front-end interface<br>Visual memory viewer<br>User management UI<br>Data retrieval UI<br>Easy self-hosting experience<br>… |
| Best For | Developers/teams who want to embed AI memory algorithms into their product | Teams that want to self-host a memory backend (internal tools, research, enterprise setups) | Developers/teams looking for a ready-to-use memory console |
| Usage | Core algorithms can be used standalone or integrated into the server | Self-hostable; works together with memU | Self-hostable; integrates with memU-server |
Summary: memU, memU-server, and memU-ui together form a flexible memory ecosystem for LLMs and AI agents.
```bash
pip install memu-py
```
⚠️ Important: Ensure you have Python 3.14+.
⚠️ Important: Replace `"your-openai-api-key"` with your actual OpenAI API key to use the service.
```python
import asyncio
import os

from memu.app import MemoryService


async def main():
    api_key = "your-openai-api-key"
    file_path = os.path.abspath("path/to/memU/tests/example/example_conversation.json")

    # Initialize service with the RAG retrieval method
    service_rag = MemoryService(
        llm_config={"api_key": api_key},
        embedding_config={"api_key": api_key},
        retrieve_config={"method": "rag"},
    )

    # Memorize: extract items and organize them into categories
    memory = await service_rag.memorize(resource_url=file_path, modality="conversation")
    for cat in memory.get("categories", []):
        print(f"  - {cat.get('name')}: {(cat.get('summary') or '')[:80]}...")

    queries = [
        {"role": "user", "content": {"text": "Tell me about preferences"}},
        {"role": "user", "content": {"text": "What are their habits?"}},
    ]

    # RAG-based retrieval
    print("\n[RETRIEVED - RAG]")
    result_rag = await service_rag.retrieve(queries=queries)
    for item in result_rag.get("items", [])[:3]:
        print(f"  - [{item.get('memory_type')}] {item.get('summary', '')[:100]}...")

    # Initialize service with the LLM retrieval method (reuse same memory store)
    service_llm = MemoryService(
        llm_config={"api_key": api_key},
        embedding_config={"api_key": api_key},
        retrieve_config={"method": "llm"},
    )
    service_llm.store = service_rag.store  # Reuse memory store

    # LLM-based retrieval
    print("\n[RETRIEVED - LLM]")
    result_llm = await service_llm.retrieve(queries=queries)
    for item in result_llm.get("items", [])[:3]:
        print(f"  - [{item.get('memory_type')}] {item.get('summary', '')[:100]}...")


if __name__ == "__main__":
    asyncio.run(main())
```

- RAG-based (`method="rag"`): Fast embedding vector search for large-scale data
- LLM-based (`method="llm"`): Deep semantic understanding through direct file reading
Both support:
- Context-aware rewriting: Resolves pronouns using conversation history
- Progressive search: Categories → Items → Resources
- Next-step suggestions: Iterative multi-turn retrieval
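For instance, context-aware rewriting lets a follow-up query resolve pronouns against earlier turns. A minimal sketch reusing `service_rag` from the quickstart above; the name "Alice" and the printed fields are illustrative:

```python
# (run inside an async function, as in the quickstart's main())
# Earlier turns give the retriever context to rewrite "their" as "Alice's".
history = [
    {"role": "user", "content": {"text": "Tell me about Alice's preferences"}},
]
follow_up = {"role": "user", "content": {"text": "What are their habits?"}}

result = await service_rag.retrieve(queries=history + [follow_up])
for item in result.get("items", []):
    print(item.get("summary", ""))
```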
MemU provides practical examples demonstrating different memory extraction and organization scenarios. Each example showcases a specific use case with real-world applications.
Extract and organize memory from multi-turn conversations. Perfect for:
- Personal AI assistants that remember user preferences and history
- Customer support bots maintaining conversation context
- Social chatbots building user profiles over time
Example: Process multiple conversation files and automatically categorize memories into personal_info, preferences, work_life, relationships, etc.
```bash
export OPENAI_API_KEY=your_api_key
python examples/example_1_conversation_memory.py
```

What it does:
- Processes conversation JSON files
- Extracts memory items (preferences, habits, opinions)
- Organizes into structured categories
- Generates readable markdown files for each category
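A minimal sketch of how Example 1's loop might look, assuming the quickstart API; the conversation file paths are hypothetical:

```python
import asyncio

from memu.app import MemoryService


async def build_profile():
    service = MemoryService(
        llm_config={"api_key": "your-openai-api-key"},
        embedding_config={"api_key": "your-openai-api-key"},
        retrieve_config={"method": "rag"},
    )
    # Hypothetical paths; categories such as personal_info and preferences
    # are produced automatically during memorization.
    for path in ["chats/day_1.json", "chats/day_2.json"]:
        memory = await service.memorize(resource_url=path, modality="conversation")
        print([cat.get("name") for cat in memory.get("categories", [])])


asyncio.run(build_profile())
```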
Extract skills and lessons learned from agent execution logs. Ideal for:
- DevOps teams learning from deployment experiences
- Agent systems improving through iterative execution
- Knowledge management from operational logs
Example: Process deployment logs incrementally, learning from each attempt to build a comprehensive skill guide.
```bash
export OPENAI_API_KEY=your_api_key
python examples/example_2_skill_extraction.py
```

What it does:
- Processes agent logs sequentially
- Extracts actions, outcomes, and lessons learned
- Demonstrates incremental learning (memory evolves with each file)
- Generates evolving skill guides (log_1.md → log_2.md → log_3.md → skill.md)
Key Feature: Demonstrates MemU's core strength, continuous memory updates. Each file updates existing memory, and category summaries evolve progressively.
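A sketch of that incremental loop, assuming the quickstart API; the log paths and the `modality` value are assumptions:

```python
# (inside an async function; `service` as in the quickstart)
# Each memorize call folds the next log into the same service's store,
# so existing categories are updated rather than rebuilt from scratch.
for log_path in ["logs/log_1.json", "logs/log_2.json", "logs/log_3.json"]:
    memory = await service.memorize(resource_url=log_path,
                                    modality="conversation")  # assumed modality
    for cat in memory.get("categories", []):
        print(cat.get("name"), "->", (cat.get("summary") or "")[:60])
```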
Process diverse content types (documents, images, videos) into unified memory. Great for:
- Documentation systems processing mixed media
- Learning platforms combining text and visual content
- Research tools analyzing multimodal data
Example: Process technical documents and architecture diagrams together, creating unified memory categories.
```bash
export OPENAI_API_KEY=your_api_key
python examples/example_3_multimodal_memory.py
```

What it does:
- Processes multiple modalities (text documents, images)
- Extracts memory from different content types
- Unifies memories into cross-modal categories
- Creates organized documentation (technical_documentation, architecture_concepts, code_examples, visual_diagrams)
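A sketch of feeding mixed media into one store; the file paths and the `"document"` and `"image"` modality strings are assumptions based on this example's description:

```python
# (inside an async function; `service` as in the quickstart)
sources = [
    ("docs/architecture_overview.md", "document"),  # assumed modality name
    ("docs/system_diagram.png", "image"),           # assumed modality name
]
for url, modality in sources:
    await service.memorize(resource_url=url, modality=modality)
# Items from both modalities are aggregated into shared categories,
# e.g. architecture_concepts or visual_diagrams.
```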
Build an end-to-end, real-time voice assistant with long-term memory using MemU and TEN Framework, a multi-modal conversational AI framework.
Tutorial: https://memu.pro/blog/build-real-time-voice-agent
Build a long-term memory Q&A assistant in just 20 lines of code using MemU and LazyLLM, an all-in-one multi-agent framework.
Tutorial: https://ai.feishu.cn/wiki/By6IwM7Kfinyf0kbM1xcrrcfnnd
By contributing to MemU, you agree that your contributions will be licensed under the Apache License 2.0.
For more information, please contact [email protected]
- GitHub Issues: Report bugs, request features, and track development. Submit an issue
- Discord: Get real-time support, chat with the community, and stay updated. Join us
- X (Twitter): Follow for updates, AI insights, and key announcements. Follow us
We're proud to work with amazing organizations:
Interested in partnering with MemU? Contact us at [email protected]

