
Advanced Features

Welcome to LiteSwarm's advanced features documentation. This guide explores the powerful capabilities that enable you to build sophisticated AI applications. Whether you're orchestrating complex agent workflows, handling structured outputs, or building stateful chat applications, you'll find detailed explanations and practical examples here.

Table of Contents

  1. Building Agent Teams

  2. Structured Outputs

  3. Error Handling

  4. Type-Safe Context

  5. Tool Use and Agent Switching

  6. Chat API

  7. Building with LiteSwarm

Building Agent Teams

While LiteSwarm's core is lightweight and focused, it provides all the building blocks needed to create sophisticated agent teams. Let's explore how to build a managed team of specialized agents that can collaborate on complex tasks.

Core Concepts

  1. Tasks: Define structured work units with clear inputs and outputs:
from typing import Literal
from pydantic import BaseModel

class TaskBase(BaseModel):
    """Base class for all task types."""
    type: str
    id: str
    title: str
    description: str

    def build_prompt(self) -> str:
        """Build a prompt for this task."""
        raise NotImplementedError()

class AlgorithmTask(TaskBase):
    """Task for implementing algorithms."""
    type: Literal["algorithm_task"]
    complexity_requirements: list[str]
    performance_criteria: list[str]
  2. Outputs: Type-safe results from task execution:
class OutputBase(BaseModel):
    """Base class for all output types."""
    thoughts: list[str]
    filepath: str

class AlgorithmOutput(OutputBase):
    """Output from algorithm implementation."""
    code: str
    complexity_analysis: str
    performance_notes: str
  3. Specialized Agents: Agents with specific skills and instructions:
def create_algorithm_agent() -> Agent[None, AlgorithmOutput]:
    """Create an agent specialized in implementing algorithms."""
    return Agent(
        id="algorithm_engineer",
        instructions=ALGORITHM_ENGINEER_INSTRUCTIONS,
        llm=LLM(
            model="gpt-4o",
            response_format=AlgorithmOutput,
        ),
        output_type=AlgorithmOutput,
    )
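
Note that AlgorithmTask above inherits build_prompt from TaskBase but never overrides it. A minimal override might look like this (the exact prompt wording is illustrative):

class AlgorithmTask(TaskBase):
    """Task for implementing algorithms."""
    type: Literal["algorithm_task"]
    complexity_requirements: list[str]
    performance_criteria: list[str]

    def build_prompt(self) -> str:
        """Render this task as a prompt for the algorithm agent."""
        requirements = "\n".join(f"- {r}" for r in self.complexity_requirements)
        criteria = "\n".join(f"- {c}" for c in self.performance_criteria)
        return (
            f"{self.title}\n{self.description}\n\n"
            f"Complexity requirements:\n{requirements}\n\n"
            f"Performance criteria:\n{criteria}"
        )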

Implementation

The core of agent team implementation is the orchestration class:

class AgentTeam:
    """Orchestrates how tasks get executed with the swarm."""

    def __init__(
        self,
        swarm: Swarm,
        planner: Agent[None, Plan],
        agents: dict[str, Agent[None, Any]],
    ) -> None:
        self.swarm = swarm
        self.planner = planner
        self.agents = agents

    async def create_plan(self, messages: list[Message]) -> Plan:
        """Use the planner agent to create an execution plan."""
        stream = self.swarm.stream(
            agent=self.planner,
            messages=messages,
            final_output_type=self.planner.output_type,
        )
        # ... handle streaming and return plan

    async def execute_task(
        self,
        task: TaskBase,
        messages: list[Message],
    ) -> OutputBase:
        """Execute a single task with the appropriate agent."""
        agent = self.agents[task.type]
        stream = self.swarm.stream(
            agent=agent,
            messages=messages + [Message(role="user", content=task.build_prompt())],
            final_output_type=agent.output_type,
        )
        # ... handle streaming and return output
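
The elided streaming logic can be completed with stream.get_return_value(), the same helper used in the Streaming Support section below. A sketch for create_plan (assuming the planner's output_type is Plan):

async def create_plan(self, messages: list[Message]) -> Plan:
    """Use the planner agent to create an execution plan."""
    stream = self.swarm.stream(
        agent=self.planner,
        messages=messages,
        final_output_type=self.planner.output_type,
    )
    # Drain the stream, then collect the validated final output.
    result = await stream.get_return_value()
    if result.final_response.output is None:
        raise ValueError("Planner did not produce a structured plan")
    return result.final_response.output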

Execution Flow

  1. Planning Phase: Break down the problem into tasks:
# Create team with planner and specialized agents
team = AgentTeam(
    swarm=swarm,
    planner=create_planner_agent(),
    agents={
        "algorithm_task": create_algorithm_agent(),
        "test_task": create_test_agent(),
        "documentation_task": create_documentation_agent(),
    },
)

# Create execution plan
plan = await team.create_plan(messages)
  2. Execution Phase: Execute tasks with specialized agents:
# Execute each task in the plan
results = []
for task in plan.tasks:
    output = await team.execute_task(task, messages)
    results.append(output)
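
The Plan model itself is not shown above; a minimal sketch (field names are illustrative), along with one common pattern of feeding each task's output back into the conversation so later tasks can build on earlier results:

class Plan(BaseModel):
    """Ordered list of tasks produced by the planner agent."""
    tasks: list[AlgorithmTask]  # in practice, a union of all task types

# Feed each output back into the context for subsequent tasks.
results = []
for task in plan.tasks:
    output = await team.execute_task(task, messages)
    results.append(output)
    messages.append(Message(role="assistant", content=output.model_dump_json()))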

Best Practices

  1. Clear Task Boundaries:

    • Define clear task types and responsibilities
    • Use Pydantic models for type safety
    • Include validation in task models
  2. Specialized Agents:

    • Give each agent focused instructions
    • Use appropriate response formats
    • Consider temperature settings based on task
  3. Robust Planning:

    • Provide clear planning criteria
    • Include dependencies between tasks
    • Allow for task prioritization
  4. Error Handling:

    • Handle agent failures gracefully
    • Provide fallback strategies
    • Log execution details for debugging

For a complete example of building an agent team for software development, see our agent_team example.

Structured Outputs

LiteSwarm provides a powerful two-layer approach to structured outputs:

  1. LLM's response_format defines what the model should return
  2. Agent's output_type and optional output_parser define what the agent produces

This separation allows for flexible output handling while maintaining type safety throughout the execution.

Basic Usage

Here's how to use both layers for maximum type safety:

from pydantic import BaseModel
from liteswarm import LLM, Agent, Message, Swarm


class MathResult(BaseModel):
    thoughts: str
    result: int


agent = Agent(
    id="math_expert",
    instructions="You are a math expert.",
    llm=LLM(
        model="gpt-4o",
        response_format=MathResult,  # Layer 1: What LLM produces
    ),
    output_type=MathResult,  # Layer 2: What agent returns
)

swarm = Swarm()
result = await swarm.run(
    agent=agent,
    messages=[Message(role="user", content="What is 2 + 2 * 2?")],
    final_output_type=MathResult,
)

if result.final_response.output:  # Type-safe MathResult
    print(f"Result: {result.final_response.output.result}")

Custom Output Parsing

You can use custom parsers when the LLM doesn't produce structured output:

import re

def parse_math_result(content: str) -> MathResult:
    """Parse free-form text into MathResult."""
    # Example: extract the number from "The answer is 42".
    match = re.search(r"answer is (\d+)", content)
    result = int(match.group(1)) if match else 0
    return MathResult(
        thoughts=content,
        result=result,
    )

agent = Agent(
    id="math_expert",
    instructions="You are a math expert.",
    llm=LLM(model="gpt-4o"),  # No response_format
    output_type=MathResult,
    output_parser=parse_math_result,  # Custom parser
)
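
With the parser attached, execution works exactly as before, and final_output_type still validates the parsed result:

result = await swarm.run(
    agent=agent,
    messages=[Message(role="user", content="What is 6 * 7?")],
    final_output_type=MathResult,
)

if result.final_response.output:  # Parsed via parse_math_result
    print(f"Result: {result.final_response.output.result}")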

Streaming Support

Both layers support streaming with different guarantees:

  1. LLM Response Format:

    • Streams partial JSON as it's being generated
    • May not satisfy schema until complete
    async for event in stream:
        if event.type == "agent_response_chunk":
            if event.response_chunk.parsed:
                partial = event.response_chunk.parsed  # Valid JSON object
  2. Agent Output Type:

    • Parsed after response is complete
    • Guaranteed to match output type
    result = await stream.get_return_value()
    if result.final_response.output:
        math_result = result.final_response.output  # Type: MathResult
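
Putting both layers together, a single streaming loop might look like this (a sketch based on the event fields shown above):

stream = swarm.stream(
    agent=agent,
    messages=[Message(role="user", content="What is 2 + 2 * 2?")],
    final_output_type=MathResult,
)

# Layer 1: observe partial JSON while the response is generated.
async for event in stream:
    if event.type == "agent_response_chunk" and event.response_chunk.parsed:
        print(f"Partial: {event.response_chunk.parsed}")

# Layer 2: the final output is validated against MathResult.
result = await stream.get_return_value()
if result.final_response.output:
    print(f"Final: {result.final_response.output.result}")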

Validation Helpers

LiteSwarm provides built-in validation through method parameters:

# Swarm methods accept final_output_type for validation
result = await swarm.run(
    agent=agent,
    messages=messages,
    final_output_type=MathResult,  # Validates final output
)

# By default, validation errors raise exceptions (strict mode)
try:
    result = await swarm.run(
        agent=agent,
        messages=messages,
        final_output_type=MathResult,
    )
except SwarmError as e:
    print(f"Output validation failed: {e}")

# You can disable strict mode to get warnings instead
swarm = Swarm(strict=False)
result = await swarm.run(
    agent=agent,
    messages=messages,
    final_output_type=MathResult,  # Will warn on validation failure
)

Agent Switching and Type Safety

When using agent switching, each agent can have its own output types:

class CodeResult(BaseModel):
    code: str
    language: str

@tool_plain
def switch_to_coder() -> ToolResult:
    return ToolResult.switch_to(
        Agent(
            id="coder",
            instructions="You write code.",
            llm=LLM(model="gpt-4o"),
            output_type=CodeResult,  # Different output type
        ),
    )

# Final result type depends on last agent
result = await swarm.run(
    agent=math_agent,  # Starts with MathResult
    messages=messages,
)

# Type depends on which agent finished
output = result.final_response.output
if output:
    if isinstance(output, MathResult):
        print(f"Math result: {output.result}")
    elif isinstance(output, CodeResult):
        print(f"Code result: {output.code}")

Provider Support

Not all LLM providers support structured outputs. When using providers that do (like OpenAI):

  1. The LLM will generate responses matching your schema
  2. Streaming partial parsing is handled automatically
  3. Final validation ensures type safety

For providers without schema support, you may need to:

  • Include schema in prompts
  • Handle parsing errors gracefully
  • Use more robust validation

For OpenAI compatibility, LiteSwarm provides helper methods:

from liteswarm.utils.pydantic import remove_default_values, restore_default_values

# Remove defaults before passing to OpenAI
schema_no_defaults = remove_default_values(YourModel)
# Restore defaults after receiving response
restored_output = restore_default_values(response, YourModel)

Provider Limitations

Some providers may have limitations that affect structured outputs:

  1. Token Limits:

    • Providers may have token limits that affect the size of responses
    • This can be mitigated by:
      • Breaking down large tasks into smaller chunks
      • Using streaming to handle large responses
  2. Schema Compatibility:

    • Providers may not support all Pydantic models
    • This can be mitigated by:
      • Using compatible models
      • Customizing output parsing

Output Strategies

LiteSwarm's built-in support for structured outputs focuses on two key features:

  1. LLM Response Format

    • Pass Pydantic models to response_format in LLM configuration
    • Automatic JSON schema generation and validation
    • Streaming-aware partial JSON parsing
  2. Execution Result Format

    • Type-safe validation via final_output_type in Swarm methods
    • Guaranteed parsed type in final execution results
    • Consistent across Core and Chat APIs

While these cover most use cases, we provide examples of alternative strategies in our playground:

# Example: Prompt engineering strategy
instructions = """
Generate a response in the following JSON format:
{
    "thoughts": "string",  # Your reasoning
    "result": "number"     # Final calculation
}

Rules:
1. Use valid JSON syntax
2. Include both fields
3. Ensure result is a number
"""

# Example: XML + JSON hybrid strategy
instructions = """
Structure your response as follows:

<thoughts>
Your step-by-step reasoning in natural language
</thoughts>

<result>
{
    "calculation": number,
    "confidence": number
}
</result>
"""

These examples demonstrate different approaches, but note that they are not part of LiteSwarm's core functionality. For production use, we recommend using the built-in response_format support when possible.
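
If you do adopt the XML + JSON hybrid strategy, you can pair it with a custom output_parser. A sketch (the HybridResult model and regex are illustrative, not part of the library):

import json
import re

class HybridResult(BaseModel):
    calculation: float
    confidence: float

def parse_hybrid(content: str) -> HybridResult:
    """Extract the JSON payload from the <result> block."""
    match = re.search(r"<result>\s*(\{.*?\})\s*</result>", content, re.DOTALL)
    if match is None:
        raise ValueError("No <result> block found in response")
    return HybridResult.model_validate(json.loads(match.group(1)))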

For practical examples of different strategies and handling provider limitations, see our advanced/structured_outputs example.

For complete examples of both approaches, see the examples directory.

Error Handling

LiteSwarm provides comprehensive error handling through specialized error types that inherit from SwarmError. These types help you handle specific failure cases in your applications.

Core Error Types

  1. CompletionError:

    • Raised when LLM API calls fail permanently
    • Includes the original error for debugging
    try:
        result = await swarm.run(agent, messages)
    except CompletionError as e:
        print(f"API error: {e}")
        if e.original_error:
            print(f"Original error: {e.original_error}")
  2. ContextLengthError:

    • Raised when input exceeds model's context limit
    • Provides details about length constraints
    try:
        result = await swarm.run(agent, long_messages)
    except ContextLengthError as e:
        print(f"Context too long for {e.model}")
        print(f"Current: {e.current_length} > Max: {e.max_length}")
  3. MaxAgentSwitchesError:

    • Raised when too many agent switches occur
    • Helps prevent infinite switch loops
    try:
        result = await swarm.run(router_agent, messages)
    except MaxAgentSwitchesError as e:
        print(f"Too many switches: {e.switch_count}/{e.max_switches}")
        print(f"History: {e.switch_history}")
  4. MaxResponseContinuationsError:

    • Raised when response needs too many continuations
    • Prevents indefinite response growth
    try:
        result = await swarm.run(agent, complex_query)
    except MaxResponseContinuationsError as e:
        print(f"Response too long: {e.continuation_count}/{e.max_continuations}")
        if e.total_tokens:
            print(f"Total tokens: {e.total_tokens}")

Error Handling Strategies

  1. Catch Specific Errors:

    try:
        result = await swarm.run(agent, messages)
    except ContextLengthError:
        # Try reducing context or using a different model
        result = await fallback_strategy()
    except CompletionError:
        # Handle API failures (retry, use different provider)
        result = await retry_with_backoff()
    except SwarmError as e:
        # Handle any other Swarm-related errors
        logger.error(f"Swarm error: {e}")
  2. Validation Errors:

    • By default, Swarm runs in strict mode
    • Validation failures raise exceptions
    # Strict mode (default)
    try:
        result = await swarm.run(
            agent=agent,
            messages=messages,
            final_output_type=ExpectedType,
        )
    except SwarmError as e:
        print(f"Validation failed: {e}")
    
    # Non-strict mode (warnings only)
    swarm = Swarm(strict=False)
    result = await swarm.run(
        agent=agent,
        messages=messages,
        final_output_type=ExpectedType,  # Will warn on failure
    )
  3. Retry Handling:

    • Some operations automatically retry with backoff
    • RetryError indicates all retries failed
    try:
        result = await swarm.run(agent, messages)
    except RetryError as e:
        print(f"Failed after {e.attempts} attempts")
        print(f"Total duration: {e.total_duration}s")
        print(f"Strategy: {e.backoff_strategy}")

For a complete list of error types and their usage, see the exceptions module.

Type-Safe Context

LiteSwarm provides type-safe context through AgentContext and typed parameters. This allows you to share execution-specific data with tools and instruction builders while maintaining type safety.

Basic Usage

Define parameter types and use them with agents:

from collections.abc import Callable
from dataclasses import dataclass

from liteswarm import Agent, AgentContext, LLM

@dataclass
class WeatherParams:
    fetch_weather: Callable[[str], str]

# Create agent with typed params
agent = Agent(
    id="weather_agent",
    instructions="You are a weather agent.",
    llm=LLM(model="gpt-4o"),
    params_type=WeatherParams,  # Specify params type
)

# Run with params instance
result = await swarm.run(
    agent=agent,
    messages=messages,
    params=WeatherParams(
        fetch_weather=lambda city: '{"temperature": 20, "condition": "sunny"}',
    ),
)

Tools with Context

Stateful tools receive typed AgentContext:

@tool  # Use @tool for context-aware functions
def fetch_weather(context: AgentContext[WeatherParams], city: str) -> str:
    # Access typed params
    return context.params.fetch_weather(city)

# Tools without context use @tool_plain
@tool_plain
def multiply(a: int, b: int) -> int:
    return a * b

Dynamic Instructions

Build instructions using context:

@dataclass
class UserParams:
    name: str

def instructions(context: AgentContext[UserParams]) -> str:
    user_name: str = context.params.name
    return f"Help the user, {user_name}, do whatever they want."

agent = Agent(
    id="agent",
    instructions=instructions,  # Dynamic instructions
    llm=LLM(model="gpt-4o"),
    params_type=UserParams,
)

# Run with params
result = await swarm.run(
    agent=agent,
    messages=messages,
    params=UserParams(name="John"),
)

Tool Return Values

Tools can return any JSON-serializable value or ToolResult:

from typing import TypedDict
from pydantic import BaseModel

class PydanticResult(BaseModel):
    name: str
    age: int

class TypedDictResult(TypedDict):
    name: str
    age: int

@dataclass
class DataclassResult:
    name: str
    age: int

# Tool can return any of these types
@tool_plain
def fetch_user(user_id: str) -> PydanticResult | TypedDictResult | DataclassResult | str | ToolResult:
    # Return any serializable value
    return PydanticResult(name="John", age=27)

Best Practices

  1. Type Safety:

    • Always specify params_type for agents that need context
    • Use dataclasses or Pydantic models for params
    • Let type checking catch errors early
  2. Context Usage:

    • Use @tool for context-aware functions
    • Use @tool_plain for stateless functions
    • Keep params focused and minimal
  3. Dynamic Instructions:

    • Use typed context for dynamic instructions
    • Keep instruction builders pure
    • Handle missing params gracefully
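
The last point can be made concrete with a defensive instruction builder (a sketch, assuming params may be absent at runtime):

def instructions(context: AgentContext[UserParams]) -> str:
    # Fall back to a generic greeting if params were not provided.
    name = context.params.name if context.params else "there"
    return f"Help the user, {name}, do whatever they want."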

For complete examples, see the examples directory.

Tool Use and Agent Switching

LiteSwarm provides a powerful tool system with two types of tools: context-free (@tool_plain) and context-aware (@tool). Tools can perform actions and enable switching between specialized agents.

Basic Tool Use

Tools are Python functions decorated with either @tool_plain or @tool:

from liteswarm import tool_plain, tool, Agent, AgentContext, LLM

# Context-free tool
@tool_plain
def multiply(a: int, b: int) -> int:
    """Simple tool returning a value."""
    return a * b

# Context-aware tool
@tool
def fetch_weather(context: AgentContext[WeatherParams], city: str) -> str:
    """Tool that uses context params."""
    return context.params.fetch_weather(city)

# Add tools to agent
agent = Agent(
    id="assistant",
    instructions="You can fetch weather and perform calculations.",
    llm=LLM(
        model="gpt-4o",
        tools=[fetch_weather, multiply],
        tool_choice="auto",  # Let agent decide when to use tools
    ),
)

Agent Switching

Agents can switch to other agents using ToolResult.switch_to:

@dataclass
class WeatherParams:
    fetch_weather: Callable[[str], str]

# Create specialized weather agent
weather_agent = Agent(
    id="weather_agent",
    instructions="You are a weather agent.",
    llm=LLM(model="gpt-4o"),
    tools=[fetch_weather],
    params_type=WeatherParams,
)

# Create manager agent that can switch
manager_agent = Agent(
    id="manager",
    instructions="You are a manager that can switch to other agents.",
    llm=LLM(model="gpt-4o"),
)

# Define switch tool
@tool(agent=manager_agent)
def switch_to_weather_agent(context: AgentContext) -> ToolResult:
    return ToolResult.switch_to(
        content="Switching to weather agent",
        agent=weather_agent,
        params=WeatherParams(
            fetch_weather=lambda city: '{"temperature": 20, "condition": "sunny"}',
        ),
    )
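
Running the manager then lets the model decide when to hand off; once switch_to_weather_agent fires, the weather agent produces the final response:

result = await swarm.run(
    agent=manager_agent,
    messages=[Message(role="user", content="What's the weather in Paris?")],
)
# If the manager invoked switch_to_weather_agent, the result
# reflects the weather agent's answer.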

Tool Return Values

Tools can return any JSON-serializable value or a ToolResult:

  1. Simple Values:

    @tool_plain
    def add(a: int, b: int) -> int:
        return a + b
  2. Structured Data:

    @tool_plain
    def search(query: str) -> dict[str, Any]:
        return {
            "results": [
                {"title": "API Guide", "content": "..."},
            ]
        }
  3. Agent Switching:

    @tool
    def switch_to_expert(context: AgentContext[ExpertParams]) -> ToolResult:
        return ToolResult.switch_to(
            agent=expert_agent,
            content="Switching to expert",
            params=expert_params,
        )

Building Agent Networks

Since agent switching is one-way, you need to provide tools for routing between agents. Here's a basic example:

from liteswarm import Agent, AgentContext, LLM, tool, ToolResult

# Create specialized agents
ui_agent = Agent(
    id="ui",
    instructions="You are a UI expert.",
    llm=LLM(model="gpt-4o"),
)

backend_agent = Agent(
    id="backend",
    instructions="You are a backend expert.",
    llm=LLM(model="gpt-4o"),
)

# Create & register switch tools

@tool(agent=ui_agent)
def switch_to_backend_expert(context: AgentContext) -> ToolResult:
    """Switch the conversation to the backend expert."""
    return ToolResult.switch_to(
        backend_agent,
        content="Switching to backend expert",
        params=context.params,
    )

@tool(agent=backend_agent)
def switch_to_ui_expert(context: AgentContext) -> ToolResult:
    """Switch the conversation to the UI expert."""
    return ToolResult.switch_to(
        ui_agent,
        content="Switching to UI expert",
        params=context.params,
    )

The @tool(agent=my_agent) decorator above registers a tool with a specific agent one at a time. For larger teams, you can also generate switch tools dynamically:

def create_agent_tools(agents: dict[str, Agent]) -> dict[str, Tool]:
    """Create tool functions to switch between agents."""
    
    def create_switch_function(agent: Agent) -> Tool:
        @tool(
            name=f"switch_to_{agent.id}",
            description=f"Switch the conversation to the {agent.id} agent.",
            params_type=agent.params_type,
        )
        def switch_to_agent(context: AgentContext) -> ToolResult:
            """Switch the conversation to the specified agent."""
            return ToolResult.switch_to(
                agent,
                content=f"Switching to {agent.id}",
                params=context.params,
            )
        return switch_to_agent

    return {f"switch_to_{name}": create_switch_function(agent) for name, agent in agents.items()}

# Create a team of specialized agents
agents = {
    "router": Agent(id="router", instructions="Route requests..."),
    "designer": Agent(id="designer", instructions="Design UI..."),
    "engineer": Agent(id="engineer", instructions="Implement features..."),
    "qa": Agent(id="qa", instructions="Test features..."),
}

# Create routing tools for each agent
tools = create_agent_tools(agents)
for agent in agents.values():
    agent.tools.extend(tools.values())

This approach allows you to:

  • Create a network of specialized agents
  • Automatically generate routing tools
  • Register tools with specific agents using @tool(agent=my_agent)
  • Maintain type safety with params_type
  • Pass context between agents

For a complete example of agent routing in a mobile development team, see mobile_dev_team example.

Best Practices

  1. Tool Design:

    • Use @tool_plain for stateless functions
    • Use @tool when context access needed
    • Return JSON-serializable values
    • Document tool behavior clearly
  2. Agent Switching:

    • Design one-way switching (no automatic returns)
    • Pass necessary params to target agent
    • Provide clear switch messages
    • Consider using a manager agent for orchestration
  3. Error Handling:

    • Tools can raise exceptions
    • Exceptions are handled based on Swarm's strict mode
    • Provide clear error messages for debugging

For complete examples, see the examples directory.

Chat API

LiteSwarm provides a powerful Chat API for building stateful chat applications. While the Core API is stateless, the Chat API adds conversation state management, semantic search, and context optimization capabilities.

Core Components

The Chat API is built around three key protocols:

  1. Chat Protocol: Base interface for stateful conversations
from liteswarm import LLM, Agent, Message, SwarmChat

# Create chat instance
chat = SwarmChat()

# Send message and stream response
async for event in chat.send_message(
    message="Hello!",
    agent=Agent(
        id="assistant",
        instructions="You are a helpful assistant.",
        llm=LLM(model="gpt-4o"),
    ),
):
    if event.type == "agent_response_chunk":
        print(event.chunk.completion.delta.content, end="")

# Access conversation history
messages = await chat.get_messages()

# Search conversation history
relevant = await chat.search_messages(
    query="project requirements",
    max_results=5,
    score_threshold=0.7,
)

# Optimize context
optimized = await chat.optimize_messages(
    WindowStrategy(
        model="gpt-4o",
        window_size=50,
        preserve_recent=25,
    )
)
  2. ChatContext Protocol: Message storage and optimization
from liteswarm.types.chat import WindowStrategy, RAGStrategy

# Window-based optimization
messages = await context.optimize_context(
    WindowStrategy(
        model="gpt-4o",
        window_size=50,
        preserve_recent=25,
    )
)

# RAG-based optimization
messages = await context.optimize_context(
    RAGStrategy(
        model="gpt-4o",
        query="project requirements",
        max_messages=20,
        score_threshold=0.7,
    )
)
  3. MessageVectorIndex Protocol: Semantic search capabilities
# Search with vector index
results = await vector_index.search(
    query="deployment steps",
    max_results=5,
    score_threshold=0.7,
)

Type-Safe Optimization

The Chat API provides several type-safe optimization strategies:

  1. Window Strategy: Keep N most recent messages
WindowStrategy(
    model="gpt-4o",     # Model for token counting
    window_size=50,     # Total messages to keep
    preserve_recent=25, # Recent messages to preserve
)
  2. RAG Strategy: Semantic search with relevance filtering
RAGStrategy(
    model="gpt-4o",        # Model for token counting
    query="requirements",   # Search query
    max_messages=20,       # Max messages to return
    score_threshold=0.7,   # Minimum similarity score
)
  3. Trim Strategy: Token-based trimming with ratio control
TrimStrategy(
    model="gpt-4o",    # Model for token counting
    trim_ratio=0.5,    # How aggressively to trim
)
  4. Summary Strategy: Summarize older messages
SummaryStrategy(
    model="gpt-4o",        # Model for token counting
    preserve_recent=25,    # Recent messages to keep
)
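
Any of these strategies can be passed to optimize_messages as shown earlier. For example (assuming TrimStrategy and SummaryStrategy are importable from liteswarm.types.chat like the other strategies):

from liteswarm.types.chat import SummaryStrategy, TrimStrategy

# Token-based trimming before hitting a context limit.
trimmed = await chat.optimize_messages(
    TrimStrategy(model="gpt-4o", trim_ratio=0.5)
)

# Summarize older turns while keeping the latest 25 intact.
summarized = await chat.optimize_messages(
    SummaryStrategy(model="gpt-4o", preserve_recent=25)
)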

Custom Implementations

Each protocol can be customized for specific needs:

from liteswarm.chat import ChatContext, SwarmChat
from liteswarm.types.chat import ChatMessage, OptimizationStrategy

class DatabaseContext(ChatContext):
    """Custom context using database storage."""
    # self.db is your database client (initialization not shown).

    async def add_message(self, message: ChatMessage) -> None:
        await self.db.insert(message)

    async def get_messages(self) -> list[ChatMessage]:
        return await self.db.fetch_all()

    async def optimize_context(
        self,
        strategy: OptimizationStrategy,
    ) -> list[ChatMessage]:
        messages = await self.get_messages()
        return await self._apply_strategy(messages, strategy)

# Use custom context
chat = SwarmChat(context=DatabaseContext())

Advanced Applications

The Chat API can be used to build sophisticated chat applications. Here are two complete examples:

  1. Chat Server with FastAPI:

    from fastapi import FastAPI, WebSocket
    from liteswarm import Agent, LLM, SwarmChat
    
    app = FastAPI()
    sessions: dict[str, SwarmChat] = {}
    
    @app.websocket("/chat/{session_id}")
    async def chat_endpoint(websocket: WebSocket, session_id: str):
        await websocket.accept()
        
        # Get or create chat session
        chat = sessions.get(session_id)
        if not chat:
            chat = SwarmChat()
            sessions[session_id] = chat
        
        # Handle messages
        async for message in websocket.iter_json():
            async for event in chat.send_message(
                message=message["content"],
                agent=create_agent(message.get("agent_type", "default")),
            ):
                if event.type == "agent_response_chunk":
                    await websocket.send_json({
                        "type": "chunk",
                        "content": event.chunk.completion.delta.content,
                    })
  2. Python Client Library:

    import json
    from collections.abc import AsyncIterator
    from dataclasses import dataclass

    import websockets

    @dataclass
    class ChatClient:
        """Client for chat server API."""
        url: str
        session_id: str

        async def send_message(
            self,
            content: str,
            agent_type: str = "default",
        ) -> AsyncIterator[str]:
            """Send message and stream response."""
            async with websockets.connect(
                f"{self.url}/chat/{self.session_id}"
            ) as ws:
                # websockets connections exchange raw strings, so
                # encode and decode the JSON payloads explicitly.
                await ws.send(json.dumps({
                    "content": content,
                    "agent_type": agent_type,
                }))

                async for raw in ws:
                    message = json.loads(raw)
                    if message["type"] == "chunk":
                        yield message["content"]
    
    # Use client
    client = ChatClient(url="ws://localhost:8000", session_id="user123")
    async for chunk in client.send_message("Hello!"):
        print(chunk, end="")

These examples demonstrate:

  • Session management with FastAPI
  • WebSocket streaming
  • Custom agent selection
  • Client library design
  • Error handling
  • Type safety

For complete implementations, see the examples directory.

Best Practices

  1. Message Management:

    • Use ChatMessage for type-safe message handling
    • Preserve message order and relationships
    • Handle tool calls and responses properly
    • Keep track of message metadata
  2. Context Optimization:

    • Choose strategies based on use case:
      • Window: For simple chronological context
      • RAG: For knowledge-intensive tasks
      • Trim: For token limit management
      • Summary: For long conversations
    • Consider token limits and model capabilities
    • Preserve important context and relationships
  3. Search Integration:

    • Update vector index when messages change
    • Use appropriate similarity thresholds
    • Consider batching for efficiency
    • Handle embedding errors gracefully

For a complete example of building a chat application with the Chat API, see chat_intro example.

More Examples

For more examples and detailed documentation, see the examples directory.

Building with LiteSwarm

LiteSwarm provides a powerful foundation for building AI applications through two core components:

  1. Core API (Swarm):

    • Type-safe execution with streaming
    • Structured outputs with validation
    • Tool system with context support
    • Agent switching and error handling
  2. Chat API:

    • Stateful conversations
    • Semantic search with vector indexing
    • Type-safe context optimization
    • WebSocket streaming support

These components can be combined to build sophisticated applications:

  1. Stateful Chat Applications:

    # Chat with streaming and context management
    chat = SwarmChat()
    async for event in chat.send_message("Hello!", agent=agent):
        if event.type == "agent_response_chunk":
            print(event.chunk.completion.delta.content)
  2. Client-Server Applications:

    # FastAPI server with session management
    @app.websocket("/chat/{session_id}")
    async def chat_endpoint(websocket: WebSocket, session_id: str):
        chat = get_or_create_session(session_id)
        async for event in chat.send_message(message, agent):
            await stream_response(websocket, event)
  3. Multi-Agent Systems:

    # Create specialized agents with switching
    @tool(agent=router_agent)
    def switch_to_expert(context: AgentContext) -> ToolResult:
        return ToolResult.switch_to(
            expert_agent,
            content="Switching to expert",
            params=context.params,
        )

For practical examples and implementation patterns, explore the examples directory.

Coming Soon:

  • Agent Chains: Sequential execution of specialized agents
  • Agent Graphs: Complex agent workflows with branching and loops

For more information and complete implementations, see the examples directory.