Latest commit: 090041c · Oct 30, 2024

LLM-Controlled Computer

A Next.js application that lets a large language model control a computer, either directly on the local system or inside a virtual (Docker) environment.

Next.js TypeScript Langchain Docker shadcn/ui Tailwind_CSS Electron Built with Cursor

Screenshot

🚧 Work in Progress

This project is under active development. Some features may be incomplete or subject to change.

The overall goal is to create a Node.js tool that lets a user control their computer with any large language model. Anthropic's Computer Use Demo is the main source of inspiration for this project.

Roadmap (✅ done, 🔳 in progress, ⬜ not started):

  • ✅ Docker container management
  • ✅ VNC integration
  • ✅ Chat interface
  • 🔳 (Generic) LLM integration
    • ✅ Base architecture
    • ✅ Model selection
    • ✅ Model tracking
    • ✅ Message history
    • ✅ Local model support
    • ✅ Model download tracking
    • 🔳 Context management
    • 🔳 Function calling
    • ⬜ Streaming support
  • ⬜ Computer use tooling
    • ⬜ File management
    • ⬜ Screenshot analysis
    • ⬜ Mouse and keyboard control
    • ⬜ Bash command execution
  • 🔳 Launch options
    • ⬜ CLI
    • ✅ Web server
    • ⬜ Electron app
  • 🔳 Computer Use modes
    • ✅ Virtual (Docker)
    • ⬜ Local (direct control)
  • ⬜ Conversation history
  • ⬜ Multi Agent support
  • ⬜ Memory management

Please check back later for updates or feel free to contribute!

Features

Core Capabilities

  • Screenshot analysis
  • Mouse and keyboard control
  • Bash command execution
  • File management
  • Chat interface for LLM interaction
  • VNC-based graphical interactions

Operation Modes

  • Local Mode: Direct system control
  • Docker Mode: Virtual machine control via Docker containers
  • Multiple Launch Options:
    • Web browser (Next.js server)
    • Desktop application (Electron)
    • CLI for specific LLM tasks

Docker Integration

  • Real-time container management
  • Build progress streaming
  • Container lifecycle control (start, stop, delete)
  • Status monitoring and detailed logging
  • NoVNC integration for web-based access
  • Automated environment setup
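
The build progress streaming above depends on how Docker reports build output: the Docker Engine API emits newline-delimited JSON objects, with log text in a `stream` field and failures in an `error` field. The following is a minimal sketch of turning such output into log lines for the UI. The names `BuildMessage`, `BuildProgress`, and `parseBuildOutput` are hypothetical and not taken from this repository, and the sketch assumes each line holds one complete JSON object.

```typescript
// Shape of one decoded message from Docker's build output stream.
// Only the fields used here are listed; real messages can carry more.
interface BuildMessage {
  stream?: string; // regular build log text
  error?: string; // set when the build fails
}

interface BuildProgress {
  lines: string[]; // non-empty log lines, in order
  error: string | null; // first error reported, if any
}

// Parse newline-delimited JSON build output into log lines.
// Assumption: each line of `raw` is one complete JSON object.
function parseBuildOutput(raw: string): BuildProgress {
  const progress: BuildProgress = { lines: [], error: null };
  for (const line of raw.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;

    let parsed: unknown;
    try {
      parsed = JSON.parse(trimmed);
    } catch {
      continue; // skip partial or malformed lines instead of failing
    }
    if (typeof parsed !== "object" || parsed === null) continue;

    const message = parsed as BuildMessage;
    if (typeof message.error === "string" && progress.error === null) {
      progress.error = message.error;
    }
    if (typeof message.stream === "string") {
      const text = message.stream.trim();
      if (text !== "") progress.lines.push(text);
    }
  }
  return progress;
}
```

Skipping malformed lines rather than throwing keeps a log view usable when a chunk boundary splits a JSON object; a production version would likely buffer the incomplete tail and retry it with the next chunk.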

User Interface

  • Responsive split-view layout
  • Settings sidebar
  • Real-time Docker status indicators
  • Expandable log entries
  • Copy-to-clipboard functionality
  • Auto-scrolling chat interface

Tech Stack

  • Frontend: Next.js with TypeScript
  • UI Components: Radix UI, Tailwind CSS
  • Container Management: Dockerode
  • Remote Access: VNC, SSH2
  • LLM Integration: Langchain.js
  • Desktop Packaging: Electron
  • Terminal: node-pty, xterm.js

Prerequisites

  • Node.js (LTS version)
  • Docker
  • Python 3.11.6 (for certain features)
  • Ollama (for local models) - See Ollama Setup section

Installation

1. Clone the repository

git clone [repository-url]
cd llm-controlled-computer

2. Install dependencies

npm install

3. Set up environment variables

cp .env.example .env

Edit .env with your configuration.

Development

Start the development server:

npm run dev

Building

For production build:

npm run build

For Electron desktop app:

npm run build:electron

Docker Usage

The application includes a custom Docker environment with:

  • Ubuntu 22.04 base
  • Python environment with pyenv
  • Desktop environment with VNC access
  • Firefox ESR with pre-configured extensions
  • Various utility applications
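
Since the container exposes its desktop over VNC and the app uses noVNC for web-based access, the browser ultimately needs a URL for the noVNC client page. Below is a small sketch of building one. It assumes the container serves noVNC's standard `vnc.html` page; the function name `buildNoVncUrl`, the `NoVncOptions` type, and port 6080 in the usage note are illustrative assumptions, not values confirmed by this repository. `autoconnect` and `resize` are standard noVNC query parameters.

```typescript
// Options for the noVNC client page (hypothetical type, for illustration).
interface NoVncOptions {
  host: string; // host where the container's noVNC port is published
  port: number; // published noVNC port (assumption: chosen by the caller)
  autoconnect?: boolean; // connect as soon as the page loads
  resize?: "off" | "scale" | "remote"; // how the remote desktop fits the viewport
}

// Build the URL of noVNC's vnc.html page with the chosen query parameters.
function buildNoVncUrl(options: NoVncOptions): string {
  const autoconnect = options.autoconnect ?? true;
  const resize = options.resize ?? "scale";
  const query = [
    `autoconnect=${autoconnect ? "true" : "false"}`,
    `resize=${encodeURIComponent(resize)}`,
  ].join("&");
  return `http://${options.host}:${options.port}/vnc.html?${query}`;
}
```

For example, `buildNoVncUrl({ host: "localhost", port: 6080 })` yields a URL suitable for an iframe in the split-view layout, assuming the container publishes noVNC on port 6080.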

Ollama Setup

Installation

macOS

# Using Homebrew
brew install ollama

# Start Ollama service
ollama serve

Linux

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Start Ollama service
sudo systemctl start ollama

Windows

  1. Install WSL2 if not already installed:
wsl --install
  2. Install Ollama in WSL2:
curl -fsSL https://ollama.com/install.sh | sh
  3. Start Ollama service in WSL2:
ollama serve

Configuration

Add the following to your .env file:

# Ollama Configuration
NEXT_PUBLIC_OLLAMA_URL=http://localhost:11434
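
As a rough illustration of how application code might consume this setting, the sketch below resolves the base URL with a fallback to the default shown above and builds endpoint URLs from it. The helper names `resolveOllamaUrl` and `ollamaEndpoint` are hypothetical, and the fallback behavior is an assumption rather than something this repository is known to implement. `/api/tags` is Ollama's endpoint for listing local models.

```typescript
const DEFAULT_OLLAMA_URL = "http://localhost:11434";

// Resolve the Ollama base URL from an env-like object, falling back to the default.
function resolveOllamaUrl(env: Record<string, string | undefined>): string {
  const raw = (env["NEXT_PUBLIC_OLLAMA_URL"] ?? "").trim();
  const candidate = raw === "" ? DEFAULT_OLLAMA_URL : raw;
  // Reject values without an http(s) scheme early, with a readable message.
  if (!/^https?:\/\/\S+$/i.test(candidate)) {
    throw new Error(`NEXT_PUBLIC_OLLAMA_URL must be an http(s) URL, got: ${candidate}`);
  }
  // Drop trailing slashes so endpoint paths can be appended safely.
  return candidate.replace(/\/+$/, "");
}

// Build a full endpoint URL from the base URL and an API path.
function ollamaEndpoint(baseUrl: string, path: string): string {
  return `${baseUrl}/${path.replace(/^\/+/, "")}`;
}
```

In a Next.js app, pass the variable explicitly, as in `resolveOllamaUrl({ NEXT_PUBLIC_OLLAMA_URL: process.env.NEXT_PUBLIC_OLLAMA_URL })`, because `NEXT_PUBLIC_` variables are inlined into browser bundles only where they are referenced by their full name.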

Troubleshooting

  1. Check if Ollama is running:
curl http://localhost:11434/api/version
  2. If not running, start the service:
# macOS/Linux
ollama serve

# Windows (in WSL2)
wsl -d Ubuntu -u root ollama serve
  3. Common issues:
    • Port 11434 is already in use
    • Insufficient disk space
    • GPU drivers not properly installed (for GPU acceleration)
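
The first troubleshooting step can also be done from application code, for example to drive a status indicator. The sketch below is one possible approach, not the project's implementation: it requests Ollama's `/api/version` endpoint and treats any network failure as "not running". The names `FetchLike` and `isOllamaReachable` are hypothetical, and the fetch function is passed in so the helper can be exercised without a live server.

```typescript
// Minimal shape of the fetch function this helper needs.
type FetchLike = (url: string) => Promise<{ ok: boolean }>;

// Resolves to true when the Ollama server answers with a 2xx status.
async function isOllamaReachable(baseUrl: string, fetchImpl: FetchLike): Promise<boolean> {
  try {
    const response = await fetchImpl(`${baseUrl.replace(/\/+$/, "")}/api/version`);
    return response.ok;
  } catch {
    return false; // connection refused, DNS failure, and similar network errors
  }
}
```

On Node.js 18 or later, or in the browser, the global `fetch` can be passed directly: `await isOllamaReachable("http://localhost:11434", fetch)`. A `false` result does not say which of the common issues above is the cause; it only confirms that the server did not answer.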

Contributing

  1. Ensure you follow the project's coding standards:

    • Use TypeScript with strict typing
    • Follow clean code principles
    • Write comprehensive tests
    • Add proper documentation
  2. Submit pull requests with:

    • Clear description of changes
    • Test coverage
    • Documentation updates

License

ISC