ArtAgents: Agent-Based Creative Toolkit

ArtAgents is a prototype framework designed for artists, designers, and creators to experiment with LLM-based prompt engineering and creative content generation. It leverages Ollama for local model serving, allowing users to interact with various text and multimodal models through specialized AI 'agents' and structured, configurable workflows ("Teams").

Overview

Select predefined agents, load custom agents, or utilize multi-agent "Teams" to generate detailed prompts, descriptions, image captions, or other text outputs. Provide text instructions and optionally images as input. Fine-tune generation using Ollama API parameters, prompt style limiters, and agent presets. Experiment systematically using the Sweep feature and manage image captions directly within the application.

Recent Updates

(v0.9.5-alpha, October 2025)

This project has undergone a significant technical upgrade and has been enhanced with new creative capabilities.

Major Technical Upgrade: Gradio 5.x+

The application has been migrated from the legacy Gradio 3.x framework to Gradio 5.x+, which required refactoring the user interface and the event handling system.

Enhanced Security: The app now operates under Gradio's secure file-access model.
Improved Performance & Stability: The new version provides a more robust and performant foundation for future development.

New Feature: Creative Synthesis Strategies & Teams

To increase the experimental and artistic value of generated outputs, we have implemented several new creative assembly strategies. These move beyond simple description towards transformative and conceptually driven prompt engineering.

Three new teams have been added to agent_teams.json to leverage these strategies:

Creative - Metaphorical Vision
- Strategy: metaphorical_synthesis
- How it works: This team gathers rich, multi-sensory input (mood, color, texture) and reinterprets it through a dynamically chosen creative metaphor. It's excellent for generating abstract and evocative results that break creative blocks.
Creative - Hybrid Concept Factory
- Strategy: conceptual_blend
- How it works: This team is built to create productive conflict by forcing the final agent to blend three distinct concepts: a concrete object, a broad world/style, and an abstract theme. This structure is a recipe for generating genuinely unique and unexpected ideas.
Creative - Themed Content Writer
- Strategy: stylistic_mashup
- How it works: This strategy separates what is being described from how it is described. The team builds a complete, detailed picture of a scene's content, and the final synthesis step reframes it by rewriting the entire prompt in a dynamically chosen literary or textual style, creating a powerful juxtaposition between substance and style.
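
As a rough illustration of the conceptual_blend idea described above, the sketch below assembles the instruction handed to the final agent from three earlier step outputs. The function name `build_blend_prompt`, its parameters, and the prompt wording are hypothetical and are not taken from agent_manager.py; the actual strategy may be implemented quite differently.

```python
def build_blend_prompt(obj: str, world: str, theme: str) -> str:
    """Compose a final-agent instruction that forces three concepts together.

    Hypothetical helper for illustration only.
    """
    parts = {"Concrete object": obj, "World/style": world, "Abstract theme": theme}
    # Reject empty inputs so the final agent never blends fewer than three concepts.
    missing = [label for label, text in parts.items() if not text.strip()]
    if missing:
        raise ValueError(f"Missing concept(s): {', '.join(missing)}")
    lines = [f"- {label}: {text.strip()}" for label, text in parts.items()]
    return (
        "Blend the following three concepts into one coherent image prompt. "
        "Do not drop any of them.\n" + "\n".join(lines)
    )


if __name__ == "__main__":
    print(build_blend_prompt("a brass diving helmet", "Art Nouveau greenhouse", "forgetting"))
```

Keeping the three inputs labeled in the final instruction is one simple way to make sure the last agent sees all of them as separate constraints rather than as one merged description.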

Key Features

Core Functionality:

Ollama Integration: Connects to a running Ollama instance to use locally served LLMs (text and multimodal), with an Ollama check at startup.
Agent System: Define and use specialized agents (Designer, Photographer, Styler, etc.) with unique instructions and optional API overrides (agent_roles.json, custom_agent_roles.json).
Agent Team / Workflow Execution: Define (agent_teams.json) and run multi-step agent sequences ("Teams"). Supports sequential execution with context passing and multiple result assembly strategies (concatenate, refine_last, summarize_all, structured_concatenate, plus the experimental metaphorical_synthesis, conceptual_blend, and stylistic_mashup).
Team Editor: Create, edit, save, and delete Agent Teams via a dedicated UI tab.
Chat Interface: Main tab for direct interaction with selected agents or teams, including session history and response refinement.
Multimodal Input: Supports single image upload or processing images within a specified folder for chat or captioning context.
Image Captioning: Dedicated tab to load images from a folder, view/edit associated .txt caption files, save changes, and generate captions using selected agents/teams and vision models.
Experiment Sweeps: Systematically run base prompts across multiple selected Agent Teams and Worker Models. Saves detailed JSON protocol files for each run and separate .txt files containing the raw generated prompts per model.
Configuration Management: External JSON files for easy customization of settings, models, limiters, API profiles, agent roles, and agent teams.
App Settings UI: Dedicated tab to configure Ollama URL, agent loading preferences, default behaviors, UI theme, and detailed Ollama API parameters (with loadable profiles).
Persistent History: Logs all single interactions and detailed workflow steps to core/history.json, viewable and clearable in the "Full History" tab.
Utilities: Copy-to-clipboard for responses, optional prompt artifact cleaning, model release functions, contextual help tooltips, setup scripts.
Modular Codebase: Organized structure (core, agents, ui) for maintainability.
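
The Agent Team / Workflow Execution feature listed above (sequential steps, context passing, result assembly) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in core/agent_manager.py: the step dictionary keys (`role`, `instruction`), the `run_team` function, and the way context is accumulated are all hypothetical. Only the concatenate and structured_concatenate strategies are shown, because the others need an extra model call whose details the README does not describe.

```python
from typing import Callable, Dict, List

# Stand-in for a real model call; assumed signature: (instruction, prompt) -> reply text.
LLMCall = Callable[[str, str], str]


def run_team(steps: List[Dict[str, str]], user_input: str, llm: LLMCall,
             strategy: str = "concatenate") -> str:
    """Run agent steps in order, passing accumulated context to each one."""
    outputs: List[Dict[str, str]] = []
    context = user_input
    for step in steps:
        reply = llm(step["instruction"], context)
        outputs.append({"role": step["role"], "text": reply})
        # Context passing: the next agent sees the request plus all earlier replies.
        context = f"{context}\n\n[{step['role']}]\n{reply}"

    # Assembly: combine the per-step outputs into one final result.
    if strategy == "concatenate":
        return "\n".join(o["text"] for o in outputs)
    if strategy == "structured_concatenate":
        return "\n".join(f"{o['role']}: {o['text']}" for o in outputs)
    raise ValueError(f"Unsupported strategy in this sketch: {strategy}")


if __name__ == "__main__":
    def fake_llm(instruction: str, prompt: str) -> str:
        # Placeholder that avoids any network call.
        return f"{instruction} ({len(prompt)} chars of context)"

    team = [
        {"role": "Designer", "instruction": "Describe the composition"},
        {"role": "Styler", "instruction": "Add a visual style"},
    ]
    print(run_team(team, "a lighthouse at dusk", fake_llm, "structured_concatenate"))
```

Passing the model call in as a plain callable keeps the orchestration testable without a running Ollama instance.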

Project Structure

ArtAgents/
│
├── app.py                  # Main Gradio App: UI Structure, Event Wiring, State Mgmt
├── requirements.txt        # Python Dependencies (Consider migrating to pyproject.toml/Poetry)
├── settings.json           # App Config: Ollama URL, defaults, global API opts, theme
├── models.json             # Ollama models known to the app (name, vision)
├── limiters.json           # Prompt style limiters (name, tokens, format string)
├── ollama_profiles.json    # Presets for Ollama API options
├── agent_teams.json        # Stores PREDEFINED & USER-SAVED Agent Team/Workflow definitions
│
├── agents/                 # --- Agent Logic & Definitions ---
│   ├── __init__.py
│   ├── roles_config.py     # Logic to load/merge roles
│   ├── ollama_agent.py     # Interacts with Ollama API (get_llm_response)
│   ├── agent_roles.json    # Default agent definitions
│   ├── custom_agent_roles.json # User's custom persistent agents
│   └── examples/           # --- Optional: Example Agent Files ---
│       └── *.json
│
├── core/                   # --- Core Logic & Utilities ---
│   ├── __init__.py
│   ├── app_logic.py        # Callback logic functions (router, UI callbacks)
│   ├── refinement_logic.py # Logic for comment/refinement feature
│   ├── agent_manager.py    # Orchestrates Agent Team Workflows
│   ├── captioning_logic.py # Logic for caption editing & generation
│   ├── history_manager.py  # Loads/saves persistent history
│   ├── ollama_checker.py   # Ollama startup check logic
│   ├── ollama_manager.py   # Ollama model release logic
│   ├── sweep_manager.py    # Logic for running experiment sweeps
│   ├── utils.py            # Common utilities (JSON loading, cleaning etc.)
│   ├── help_content.py     # Stores help text for UI
│   └── history.json        # Persistent history data file
│
├── ui/                     # --- UI Tab Definitions (Gradio components) ---
│   ├── __init__.py
│   ├── chat_tab.py
│   ├── captions_tab.py     # UI for caption editing & generation
│   ├── team_editor_tab.py  # UI for editing teams
│   ├── sweep_tab.py        # UI for experiment sweeps
│   ├── history_tab.py
│   ├── info_tab.py         # Consolidated info tab (replaces roles_tab.py)
│   ├── app_settings_tab.py
│   └── common_ui_elements.py
│
├── scripts/                # --- Utility & Setup Scripts ---
│   ├── (Batch files: setup.bat, setupvenv.bat, go.bat, govenv.bat)
│   ├── full_project_creator.py
│   └── (Optional: .sh equivalents)
│
├── docs/                   # --- Detailed Documentation ---
│   ├── index.md            # Overview (Placeholder)
│   ├── user-guide.md       # User manual (Placeholder)
│   ├── architecture.md     # System design (Placeholder)
│   └── api.md              # Core function details (Placeholder, Optional)
│
├── sweep_runs/             # Default Output folder for Sweep Protocols (add to .gitignore)
│
├── tests/                  # --- Automated Tests ---
│   ├── __init__.py
│   ├── test_agent.py       # Example tests (Needs Expansion)
│   └── (Placeholder: other test files)
│
├── .gitignore
└── README.md               # This file
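
The tree lists agents/roles_config.py as the place where default and custom roles are loaded and merged. A minimal sketch of such a merge is shown below, assuming each file holds a JSON object that maps a role name to its definition; that file layout, the function names `load_json` and `merge_roles`, and the "custom overrides default" rule are assumptions for illustration, not confirmed behavior.

```python
import json
from pathlib import Path
from typing import Any, Dict


def load_json(path: Path) -> Dict[str, Any]:
    """Return the JSON object stored at path, or {} if the file is missing or invalid."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
    # Anything other than a JSON object is treated as empty in this sketch.
    return data if isinstance(data, dict) else {}


def merge_roles(default_path: Path, custom_path: Path, use_custom: bool = True) -> Dict[str, Any]:
    """Merge default and custom roles; a custom role replaces a default of the same name."""
    roles = dict(load_json(default_path))
    if use_custom:
        roles.update(load_json(custom_path))
    return roles


if __name__ == "__main__":
    merged = merge_roles(Path("agents/agent_roles.json"), Path("agents/custom_agent_roles.json"))
    print(sorted(merged))
```

Falling back to an empty mapping when a file is absent means a fresh checkout without custom_agent_roles.json would still start with the default roles.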

Installation & Setup

Install Ollama: Download and install from ollama.com. Ensure the ollama command is available in your terminal.
Clone Repository: git clone https://github.com/sandner-art/ArtAgents.git and navigate into the cloned directory (cd ArtAgents).
Setup Python Environment (Recommended):
- Using Venv (Manual): Create and activate a virtual environment (Python 3.10+ is needed for Gradio 5.x).
```
python -m venv venv
# On Windows: .\venv\Scripts\activate
# On Linux/macOS: source venv/bin/activate
```
  Then install requirements:
```
pip install --upgrade pip
pip install -r requirements.txt
```
- (Alternative) Using Scripts: Run .\scripts\setupvenv.bat (Windows) or equivalent .sh script to automate venv creation and pip install.
- (Future) Using Poetry: If Poetry is implemented, replace the Python environment setup step above with poetry install.
Setup Ollama Models: Run .\scripts\setup.bat (Windows) or equivalent .sh script. This checks Ollama connectivity and downloads recommended models listed in models.json. Alternatively, use ollama pull <model_name> manually for desired models.
Configure (Optional): Review and edit JSON files (settings.json, models.json, agent_teams.json, etc.) to customize the application.

Running the Application

Start Ollama Service: Ensure the Ollama service is running (e.g., launch the Ollama Desktop application or run ollama serve in a separate terminal).
Activate Environment: If using a virtual environment, activate it (source venv/bin/activate or .\venv\Scripts\activate).
Run ArtAgents:
- If using venv: python app.py
- Using Scripts: .\scripts\govenv.bat (Windows) or equivalent .sh script.
- (Future) Using Poetry: poetry run python app.py
Access UI: Open the local URL provided in the console (usually http://127.0.0.1:7860) in your web browser.
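
If the app cannot reach Ollama, a quick standalone check can help narrow down the problem before launching app.py. The sketch below queries Ollama's REST endpoint /api/tags, which lists locally available models, using only the standard library. The function names are hypothetical, and the default address http://127.0.0.1:11434 is Ollama's usual default; use whatever Ollama URL is configured in settings.json instead if it differs. This is not the logic in core/ollama_checker.py.

```python
import json
import urllib.error
import urllib.request
from typing import List


def parse_model_names(payload: dict) -> List[str]:
    """Extract model names from an /api/tags response body."""
    return [m["name"] for m in payload.get("models", []) if "name" in m]


def list_ollama_models(base_url: str = "http://127.0.0.1:11434", timeout: float = 3.0) -> List[str]:
    """Return locally available model names, or raise ConnectionError if Ollama is unreachable."""
    url = base_url.rstrip("/") + "/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError) as exc:
        raise ConnectionError(f"Ollama not reachable at {base_url}: {exc}") from exc
    return parse_model_names(payload)


if __name__ == "__main__":
    try:
        print("Available models:", list_ollama_models())
    except ConnectionError as err:
        print(err)
```

An empty list means Ollama is running but no models have been pulled yet, in which case ollama pull <model_name> is the next step.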

Documentation

For more detailed information, please refer to the documents in the /docs directory:

/docs/user-guide.md
/docs/architecture.md

Development Status & Plan

Phase 0: Stabilization & Core Refinement (Complete)

Agent Captioning functionality stabilized.
Agent Team Editor implemented and stabilized.
Core assembly strategies (concatenate, refine_last, summarize_all, structured_concatenate) tested.
Sweep output format implemented (per-model .txt prompt files + JSON protocols).
Optional prompt artifact cleaner added.
Copy-to-clipboard button added.
Consolidated "Info" tab implemented.
Error handling reviewed and improved.
Gradio 5.x Upgrade: Upgrade from Gradio 3.x evaluated and executed.

Phase 1: Foundational Expansion & Modernization (Current Focus)

Implement Select Novel Synthesis Strategies: Add 2-3 creative strategies (e.g., Metaphorical Synthesis, Conceptual Blending) to agent_manager.py and Team Editor UI.
NLP Library Integration (nlpaug): Integrate for noise/synonym capabilities within strategies or as agent steps.
Unit Testing Expansion: Write comprehensive pytest tests for core logic and new features.

Future / Planned Enhancements (Phase 2+):

Advanced Agent Teams (Hierarchical agents, conditional logic, feedback loops).
Advanced Experimentation (Parameter sweeping via Hydra, potentially MLFlow integration).
Direct Image Generation API Integration (e.g., ComfyUI, A1111).
Workflow Visualization.
Enhanced UI/UX (Improved Team Editor, potential Gradio custom components).
Explainability / XAI Features.
More Novel Synthesis Strategies & NLP features.
Hydra Integration: Migrate .json configurations to Hydra (.yaml) for improved experiment management.

Contributing

Contributions are welcome! Please refer to CONTRIBUTING.md for guidelines on reporting issues, suggesting features, or submitting pull requests.

License

sandner.art | AI/ML Articles
