[copilot-cli-research] Copilot CLI Deep Research - Feb 2026 #14162
Replies: 4 comments
🔮 The ancient spirits stir in the halls of gh-aw; the smoke test agent has walked these paths and left a whisper in the ether.
💥 WHOOSH! 💫 The Smoke Test Agent just blazed through here! 🦸♂️ BAM! All systems are GO for Claude Engine validation! ⚡ This transmission brought to you by your friendly neighborhood Smoke Test Agent 🤖✨ Run ID: §21757032307
💥 WHOOSH! 💥 The Smoke Test Agent swooped through here like lightning! ⚡ 🎯 Mission Status: ALL SYSTEMS GO! ✅ KAPOW! Another successful patrol! 🦸♂️
This discussion was automatically closed because it expired on 2026-02-13T15:34:34.986Z.
🔍 Copilot CLI Deep Research Report
Analysis Date: 2026-02-06
Repository: github/gh-aw
Scope: 145 total workflows, 71 using Copilot engine (48.97%)
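The adoption percentages quoted in this report (for example 71 of 145 workflows) can be reproduced with a few lines of Python. This is only an illustrative sketch; the helper name `adoption_rate` is hypothetical and not part of gh-aw.

```python
def adoption_rate(using: int, total: int) -> float:
    """Return the share of workflows using a feature, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * using / total, 2)


# Counts taken from the report: 71 Copilot workflows out of 145 total.
copilot_share = adoption_rate(71, 145)
print(f"Copilot engine adoption: {copilot_share}%")
```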
📊 Executive Summary
Research Topic: Copilot CLI Optimization Opportunities
Key Findings:
--share flag is automatically used by all workflows (via compiler), providing conversation tracking
Primary Recommendation: Focus on custom agent files for specialized workflows, expand model selection strategies, and explore SRT sandbox for enhanced security.
The repository demonstrates mature Copilot CLI adoption with 71 workflows (nearly half of all workflows). The compiler automatically enables conversation tracking via the --share flag and disables built-in MCPs via --disable-builtin-mcps. Most workflows leverage core tools effectively (GitHub, bash, edit), but there's significant untapped potential in advanced features like SRT sandboxing, custom agent specialization, and granular model selection.
Critical Findings
🟢 Strengths - What's Working Well
High Tool Adoption:
Compiler Automation:
--share flag: Automatically added to 100% of workflows for conversation tracking
--add-dir flag: Automatically configured for /tmp/gh-aw/ and workspace access
--disable-builtin-mcps: Consistently applied to all workflows
Security Posture:
--allow-tool flags prevent over-permissioning
🟡 Moderate Priority Opportunities
Model Selection:
GH_AW_MODEL_AGENT_COPILOT)
Custom Agent Files:
technical-doc-writer and ci-cleaner
.github/agents/ but remain underutilized
GitHub Toolsets:
toolsets: [default] without exploring specialized toolsets
repos, issues, pull_requests, actions, projects) could improve performance
🔴 High Priority Gaps
SRT Sandbox Not Used:
Safe-Inputs Feature Unused:
Limited Engine Customization:
engine.env for custom environment variables
engine.command to override the Copilot binary path
engine.version to pin specific Copilot CLI versions
1️⃣ Current State Analysis
Copilot CLI Capabilities Inventory
Version Information: Default version 0.0.374 (from constants.DefaultCopilotVersion)
Available CLI Flags:
--add-dir (path): Grant access to specific directories
--agent (identifier): Use custom agent file from .github/agents/
--allow-tool (tool): Grant permission to specific tools (e.g., shell(git), github(get_file))
--allow-all-tools: Grant permission to all tools (wildcard)
--allow-all-paths: Allow file system writes to any path (required for edit tool)
--disable-builtin-mcps: Disable built-in MCP servers
--log-level (level): Set logging verbosity
--log-dir (path): Specify log output directory
--model (name): Override default model
--prompt (text): Provide instruction prompt
--share (path): Generate markdown conversation file
Extended Engine Configuration:
Sandbox Options:
MCP Server Integration:
Tool Permission Patterns:
Usage Statistics
Workflow Distribution:
Tool Usage in Copilot Workflows:
Configuration Patterns:
id: copilot)
Most Common Timeout Values:
Safe-Outputs Usage:
2️⃣ Feature Usage Matrix
Overall Feature Utilization: Moderate (54%)
Strengths: Core flags (share, add-dir) and GitHub MCP integration
Weaknesses: Advanced engine customization, SRT sandbox, safe-inputs, specialized toolsets
3️⃣ Missed Opportunities
View High Priority Opportunities
🔴 High Priority
Opportunity 1: SRT Sandbox for Security-Sensitive Workflows
What: Sandbox Runtime (SRT) provides process-level isolation using bubblewrap, going beyond AWF's network filtering
Why It Matters: Security workflows analyzing untrusted code (malicious code scanning, third-party dependency analysis) would benefit from process isolation
Where:
daily-malicious-code-scan.md
daily-secrets-analysis.md
security-compliance.md
How to Implement:
Example:
Expected Benefits:
Opportunity 2: Custom Agent Files for Specialized Workflows
What: Create custom agent files in .github/agents/ for common workflow patterns (research, code review, data analysis)
Why It Matters: Custom agents provide specialized context and instructions, improving task quality and consistency
Where:
research.md, portfolio-analyst.md)
grumpy-reviewer.md, pr-nitpick-reviewer.md)
metrics-collector.md, python-data-charts.md)
Current State: Only 2 custom agents used (technical-doc-writer, ci-cleaner) despite 8 agent files existing
How to Implement:
Example Workflows to Update:
research.md → engine.agent: research-analyst
portfolio-analyst.md → engine.agent: financial-analyst
grumpy-reviewer.md → engine.agent: code-reviewer
metrics-collector.md → engine.agent: data-analyst
Expected Benefits:
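Before wiring the workflow-to-agent mapping proposed in Opportunity 2 into workflows, it may help to check which of the proposed agent files already exist. The sketch below assumes the `.github/agents/<name>.agent.md` naming used in the Action Items of this report; the helper names `agent_file` and `missing_agent_files` are hypothetical and not part of gh-aw.

```python
from pathlib import Path

# Workflow -> proposed agent identifier, as listed in the report.
PROPOSED_AGENTS = {
    "research.md": "research-analyst",
    "portfolio-analyst.md": "financial-analyst",
    "grumpy-reviewer.md": "code-reviewer",
    "metrics-collector.md": "data-analyst",
}


def agent_file(agent_id: str, repo_root: str = ".") -> Path:
    """Path where an agent file is assumed to live (naming is an assumption)."""
    return Path(repo_root) / ".github" / "agents" / f"{agent_id}.agent.md"


def missing_agent_files(mapping: dict, repo_root: str = ".") -> list:
    """Return the sorted agent identifiers that have no file on disk yet."""
    return sorted(
        {agent for agent in mapping.values()
         if not agent_file(agent, repo_root).is_file()}
    )


if __name__ == "__main__":
    for agent in missing_agent_files(PROPOSED_AGENTS):
        print(f"missing: {agent_file(agent)}")
```

Run from the repository root, this prints one line per agent file that still needs to be created.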
Opportunity 3: Dynamic Model Selection Strategy
What: Implement model selection strategy based on workflow complexity and cost optimization
Why It Matters: Different workflows have different needs - simple tasks can use cost-effective models, complex tasks need premium models
Current State:
How to Implement:
Option 1: Explicit model in workflow
Option 2: Environment variable (dynamic)
Recommended Model Strategy:
gpt-5.1-codex-mini (cost-effective)
gpt-5 (default, balanced)
claude-sonnet-4 (premium quality)
Example Workflows by Complexity:
Expected Benefits:
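The model strategy in Opportunity 3 can be sketched as a small resolver. The model names and the GH_AW_MODEL_AGENT_COPILOT variable come from the report; the tier labels (`simple`, `standard`, `complex`) and the precedence order (explicit model, then environment variable, then tier default) are assumptions made for illustration, not gh-aw's documented behavior.

```python
import os

# Tier names are hypothetical labels; the model names come from the report.
MODEL_BY_TIER = {
    "simple": "gpt-5.1-codex-mini",  # cost-effective
    "standard": "gpt-5",             # default, balanced
    "complex": "claude-sonnet-4",    # premium quality
}


def resolve_model(tier, explicit=None, env=None):
    """Pick a model: explicit setting, then env override, then tier default.

    The precedence order is an assumption for this sketch.
    """
    env = os.environ if env is None else env
    if explicit:
        return explicit
    override = env.get("GH_AW_MODEL_AGENT_COPILOT")
    if override:
        return override
    # Unknown tiers fall back to the balanced default.
    return MODEL_BY_TIER.get(tier, MODEL_BY_TIER["standard"])
```

Passing `env` explicitly keeps the function easy to test without touching the real environment.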
Opportunity 4: Safe-Inputs for Interactive Workflows
What: Enable safe-inputs feature for workflows that need structured user input
Why It Matters: Some workflows could benefit from requesting additional information mid-execution
Current State: 0 workflows use safe-inputs despite feature availability
Where: Workflows that might need user decisions or additional context:
How to Implement:
Example:
Expected Benefits:
View Medium Priority Opportunities
🟡 Medium Priority
Opportunity 5: GitHub Toolsets Specialization
What: Use specialized GitHub toolsets instead of [default] to improve performance and reduce over-permissioning
Why It Matters: Specific toolsets grant only necessary GitHub API permissions, improving security and potentially reducing API overhead
Current State: Most workflows use toolsets: [default] without exploring specialized options
Available Toolsets:
default: Basic repository operations
repos: Repository management
issues: Issue operations
pull_requests: PR operations
actions: GitHub Actions management
projects: Project board operations
How to Implement:
Example Workflows to Update:
Expected Benefits:
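A first step toward toolset specialization is listing the workflows that still declare only the default toolset. The sketch below matches the literal `toolsets: [default]` line quoted in the report; it does not parse the frontmatter, so the exact nesting of that key is not assumed. The function names are hypothetical.

```python
import re
from pathlib import Path

# Matches a line that declares exactly `toolsets: [default]`.
DEFAULT_ONLY = re.compile(r"^\s*toolsets:\s*\[\s*default\s*\]\s*$", re.MULTILINE)


def uses_default_toolset_only(workflow_text: str) -> bool:
    """True if the workflow text declares only the default toolset."""
    return bool(DEFAULT_ONLY.search(workflow_text))


def find_default_toolset_workflows(workflow_dir: str) -> list:
    """Return sorted file names of workflows still on `toolsets: [default]`."""
    return sorted(
        path.name
        for path in Path(workflow_dir).glob("*.md")
        if uses_default_toolset_only(path.read_text(encoding="utf-8"))
    )


if __name__ == "__main__":
    for name in find_default_toolset_workflows(".github/workflows"):
        print(name)
```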
Opportunity 6: Engine Environment Variables
What: Use engine.env to pass custom environment variables to Copilot CLI
Why It Matters: Enables workflow-specific configuration without modifying global settings
Current State: 0 workflows use engine.env
How to Implement:
Use Cases:
Expected Benefits:
Opportunity 7: Repo-Memory Expansion
What: Expand repo-memory usage to more workflows for persistent state tracking
Current State: 16/71 workflows (23%) use repo-memory
Where: Workflows that could benefit:
How to Implement:
Expected Benefits:
Opportunity 8: Timeout Optimization
What: Review and optimize timeout values based on actual workflow duration
Current State:
How to Analyze:
Expected Benefits:
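One way to turn observed run durations into a timeout value for Opportunity 8 is to take a high percentile and add headroom. The report does not prescribe a method; the nearest-rank 95th percentile, the 1.5x headroom factor, and the 5-minute floor below are all assumptions chosen for illustration.

```python
import math


def recommend_timeout_minutes(durations_min, headroom=1.5, floor=5):
    """Suggest a timeout from observed run durations (in minutes).

    Uses the nearest-rank 95th percentile times a headroom factor,
    never going below `floor`. All three choices are assumptions.
    """
    if not durations_min:
        raise ValueError("need at least one observed duration")
    ordered = sorted(durations_min)
    # Nearest-rank percentile: rank = ceil(0.95 * n), in integer arithmetic.
    rank = (95 * len(ordered) + 99) // 100
    p95 = ordered[rank - 1]
    return max(floor, math.ceil(p95 * headroom))
```

A workflow whose recent runs took 4, 5, 6, 7, and 20 minutes would get a 30-minute suggestion, since the slowest run dominates a sample this small.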
Opportunity 9: Extended Engine Config Documentation
What: Create comprehensive documentation and examples for extended engine configuration
Current State: Limited usage of engine.id, engine.args, engine.version, engine.command
What to Document:
Expected Benefits:
View Low Priority Opportunities
🟢 Low Priority
Opportunity 10: Version Pinning Strategy
What: Consider version pinning for stability-critical workflows
Current State: No workflows explicitly pin Copilot CLI version
How to Implement:
When to Use:
Trade-offs:
Opportunity 11: Custom MCP Servers
What: Explore custom HTTP MCP servers for specialized tools
Current State: Rare usage beyond built-in MCP servers
Use Cases:
How to Implement:
Opportunity 12: Playwright Expansion
What: Expand Playwright usage for browser automation tasks
Current State: Only 3 workflows use Playwright
Where:
Expected Benefits:
Opportunity 13: Network Configuration Optimization
What: Review and optimize network allowlists
Current State: 63 workflows have network config, often with broad permissions
How to Optimize:
defaults for common ecosystem packages
Opportunity 14: Agentic-Workflows Tool Expansion
What: Increase usage of agentic-workflows tool for workflow management
Current State: 9/71 workflows (13%) use agentic-workflows tool
Where: Meta-workflows that analyze or manage other workflows
Expected Benefits:
Opportunity 15: Bash Tool Granularity
What: Use specific bash commands instead of wildcard ["*"] or [":*"]
Current State: Most bash tools use wildcards for convenience
Security Benefit: Explicit command lists prevent unexpected shell usage
Example:
Trade-off: Convenience vs. security/auditability
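To illustrate what an explicit command list buys, the sketch below turns a list of bash commands into --allow-tool arguments using the shell(...) form shown in the capabilities inventory. How the gh-aw compiler actually translates bash entries into flags is not shown in this report, so treat the function (a hypothetical helper named `allow_tool_args`) as an assumption-laden illustration.

```python
def allow_tool_args(commands):
    """Build `--allow-tool shell(<cmd>)` arguments from an explicit command list.

    Wildcard entries are rejected so the resulting allowlist stays auditable.
    """
    args = []
    for cmd in sorted(set(commands)):
        if cmd in ("*", ":*"):
            raise ValueError("wildcard bash entries defeat an explicit allowlist")
        args += ["--allow-tool", f"shell({cmd})"]
    return args


print(allow_tool_args(["git", "jq", "git"]))
```

Duplicates are collapsed and the output is sorted, which keeps generated command lines stable across runs and easy to diff.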
4️⃣ Specific Workflow Recommendations
View Workflow-Specific Recommendations
High-Value Workflow Updates
Research Workflows (12 workflows)
Workflows: research.md, portfolio-analyst.md, daily-news.md, etc.
Current State: Standard Copilot config, no custom agent
Recommended Changes:
Expected Benefits: More consistent research quality, better trend tracking
Security Workflows (5 workflows)
Workflows: daily-malicious-code-scan.md, daily-secrets-analysis.md, security-compliance.md
Current State: AWF sandbox, standard config
Recommended Changes:
Expected Benefits: Enhanced security posture, cost savings
Code Review Workflows (5 workflows)
Workflows: grumpy-reviewer.md, pr-nitpick-reviewer.md, code-scanning-fixer.md
Current State: Standard config
Recommended Changes:
Expected Benefits: More consistent review quality, better feedback
Data Analysis Workflows (7 workflows)
Workflows: metrics-collector.md, python-data-charts.md, daily-copilot-token-report.md
Current State: Various configs, some with repo-memory
Recommended Changes:
Expected Benefits: Better trend analysis, persistent metrics
Simple Status/Check Workflows (15 workflows)
Workflows: Status checks, simple audits, quick reports
Current State: Default config, possibly over-resourced
Recommended Changes:
Expected Benefits: 30-50% cost savings, faster execution
5️⃣ Trends & Insights
View Historical Trends
First Comprehensive Analysis
This is the inaugural comprehensive analysis of Copilot CLI usage in this repository. Future research will track trends over time.
Baseline Metrics Established:
Areas to Track:
Future Analysis Topics:
6️⃣ Best Practice Guidelines
Based on this research, here are recommended best practices for Copilot CLI workflows:
1. Model Selection Strategy
gpt-5.1-codex-mini for cost savings
gpt-5 for balanced performance
2. Security Posture
[default] to reduce permission surface
3. Tool Configuration
4. Custom Agent Files
.github/agents/ with descriptive names
engine.agent field
5. Performance Optimization
--share (automatic via compiler)
6. Configuration Management
engine.id) for complex requirements
7. Compiler Automation
--share, --add-dir, --disable-builtin-mcps automatically
--allow-tool flags
.lock.yml files to understand actual execution
7️⃣ Action Items
Immediate Actions (this week)
1. Create Custom Agent Files
.github/agents/research-analyst.agent.md
.github/agents/code-reviewer.agent.md
.github/agents/data-analyst.agent.md
2. Optimize High-Cost Workflows
gpt-5.1-codex-mini
3. Enable SRT for Security Workflows
daily-malicious-code-scan.md to use SRT
Short-term (this month)
4. GitHub Toolsets Optimization
toolsets: [default] usage
5. Expand Repo-Memory Usage
6. Documentation Updates
Long-term (this quarter)
7. Safe-Inputs Pilot
8. Workflow Audit System
9. Advanced Features Exploration
engine.env customization
View Supporting Evidence & Methodology
📚 References
/home/runner/work/gh-aw/gh-aw/docs/src/content/docs/reference/engines.md
/home/runner/work/gh-aw/gh-aw/.github/aw/github-agentic-workflows.md
pkg/workflow/copilot_engine.go
pkg/workflow/copilot_engine_execution.go
pkg/workflow/copilot_engine_tools.go
pkg/workflow/copilot_mcp.go
pkg/workflow/copilot_srt.go
.github/workflows/*.md (145 workflows analyzed)
Research Methodology
Data Collection
.github/workflows/
copilot, claude, codex)
Analysis Techniques
Tools Used
grep/ripgrep: Pattern matching in workflow files
view: Code inspection of engine implementation
bash: Statistical analysis and counting
Limitations
--share, --disable-builtin-mcps)
Validation
📋 Research Data Persistence
This analysis has been saved to repo-memory for future trend tracking:
/tmp/gh-aw/repo-memory/default/copilot-cli-research/latest.json
Future analyses will show:
Generated by Copilot CLI Deep Research Workflow
Run ID: §21755894096
Analysis Date: 2026-02-06