[Workflow Skill Extractor] Analysis Report - 145 Workflows Analyzed #13819

2026-02-05T00:06:38Z

github-actions[bot]
bot Feb 5, 2026

Workflow Skill Extractor Report

🎯 Executive Summary

This analysis examined 145 agentic workflows and 58 existing shared components to identify opportunities for extracting reusable skills and reducing code duplication. The repository demonstrates strong adoption of shared components with 96 workflows (66%) already using imports, and 40+ workflows leveraging the standardized reporting guidelines.

Key Findings:

Existing shared components are highly effective: shared/reporting.md, shared/issues-data-fetch.md, and shared/copilot-session-data-fetch.md are success stories
3 high-impact opportunities identified for new shared components
Estimated total lines saved: 900+ lines across 20+ workflows
Primary gaps: PR data fetch and workflow run data fetch components, plus wider adoption of the existing discussions data fetch component

Key Statistics:

Total workflows analyzed: 145
Existing shared components: 58 (37 core + 21 MCP)
Workflows using imports: 96 (66%)
Actionable recommendations: 3 (2 high priority, 1 medium priority)
Estimated total lines saved: 900+

📊 Analysis Overview

Workflows Analyzed

The analysis covered the full spectrum of agentic workflows including:

Daily Report Workflows (30+):

daily-issues-report.md, daily-cli-performance.md, daily-code-metrics.md
daily-compiler-quality.md, daily-copilot-token-report.md, daily-firewall-report.md
And 25+ more daily reporting workflows

Security & Compliance Workflows (10+):

security-guard.md, security-compliance.md, code-scanning-fixer.md
daily-secrets-analysis.md, daily-malicious-code-scan.md, daily-semgrep-scan.md

Triage & Analysis Workflows (15+):

issue-classifier.md, auto-triage-issues.md, issue-triage-agent.md
pr-triage-agent.md, ci-doctor.md, copilot-pr-merged-report.md

Meta-Orchestrator Workflows (8+):

workflow-health-manager.md, workflow-normalizer.md, mcp-inspector.md

Test & Smoke Workflows (8+):

smoke-copilot.md, smoke-claude.md, smoke-codex.md, smoke-opencode.md

Existing Shared Components

The repository already has 58 shared components organized into categories:

Core Shared Components (37 files in shared/):

Data Fetching: issues-data-fetch.md, copilot-session-data-fetch.md, pr-data-safe-input.md, weekly-issues-data-fetch.md, copilot-pr-data-fetch.md, discussions-data-fetch.md, github-queries-safe-input.md
Visualization: python-dataviz.md, charts-with-trending.md, trends.md, trending-charts-simple.md, session-analysis-charts.md
Reporting: reporting.md (40+ imports), keep-it-short.md, use-emojis.md
Analysis: ci-data-analysis.md, ci-optimization-strategies.md, metrics-patterns.md, session-analysis-strategies.md, token-cost-analysis.md
Tools: gh.md, jqschema.md, go-make.md, genaiscript.md, sq.md
Configuration: app-config.md, changeset-format.md, github-mcp-app.md, safe-output-app.md, docs-server-lifecycle.md
Engines: opencode.md, ollama-threat-scan.md, actions-ai-inference.md
Testing: secret-redaction-test.md
MCP: mcp-debug.md, mcp-pagination.md, ffmpeg.md

MCP Server Components (21 files in shared/mcp/):

arxiv.md, ast-grep.md, azure.md, brave.md, chroma.md, context7.md
datadog.md, deepwiki.md, drain3.md, fabric-rti.md, jupyter.md
markitdown.md, microsoft-docs.md, notion.md, semgrep.md, sentry.md
server-memory.md, skillz.md, slack.md, svelte.md, tavily.md

🔍 Identified Skills

High Priority Skills

1. PR Data Fetch Shared Component

Frequency: Used in 10+ workflows
Size: ~400 lines could be saved
Priority: HIGH
Description: Standardize GitHub pull request data fetching with caching pattern

Currently, workflows like copilot-pr-merged-report.md, pr-triage-agent.md, security-guard.md, and others each implement their own PR data fetching logic. This leads to inconsistent data structures and duplicated API calls.

Workflows Using This Pattern:

copilot-pr-merged-report.md - Fetches merged PRs from last 24 hours
pr-triage-agent.md - Fetches PRs for triage
copilot-pr-nlp-analysis.md - Analyzes PR text content
copilot-pr-prompt-analysis.md - Analyzes PR prompts
security-guard.md - Analyzes PR security posture
copilot-agent-analysis.md - Analyzes Copilot-created PRs
security-compliance.md - Checks PR security compliance
pr-nitpick-reviewer.md - Reviews PRs for nitpicks
draft-pr-cleanup.md - Cleans up stale draft PRs
grumpy-reviewer.md - Reviews PRs with critical eye

Recommendation: Create shared/pr-data-fetch.md following the successful pattern of shared/issues-data-fetch.md and shared/copilot-session-data-fetch.md.

GitHub Issue Created: #aw_pr_data_fetch

2. Workflow Run Data Fetch Shared Component

Frequency: Used in 8+ workflows
Size: ~300 lines could be saved
Priority: HIGH
Description: Standardize GitHub Actions workflow run data fetching with caching

Workflows that monitor, analyze, or report on GitHub Actions workflow runs currently duplicate the logic for querying the Actions API, processing run data, and caching results.

Workflows Using This Pattern:

ci-doctor.md - Investigates failed CI workflow runs
daily-cli-performance.md - Analyzes CLI compilation performance
workflow-health-manager.md - Monitors health of all workflows
daily-firewall-report.md - Reports on firewall configurations
daily-observability-report.md - Generates observability metrics
metrics-collector.md - Collects workflow execution metrics
repo-audit-analyzer.md - Audits repository configurations
dev-hawk.md - Monitors development workflow patterns

Recommendation: Create shared/workflow-runs-data-fetch.md with caching strategy similar to existing data-fetch components.

GitHub Issue Created: #aw_workflow_runs_fetch

3. Discussions Data Fetch Shared Component

Frequency: Used in 5+ workflows
Size: ~200 lines could be saved
Priority: MEDIUM
Description: Standardize GitHub Discussions data fetching by reviewing and promoting the existing component

Note: The file shared/discussions-data-fetch.md already exists! This recommendation is to verify its implementation quality and promote its adoption across workflows that currently implement their own discussions fetching logic.

Workflows That Could Benefit:

discussion-task-miner.md - Mines discussions for tasks
daily-issues-report.md - Includes discussions in reports
weekly-issue-summary.md - Summarizes issues and discussions
daily-fact.md - Posts to discussions
auto-triage-issues.md - Creates discussion reports

Recommendation: Review existing shared/discussions-data-fetch.md and enhance documentation if needed. Promote adoption across workflows.

GitHub Issue Created: #aw_discussions_fetch

Medium Priority Skills

4. Report Formatting Guidelines ✅

Status: ALREADY IMPLEMENTED as shared/reporting.md
Frequency: Imported by 40+ workflows
Impact: SUCCESS STORY

This is an excellent example of successful skill extraction! The shared/reporting.md component provides:

Header level guidelines (use h3, not h1/h2)
Progressive disclosure patterns with <details><summary> tags
Report structure patterns
Design principles (Airbnb-inspired)

No action needed - this demonstrates the value of shared components!

5. Custom Safe-Output Messages

Frequency: 40+ workflows define custom messages
Priority: LOW
Description: Themed messages for workflow personality

Many workflows define creative, themed custom messages:

ci-doctor.md: Medical theme (🩺 🏥 💊)
security-guard.md: Security theme (🛡️ 🔒)
daily-fact.md: Poetic theme (🪶 📜 ✨)
brave.md: Search theme (🦁 🔍)

Recommendation: DO NOT EXTRACT - Message customization is part of workflow personality and should remain workflow-specific. Standardizing messages would reduce the delight and character of individual workflows.

6. Network Configuration Patterns

Frequency: 66+ workflows define network permissions
Priority: LOW
Description: Common network allowlists for GitHub API access

Recommendation: DO NOT EXTRACT - Network configurations represent security boundaries and should be explicitly declared per workflow. Over-sharing could obscure critical security requirements.

Low Priority Skills

View Additional Patterns Considered

GitHub Toolsets Configuration

Pattern: toolsets: [default] appears in 39 workflows
Assessment: Already minimal (2-3 lines). Low value for extraction.

Bash Command Allowlists

Pattern: bash: ["jq *", "gh api *", "date *", "mkdir *", "cp *"]
Assessment: Too workflow-specific. Each workflow has unique bash command needs.

Safe-Output Discussion Patterns

Pattern: create-discussion with close-older-discussions: true
Assessment: Only 4-5 lines and varies significantly (category, title-prefix, expires).

📈 Impact Analysis

By Category

| Category | Skills | Lines Saved | Workflows Affected |
| --- | --- | --- | --- |
| Data Processing | 3 | 900 | 20+ |
| Prompt Skills | 1 (exists) | 0 (already shared) | 40+ |
| Tool Configurations | 0 (not recommended) | 0 | 0 |

By Priority

| Priority | Skills | Lines Saved | Workflows Affected | Status |
| --- | --- | --- | --- | --- |
| High | 2 | 700 | 18+ | Issues created |
| Medium | 1 | 200 | 5+ | Issue created |
| Low | 3 | 0 | N/A | Not recommended |
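The totals in the two tables can be cross-checked against the per-recommendation estimates. The snippet below is only an illustrative tally: the figures are copied from this report, the "10+" style lower bounds are treated as plain integers, and the names `RECOMMENDATIONS` and `totals_by_priority` are hypothetical.

```python
from collections import defaultdict

# Figures copied from the report; lower bounds such as "10+" are recorded as integers.
RECOMMENDATIONS = [
    ("PR data fetch", "High", 400, 10),
    ("Workflow run data fetch", "High", 300, 8),
    ("Discussions data fetch", "Medium", 200, 5),
]

def totals_by_priority(recommendations):
    """Sum estimated lines saved and minimum workflow counts per priority."""
    totals = defaultdict(lambda: {"skills": 0, "lines": 0, "workflows": 0})
    for _name, priority, lines, workflows in recommendations:
        totals[priority]["skills"] += 1
        totals[priority]["lines"] += lines
        totals[priority]["workflows"] += workflows
    return dict(totals)

totals = totals_by_priority(RECOMMENDATIONS)
print(totals["High"])    # {'skills': 2, 'lines': 700, 'workflows': 18}
print(totals["Medium"])  # {'skills': 1, 'lines': 200, 'workflows': 5}
print(sum(t["lines"] for t in totals.values()))  # 900
```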

Success Stories

The analysis identified several existing shared components that are highly successful:

shared/reporting.md - Imported by 40+ workflows, standardizes report formatting
shared/issues-data-fetch.md - Used by 10+ workflows, demonstrates the data-fetch pattern
shared/copilot-session-data-fetch.md - Used by 5+ workflows, extends the data-fetch pattern
shared/python-dataviz.md - Used by 15+ workflows for Python charts
shared/gh.md - Provides authenticated GitHub CLI access via safe-inputs

These components show the value of extraction: reduced duplication, consistent patterns, and easier maintenance.

💡 Detailed Recommendations

Recommendation 1: PR Data Fetch Shared Component

Full Implementation Details

Current State:

Workflows currently fetch PR data with inconsistent approaches:

# Example from copilot-pr-merged-report.md
gh pr list --state merged --search "merged:>=$(date -d '24 hours ago' -Iseconds)"

# Example from pr-triage-agent.md  
# Uses GitHub MCP toolsets: [pull_requests, repos]
# Fetches PRs through MCP tools

Proposed Shared Component:

Create shared/pr-data-fetch.md with:

Cached PR data at /tmp/gh-aw/pr-data/prs.json
Schema generation with jqschema
30-day retention, with a cache key based on the current date
Fetches: number, title, author, createdAt, state, url, body, labels, updatedAt, closedAt, mergedAt, mergedBy, reviews
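A minimal sketch of how such a component might behave, written in Python for illustration rather than as the shared component's actual steps. The cache path and field list come from the proposal above. The function names, the dated file name, the `--limit 1000` value, and the injectable `fetch` callable are assumptions made for this example.

```python
import json
import subprocess
from datetime import date
from pathlib import Path

# Fields named in the proposal, passed to `gh pr list --json`.
PR_FIELDS = [
    "number", "title", "author", "createdAt", "state", "url", "body",
    "labels", "updatedAt", "closedAt", "mergedAt", "mergedBy", "reviews",
]

def build_pr_list_command(limit=1000):
    """Return a gh invocation that lists PRs in every state as JSON."""
    return ["gh", "pr", "list", "--state", "all", "--limit", str(limit),
            "--json", ",".join(PR_FIELDS)]

def fetch_with_gh():
    """Run gh and parse its JSON output (needs an authenticated gh CLI)."""
    result = subprocess.run(build_pr_list_command(), check=True,
                            capture_output=True, text=True)
    return json.loads(result.stdout)

def load_prs(cache_dir, today, fetch=fetch_with_gh):
    """Return PR data, calling `fetch` at most once per calendar day."""
    cache_dir = Path(cache_dir)
    dated = cache_dir / f"prs-{today.isoformat()}.json"  # hypothetical dated copy
    if dated.exists():
        return json.loads(dated.read_text())
    prs = fetch()
    cache_dir.mkdir(parents=True, exist_ok=True)
    dated.write_text(json.dumps(prs))
    (cache_dir / "prs.json").write_text(json.dumps(prs))  # file name from the proposal
    return prs
```

A workflow would call something like `load_prs("/tmp/gh-aw/pr-data", date.today())`. The 30-day retention is not modeled here and would presumably be handled by whatever cache mechanism surrounds the fetch step.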

Migration Path:

Create shared component following issues-data-fetch.md pattern
Test with copilot-pr-merged-report.md (pilot)
Migrate pr-triage-agent.md
Progressively migrate 8+ other workflows
Document jq filtering patterns
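As an illustration of the kind of filter the last step refers to, the sketch below selects PRs merged within the last 24 hours from already-cached data, mirroring what copilot-pr-merged-report.md currently asks the API for. It is written in Python rather than jq, and the helper names are hypothetical. It assumes `mergedAt` is an ISO 8601 UTC timestamp or null.

```python
from datetime import datetime, timedelta, timezone

def parse_github_time(value):
    """Parse an ISO 8601 timestamp such as '2026-02-04T12:00:00Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def merged_since(prs, now, hours=24):
    """Return PRs whose mergedAt falls within the last `hours` hours."""
    cutoff = now - timedelta(hours=hours)
    return [
        pr for pr in prs
        if pr.get("mergedAt") and parse_github_time(pr["mergedAt"]) >= cutoff
    ]

now = datetime(2026, 2, 5, tzinfo=timezone.utc)
sample = [
    {"number": 1, "mergedAt": "2026-02-04T12:00:00Z"},
    {"number": 2, "mergedAt": "2026-02-01T12:00:00Z"},
    {"number": 3, "mergedAt": None},  # still open or closed without merging
]
print([pr["number"] for pr in merged_since(sample, now)])  # prints [1]
```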

Impact:

Lines saved: ~400
API call reduction: Shared caching reduces redundant fetches
Consistency: All workflows use same PR data structure

Recommendation 2: Workflow Run Data Fetch Shared Component

Full Implementation Details

Current State:

Workflows fetch workflow run data with varied approaches:

# Example from ci-doctor.md
# Uses workflow_run trigger context
# Accesses: github.event.workflow_run.*

# Example from workflow-health-manager.md
gh api "repos/${{ github.repository }}/actions/runs" --paginate

# Example from daily-cli-performance.md
# Custom queries for specific workflow metrics

Proposed Shared Component:

Create shared/workflow-runs-data-fetch.md with:

Cached run data at /tmp/gh-aw/workflow-runs-data/runs.json
Schema generation with jqschema
30-day data window
Fetches: id, name, workflow_id, head_branch, status, conclusion, created_at, html_url
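The projection step could look roughly like the following Python sketch. It assumes each page has the shape of the REST "list workflow runs" response, an object with a `workflow_runs` array. How the pages are collected from `gh api --paginate` is outside this sketch, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Fields named in the proposal above.
RUN_FIELDS = ["id", "name", "workflow_id", "head_branch", "status",
              "conclusion", "created_at", "html_url"]

def project_runs(pages, now, window_days=30):
    """Flatten API pages, keeping recent runs trimmed to the proposed fields."""
    cutoff = now - timedelta(days=window_days)
    runs = []
    for page in pages:
        for run in page.get("workflow_runs", []):
            created = datetime.fromisoformat(
                run["created_at"].replace("Z", "+00:00"))
            if created >= cutoff:
                runs.append({field: run.get(field) for field in RUN_FIELDS})
    return runs

now = datetime(2026, 2, 5, tzinfo=timezone.utc)
pages = [
    {"total_count": 2, "workflow_runs": [
        {"id": 1, "name": "CI", "created_at": "2026-02-04T10:00:00Z", "extra": "dropped"},
        {"id": 2, "name": "CI", "created_at": "2025-12-01T10:00:00Z"},
    ]},
]
print([run["id"] for run in project_runs(pages, now)])  # prints [1]
```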

Migration Path:

Create shared component with caching pattern
Test with workflow-health-manager.md (pilot)
Migrate ci-doctor.md (may need adaptation for workflow_run context)
Migrate daily-cli-performance.md
Progressively migrate 5+ other workflows

Impact:

Lines saved: ~300
Consistency: Standardized workflow run data structure
Performance: Shared caching for multiple workflows

Extension Opportunities:

Add job-level data fetching
Pre-fetch logs for failed jobs
Include artifact metadata
Calculate timing and performance stats
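One possible statistic, sketched in Python under the assumption that cached runs carry the `name`, `status`, and `conclusion` fields listed above: a per-workflow failure rate. Timing statistics would need fields outside the proposed list (for example `run_started_at` and `updated_at`), so they are not shown. The function name is hypothetical.

```python
from collections import Counter

def conclusion_stats(runs):
    """Return {workflow name: {"total", "failed", "failure_rate"}} for completed runs."""
    totals, failures = Counter(), Counter()
    for run in runs:
        if run.get("status") != "completed":
            continue  # skip queued or in-progress runs
        totals[run["name"]] += 1
        if run.get("conclusion") == "failure":
            failures[run["name"]] += 1
    return {
        name: {"total": total, "failed": failures[name],
               "failure_rate": failures[name] / total}
        for name, total in totals.items()
    }

runs = [
    {"name": "CI", "status": "completed", "conclusion": "success"},
    {"name": "CI", "status": "completed", "conclusion": "failure"},
    {"name": "CI", "status": "in_progress", "conclusion": None},
]
print(conclusion_stats(runs))
# prints {'CI': {'total': 2, 'failed': 1, 'failure_rate': 0.5}}
```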

Recommendation 3: Discussions Data Fetch Review

Full Implementation Details

Current State:

A file shared/discussions-data-fetch.md already exists but adoption is unclear. Workflows that could benefit:

# Workflows using discussions
toolsets: [default, discussions]  # Appears in 9 workflows

Proposed Action:

Review existing component: Check implementation quality
If well-implemented: Promote adoption, enhance documentation
If needs work: Refactor to match issues-data-fetch.md pattern
Document usage: Add clear examples for filtering discussions

Migration Path:

Confirm the existing component and review its code
Enhance documentation with usage examples
Migrate discussion-task-miner.md (pilot)
Migrate daily-issues-report.md
Progressively migrate 3+ other workflows

Impact:

Lines saved: ~200
Adoption increase: Current adoption unclear, could grow to 5-8 workflows
Maintenance: Centralized discussions fetching logic

Note: This is lower priority because the Discussions API is more complex (GraphQL) and fewer workflows need this data than need PR or workflow run data.
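To give a sense of the GraphQL involved, here is a hedged Python sketch that builds a `gh api graphql` invocation for listing discussions. The selected fields, the page size, and the function name are assumptions for illustration. They are not taken from the existing shared/discussions-data-fetch.md, whose contents this report does not show.

```python
# Illustrative query; the selected fields are an assumption, not the component's schema.
DISCUSSIONS_QUERY = """
query($owner: String!, $name: String!, $first: Int!) {
  repository(owner: $owner, name: $name) {
    discussions(first: $first, orderBy: {field: UPDATED_AT, direction: DESC}) {
      nodes { number title url createdAt updatedAt author { login } category { name } }
    }
  }
}
"""

def build_discussions_command(repository, first=50):
    """Return a gh invocation that runs the query for an 'owner/name' repository."""
    owner, name = repository.split("/", 1)
    return [
        "gh", "api", "graphql",
        "-f", f"query={DISCUSSIONS_QUERY}",
        "-f", f"owner={owner}",
        "-f", f"name={name}",
        "-F", f"first={first}",  # -F lets gh send the value as a number
    ]
```

The returned list can be passed to `subprocess.run`. Pagination with cursors is omitted to keep the sketch short.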

✅ Created Issues

This analysis has created 3 actionable issues for skill extraction:

Issue #aw_pr_data_fetch: Extract PR Data Fetch into shared component
- Priority: HIGH
- Impact: 10+ workflows, ~400 lines saved
Issue #aw_workflow_runs_fetch: Extract Workflow Run Data Fetch into shared component
- Priority: HIGH
- Impact: 8+ workflows, ~300 lines saved
Issue #aw_discussions_fetch: Extract Discussions Data Fetch into shared component (review existing)
- Priority: MEDIUM
- Impact: 5+ workflows, ~200 lines saved

🎯 Next Steps

Review the created issues and prioritize based on team capacity
Implement PR data fetch component first (highest impact, clearest pattern)
Implement workflow run data fetch second (high impact, addresses monitoring needs)
Review existing discussions fetch component (may just need better documentation)
Monitor for new extraction opportunities as more workflows are added
Schedule next extractor run in 1-2 months to identify new patterns

📚 Methodology

This analysis used the following approach:

Scope:

Analyzed 145 workflow files in .github/workflows/
Reviewed 58 existing shared components in shared/ and shared/mcp/
Sampled 20 diverse workflows in detail for pattern identification

Analysis Technique:

Tool Configuration Analysis: Examined frontmatter for tools:, imports:, safe-outputs:, network: patterns
Prompt Pattern Analysis: Identified repeated instruction blocks and guidelines
Data Processing Analysis: Found common bash scripts, jq queries, and caching patterns
Import Frequency Analysis: Counted import usage to identify successful shared components
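The import frequency count can be approximated with a few lines of Python. This sketch assumes each workflow file starts with YAML frontmatter delimited by `---` and lists imports as a block-style `imports:` list. Inline lists such as `imports: [a, b]` are not handled, and the function names are hypothetical.

```python
from collections import Counter

def extract_imports(markdown_text):
    """Return the entries of a top-level `imports:` list in YAML frontmatter."""
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    imports, in_imports = [], False
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # end of frontmatter
        if line.startswith("imports:"):
            in_imports = True
        elif in_imports and stripped.startswith("- "):
            imports.append(stripped[2:].strip())
        elif in_imports and stripped and not line.startswith((" ", "\t")):
            in_imports = False  # the next top-level key ends the list
    return imports

def import_frequency(workflows):
    """Count how many workflows import each shared component."""
    counts = Counter()
    for text in workflows.values():
        counts.update(set(extract_imports(text)))
    return counts

workflows = {
    "a.md": "---\non: push\nimports:\n  - shared/reporting.md\n  - shared/gh.md\ntools:\n  - x\n---\nBody\n",
    "b.md": "---\nimports:\n  - shared/reporting.md\n---\nBody\n",
    "c.md": "No frontmatter here\n",
}
print(import_frequency(workflows).most_common(1))  # prints [('shared/reporting.md', 2)]
```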

Prioritization Criteria:

Frequency: How many workflows use this pattern?
Size: How many lines could be saved?
Maintenance: How often does this pattern change?
Complexity: How difficult would extraction be?
Security: Does sharing affect security boundaries?

Quality Checks:

Cross-referenced existing shared components to avoid recommending what already exists
Verified security implications (network configs, permissions)
Considered workflow-specific customization needs
Identified successful extraction examples (reporting.md, issues-data-fetch.md)

🏆 Success Stories

The analysis identified several highly successful shared components that demonstrate the value of skill extraction:

`shared/reporting.md` - 40+ Imports

Standardizes report formatting with header guidelines, progressive disclosure patterns, and design principles. Reduces 60+ lines per workflow.

`shared/issues-data-fetch.md` - 10+ Imports

Provides cached GitHub issues data with schema generation. Demonstrates the data-fetch pattern that should be replicated for PRs and workflow runs.

`shared/copilot-session-data-fetch.md` - 5+ Imports

Extends the data-fetch pattern for Copilot session analysis. Shows how specialized data sources can follow the same caching pattern.

`shared/python-dataviz.md` - 15+ Imports

Provides Python scientific libraries and chart generation capabilities. Eliminates setup duplication across data visualization workflows.

These components show that well-designed shared skills reduce duplication, improve consistency, and simplify maintenance across the workflow ecosystem.

📊 Key Insights

Strong shared component adoption: 66% of workflows use imports, showing cultural acceptance of shared skills
Successful patterns exist: issues-data-fetch.md and copilot-session-data-fetch.md demonstrate a proven pattern for data fetching components
Clear gaps identified: PR data, workflow run data, and possibly discussions data need standardized fetch components
Security boundaries matter: Network configs and permissions should NOT be over-shared - they represent explicit security choices
Workflow personality is valuable: Custom messages and themed outputs add delight and should remain workflow-specific
Documentation drives adoption: Well-documented shared components (like reporting.md) see wider adoption
Incremental extraction works: The repository shows evidence of gradual skill extraction over time, with newer workflows increasingly using shared components

Analysis Date: 2026-02-04
Analyzer: Workflow Skill Extractor v1.0
Workflows Analyzed: 145
Shared Components Reviewed: 58
Recommendations Generated: 3 actionable issues

AI generated by Workflow Skill Extractor

expires on Feb 12, 2026, 12:06 AM UTC

2026-02-12T01:01:10Z

github-actions[bot]
bot Feb 12, 2026
Author

This discussion was automatically closed because it expired on 2026-02-12T00:06:38.092Z.

Closed by Workflow
