[Workflow Skill Extractor] Analysis Report - 145 Workflows Analyzed #13819
Closed
This discussion was automatically closed because it expired on 2026-02-12T00:06:38.092Z.
Workflow Skill Extractor Report
🎯 Executive Summary
This analysis examined 145 agentic workflows and 58 existing shared components to identify opportunities for extracting reusable skills and reducing code duplication. The repository demonstrates strong adoption of shared components with 96 workflows (66%) already using imports, and 40+ workflows leveraging the standardized reporting guidelines.
Key Findings:
shared/reporting.md, shared/issues-data-fetch.md, and shared/copilot-session-data-fetch.md are success stories
Key Statistics:
📊 Analysis Overview
Workflows Analyzed
The analysis covered the full spectrum of agentic workflows including:
Daily Report Workflows (30+):
daily-issues-report.md, daily-cli-performance.md, daily-code-metrics.md
daily-compiler-quality.md, daily-copilot-token-report.md, daily-firewall-report.md
Security & Compliance Workflows (10+):
security-guard.md, security-compliance.md, code-scanning-fixer.md
daily-secrets-analysis.md, daily-malicious-code-scan.md, daily-semgrep-scan.md
Triage & Analysis Workflows (15+):
issue-classifier.md, auto-triage-issues.md, issue-triage-agent.md
pr-triage-agent.md, ci-doctor.md, copilot-pr-merged-report.md
Meta-Orchestrator Workflows (8+):
workflow-health-manager.md, workflow-normalizer.md, mcp-inspector.md
Test & Smoke Workflows (8+):
smoke-copilot.md, smoke-claude.md, smoke-codex.md, smoke-opencode.md
Existing Shared Components
The repository already has 58 shared components organized into categories:
Core Shared Components (37 files in shared/):
issues-data-fetch.md, copilot-session-data-fetch.md, pr-data-safe-input.md, weekly-issues-data-fetch.md, copilot-pr-data-fetch.md, discussions-data-fetch.md, github-queries-safe-input.md
python-dataviz.md, charts-with-trending.md, trends.md, trending-charts-simple.md, session-analysis-charts.md
reporting.md (40+ imports), keep-it-short.md, use-emojis.md
ci-data-analysis.md, ci-optimization-strategies.md, metrics-patterns.md, session-analysis-strategies.md, token-cost-analysis.md
gh.md, jqschema.md, go-make.md, genaiscript.md, sq.md
app-config.md, changeset-format.md, github-mcp-app.md, safe-output-app.md, docs-server-lifecycle.md
opencode.md, ollama-threat-scan.md, actions-ai-inference.md
secret-redaction-test.md
mcp-debug.md, mcp-pagination.md, ffmpeg.md
MCP Server Components (21 files in shared/mcp/):
arxiv.md, ast-grep.md, azure.md, brave.md, chroma.md, context7.md
datadog.md, deepwiki.md, drain3.md, fabric-rti.md, jupyter.md
markitdown.md, microsoft-docs.md, notion.md, semgrep.md, sentry.md
server-memory.md, skillz.md, slack.md, svelte.md, tavily.md
🔍 Identified Skills
High Priority Skills
1. PR Data Fetch Shared Component
Frequency: Used in 10+ workflows
Size: ~400 lines could be saved
Priority: HIGH
Description: Standardize GitHub pull request data fetching with caching pattern
Currently, workflows like copilot-pr-merged-report.md, pr-triage-agent.md, security-guard.md, and others each implement their own PR data fetching logic. This leads to inconsistent data structures and duplicated API calls.
Workflows Using This Pattern:
copilot-pr-merged-report.md - Fetches merged PRs from last 24 hours
pr-triage-agent.md - Fetches PRs for triage
copilot-pr-nlp-analysis.md - Analyzes PR text content
copilot-pr-prompt-analysis.md - Analyzes PR prompts
security-guard.md - Analyzes PR security posture
copilot-agent-analysis.md - Analyzes Copilot-created PRs
security-compliance.md - Checks PR security compliance
pr-nitpick-reviewer.md - Reviews PRs for nitpicks
draft-pr-cleanup.md - Cleans up stale draft PRs
grumpy-reviewer.md - Reviews PRs with critical eye
Recommendation: Create shared/pr-data-fetch.md following the successful pattern of shared/issues-data-fetch.md and shared/copilot-session-data-fetch.md.
GitHub Issue Created: #aw_pr_data_fetch
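The fetch-with-caching idea behind this recommendation can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the contents of any existing shared component: the function name load_prs, the injected fetch callable, and the one-hour freshness window are hypothetical. Only the cache path comes from this report's proposal.

```python
import json
import time
from pathlib import Path
from typing import Callable

# Cache location taken from the report's proposal; everything else is illustrative.
DEFAULT_CACHE = Path("/tmp/gh-aw/pr-data/prs.json")


def load_prs(fetch: Callable[[], list],
             cache_path: Path = DEFAULT_CACHE,
             max_age_seconds: float = 3600.0) -> list:
    """Return PR records, reusing the cache file when it is fresh enough.

    `fetch` is a hypothetical callable that performs the real API query;
    it is injected so the caching logic stays independent of the API client.
    """
    if cache_path.exists():
        age = time.time() - cache_path.stat().st_mtime
        if age < max_age_seconds:
            return json.loads(cache_path.read_text(encoding="utf-8"))

    # Cache miss or stale cache: fetch once and persist for later steps.
    prs = fetch()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(prs, indent=2), encoding="utf-8")
    return prs
```

Passing the fetcher in as a parameter means every workflow would share one cache file and one data shape, which is the consistency gain the recommendation is after.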
2. Workflow Run Data Fetch Shared Component
Frequency: Used in 8+ workflows
Size: ~300 lines could be saved
Priority: HIGH
Description: Standardize GitHub Actions workflow run data fetching with caching
Workflows that monitor, analyze, or report on GitHub Actions workflow runs currently duplicate the logic for querying the Actions API, processing run data, and caching results.
Workflows Using This Pattern:
ci-doctor.md - Investigates failed CI workflow runs
daily-cli-performance.md - Analyzes CLI compilation performance
workflow-health-manager.md - Monitors health of all workflows
daily-firewall-report.md - Reports on firewall configurations
daily-observability-report.md - Generates observability metrics
metrics-collector.md - Collects workflow execution metrics
repo-audit-analyzer.md - Audits repository configurations
dev-hawk.md - Monitors development workflow patterns
Recommendation: Create shared/workflow-runs-data-fetch.md with caching strategy similar to existing data-fetch components.
GitHub Issue Created: #aw_workflow_runs_fetch
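As an illustration of the "processing run data" step that these workflows currently duplicate, the sketch below groups run records by workflow and computes a failure rate. It assumes each record is a dict carrying the name and conclusion fields found on GitHub Actions workflow run objects; the function name summarize_runs and the output shape are hypothetical.

```python
from collections import Counter, defaultdict


def summarize_runs(runs: list) -> dict:
    """Group workflow runs by workflow name and count their conclusions."""
    by_workflow = defaultdict(Counter)
    for run in runs:
        # Runs that have not finished have no conclusion yet; bucket them separately.
        conclusion = run.get("conclusion") or "pending"
        by_workflow[run.get("name", "unknown")][conclusion] += 1

    summary = {}
    for name, counts in by_workflow.items():
        completed = sum(n for c, n in counts.items() if c != "pending")
        failures = counts.get("failure", 0)
        summary[name] = {
            "counts": dict(counts),
            # Failure rate over completed runs only; 0.0 when nothing has completed.
            "failure_rate": failures / completed if completed else 0.0,
        }
    return summary
```

A shared component could expose one summary like this so that health monitors and daily reports read the same numbers instead of recomputing them separately.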
3. Discussions Data Fetch Shared Component
Frequency: Used in 5+ workflows
Size: ~200 lines could be saved
Priority: MEDIUM
Description: Standardize GitHub Discussions data fetching (may already exist)
Note: The file shared/discussions-data-fetch.md already exists! This recommendation is to verify its implementation quality and promote its adoption across workflows that currently implement their own discussions fetching logic.
Workflows That Could Benefit:
discussion-task-miner.md - Mines discussions for tasks
daily-issues-report.md - Includes discussions in reports
weekly-issue-summary.md - Summarizes issues and discussions
daily-fact.md - Posts to discussions
auto-triage-issues.md - Creates discussion reports
Recommendation: Review existing shared/discussions-data-fetch.md and enhance documentation if needed. Promote adoption across workflows.
GitHub Issue Created: #aw_discussions_fetch
Medium Priority Skills
4. Report Formatting Guidelines ✅
Status: ALREADY IMPLEMENTED as shared/reporting.md
Frequency: Imported by 40+ workflows
Impact: SUCCESS STORY
This is an excellent example of successful skill extraction! The shared/reporting.md component provides:
<details><summary> tags
No action needed - this demonstrates the value of shared components!
5. Custom Safe-Output Messages
Frequency: 40+ workflows define custom messages
Priority: LOW
Description: Themed messages for workflow personality
Many workflows define creative, themed custom messages:
ci-doctor.md: Medical theme (🩺 🏥 💊)
security-guard.md: Security theme (🛡️ 🔒)
daily-fact.md: Poetic theme (🪶 📜 ✨)
brave.md: Search theme (🦁 🔍)
Recommendation: DO NOT EXTRACT - Message customization is part of workflow personality and should remain workflow-specific. Standardizing messages would reduce the delight and character of individual workflows.
6. Network Configuration Patterns
Frequency: 66+ workflows define network permissions
Priority: LOW
Description: Common network allowlists for GitHub API access
Recommendation: DO NOT EXTRACT - Network configurations represent security boundaries and should be explicitly declared per workflow. Over-sharing could obscure critical security requirements.
Low Priority Skills
View Additional Patterns Considered
GitHub Toolsets Configuration
Pattern: toolsets: [default] appears in 39 workflows
Assessment: Already minimal (2-3 lines). Low value for extraction.
Bash Command Allowlists
Pattern: bash: ["jq *", "gh api *", "date *", "mkdir *", "cp *"]
Assessment: Too workflow-specific. Each workflow has unique bash command needs.
Safe-Output Discussion Patterns
Pattern: create-discussion with close-older-discussions: true
Assessment: Only 4-5 lines and varies significantly (category, title-prefix, expires).
📈 Impact Analysis
By Category
By Priority
Success Stories
The analysis identified several existing shared components that are highly successful:
shared/reporting.md - Imported by 40+ workflows, standardizes report formatting
shared/issues-data-fetch.md - Used by 10+ workflows, demonstrates the data-fetch pattern
shared/copilot-session-data-fetch.md - Used by 5+ workflows, extends the data-fetch pattern
shared/python-dataviz.md - Used by 15+ workflows for Python charts
shared/gh.md - Provides authenticated GitHub CLI access via safe-inputs
These components show the value of extraction: reduced duplication, consistent patterns, and easier maintenance.
💡 Detailed Recommendations
Recommendation 1: PR Data Fetch Shared Component
Full Implementation Details
Current State:
Workflows currently fetch PR data with inconsistent approaches:
Proposed Shared Component:
Create shared/pr-data-fetch.md with:
/tmp/gh-aw/pr-data/prs.json
Migration Path:
Impact:
Recommendation 2: Workflow Run Data Fetch Shared Component
Full Implementation Details
Current State:
Workflows fetch workflow run data with varied approaches:
Proposed Shared Component:
Create shared/workflow-runs-data-fetch.md with:
/tmp/gh-aw/workflow-runs-data/runs.json
Migration Path:
Impact:
Extension Opportunities:
Recommendation 3: Discussions Data Fetch Review
Full Implementation Details
Current State:
A file shared/discussions-data-fetch.md already exists but adoption is unclear. Workflows that could benefit:
Proposed Action:
Migration Path:
Impact:
Note: Lower priority because discussions API is more complex (GraphQL) and fewer workflows need this data compared to PRs or workflow runs.
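To make the GraphQL complexity concrete, here is a minimal sketch of a paginated discussions request. It is not taken from shared/discussions-data-fetch.md; the helper names build_request and next_cursor, the page size, and the selected fields are assumptions chosen for illustration. The query uses the repository discussions connection with cursor-based pagination from GitHub's GraphQL API.

```python
import json
from typing import Optional

# Illustrative query: the selected fields are a small subset of what a report might need.
DISCUSSIONS_QUERY = """
query($owner: String!, $name: String!, $first: Int!, $after: String) {
  repository(owner: $owner, name: $name) {
    discussions(first: $first, after: $after) {
      nodes { number title createdAt url category { name } }
      pageInfo { hasNextPage endCursor }
    }
  }
}
"""


def build_request(owner: str, name: str, first: int = 50,
                  after: Optional[str] = None) -> str:
    """Serialize the JSON body for one page of discussions."""
    variables = {"owner": owner, "name": name, "first": first, "after": after}
    return json.dumps({"query": DISCUSSIONS_QUERY, "variables": variables})


def next_cursor(response: dict) -> Optional[str]:
    """Return the cursor for the following page, or None on the last page."""
    page = response["data"]["repository"]["discussions"]["pageInfo"]
    return page["endCursor"] if page["hasNextPage"] else None
```

A caller would send each body to the GraphQL endpoint and keep requesting pages until next_cursor returns None. That loop is the kind of logic that is easy to get subtly wrong when every workflow reimplements it.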
✅ Created Issues
This analysis has created 3 actionable issues for skill extraction:
Issue #aw_pr_data_fetch: Extract PR Data Fetch into shared component
Issue #aw_workflow_runs_fetch: Extract Workflow Run Data Fetch into shared component
Issue #aw_discussions_fetch: Extract Discussions Data Fetch into shared component (review existing)
🎯 Next Steps
📚 Methodology
This analysis used the following approach:
Scope:
.github/workflows/shared/ and shared/mcp/
Analysis Technique:
tools:, imports:, safe-outputs:, network: patterns
Prioritization Criteria:
Quality Checks:
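The pattern scan named under Analysis Technique could look roughly like the sketch below. It is not the analyzer's actual code. It assumes each workflow is a markdown file whose configuration sits in a leading frontmatter block delimited by --- lines, and the names TRACKED_KEYS, frontmatter_keys, and count_patterns are hypothetical.

```python
import re
from collections import Counter

# The four configuration keys this report's methodology mentions.
TRACKED_KEYS = ("tools", "imports", "safe-outputs", "network")


def frontmatter_keys(markdown_text: str) -> set:
    """Return the tracked top-level keys found in a leading '---' frontmatter block."""
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()
    found = set()
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter; the markdown body is ignored
        # Only unindented "key:" lines count as top-level keys.
        match = re.match(r"^([A-Za-z0-9_-]+):", line)
        if match and match.group(1) in TRACKED_KEYS:
            found.add(match.group(1))
    return found


def count_patterns(workflows: dict) -> Counter:
    """Count how many workflows (filename -> text) declare each tracked key."""
    counts = Counter()
    for text in workflows.values():
        counts.update(frontmatter_keys(text))
    return counts
```

Dividing a count by the number of workflows gives adoption figures of the kind quoted in this report, such as 96 of 145 workflows (66%) using imports.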
🏆 Success Stories
The analysis identified several highly successful shared components that demonstrate the value of skill extraction:
shared/reporting.md - 40+ Imports
Standardizes report formatting with header guidelines, progressive disclosure patterns, and design principles. Reduces 60+ lines per workflow.
shared/issues-data-fetch.md - 10+ Imports
Provides cached GitHub issues data with schema generation. Demonstrates the data-fetch pattern that should be replicated for PRs and workflow runs.
shared/copilot-session-data-fetch.md - 5+ Imports
Extends the data-fetch pattern for Copilot session analysis. Shows how specialized data sources can follow the same caching pattern.
shared/python-dataviz.md - 15+ Imports
Provides Python scientific libraries and chart generation capabilities. Eliminates setup duplication across data visualization workflows.
These components show that well-designed shared skills reduce duplication, improve consistency, and simplify maintenance across the workflow ecosystem.
📊 Key Insights
Strong shared component adoption: 66% of workflows use imports, showing cultural acceptance of shared skills
Successful patterns exist: issues-data-fetch.md and copilot-session-data-fetch.md demonstrate a proven pattern for data fetching components
Clear gaps identified: PR data, workflow run data, and possibly discussions data need standardized fetch components
Security boundaries matter: Network configs and permissions should NOT be over-shared - they represent explicit security choices
Workflow personality is valuable: Custom messages and themed outputs add delight and should remain workflow-specific
Documentation drives adoption: Well-documented shared components (like reporting.md) see wider adoption
Incremental extraction works: The repository shows evidence of gradual skill extraction over time, with newer workflows increasingly using shared components
Analysis Date: 2026-02-04
Analyzer: Workflow Skill Extractor v1.0
Workflows Analyzed: 145
Shared Components Reviewed: 58
Recommendations Generated: 3 actionable issues