📊 Agentic Workflow Lock File Statistics - February 2026 #14079
Closed
This discussion was automatically closed because it expired on 2026-02-13T08:32:30.896Z.
Executive Summary
Comprehensive statistical analysis of 145 agentic workflow lock files in the github/gh-aw repository, revealing usage patterns, popular triggers, structural characteristics, and configuration trends.
Key Metrics:
File Size Distribution
Size Extremes:
Smallest: codex-github-remote-mcp-test.lock.yml (22 KB)
Largest: smoke-claude.lock.yml (106 KB)
Key Insight: The vast majority (69%) of lock files fall into the 50-70 KB range, indicating consistent workflow complexity across the repository.
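A size distribution of this kind can be produced by bucketing file sizes. The following is a minimal sketch, not the report's actual script: the 10 KB bucket width, the helper names `bucket_sizes` and `share_in_range`, and the sample values (other than the 22 KB and 106 KB extremes) are assumptions for illustration.

```python
from collections import Counter


def bucket_sizes(sizes_kb, width=10):
    """Count sizes per [lo, lo + width) KB bucket, keyed by the lower bound."""
    counts = Counter((int(size) // width) * width for size in sizes_kb)
    return dict(sorted(counts.items()))


def share_in_range(sizes_kb, lo, hi):
    """Return the fraction of sizes with lo <= size < hi."""
    if not sizes_kb:
        return 0.0
    return sum(lo <= size < hi for size in sizes_kb) / len(sizes_kb)


# Hypothetical sample; only the two extremes come from the report.
sample = [22, 55, 58, 61, 64, 69, 73, 106]
print(bucket_sizes(sample))
print(f"{share_in_range(sample, 50, 70):.1%} of files fall in 50-70 KB")
```

On real data, the sizes could be collected with `os.path.getsize` over the lock files and divided by 1024.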
Trigger Analysis
Most Popular Triggers
Common Trigger Combinations
Schedule + Manual (schedule, workflow_dispatch): 95 workflows (65.5%)
Manual Only (workflow_dispatch): 19 workflows (13.1%)
Multi-Source (pull_request, schedule, workflow_dispatch): 6 workflows (4.1%)
Interactive Multi-Trigger (all event types): 3 workflows (2.1%)
Pure Event-Driven (issues, issue_comment, etc.): Various counts
Schedule Patterns
View Detailed Schedule Distribution
0 13 * * 1-5
0 14 * * 1-5
0 11 * * 1-5
0 10 * * 1-5
0 9 * * 1-5
0 15 * * 1-5
0 16 * * 1-5
0 7 * * 1-5
5 12 * * *
31 */12 * * *
Pattern Insight: Schedules are deliberately staggered throughout business hours (7 AM - 4 PM UTC) to distribute load and avoid resource contention. Most workflows run on weekdays only.
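The weekday-only and staggered-hour observations can be checked directly against the listed cron expressions. This is a minimal sketch under stated assumptions: the helper names (`parse_cron`, `is_weekday_only`, `fixed_hour`) are hypothetical, and "weekday-only" is taken to mean a day-of-week field of exactly `1-5`.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def parse_cron(expr):
    """Split a standard 5-field cron expression into named fields."""
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {expr!r}")
    return dict(zip(CRON_FIELDS, fields))


def is_weekday_only(expr):
    """Assumption: weekday-only means the day-of-week field is exactly '1-5'."""
    return parse_cron(expr)["day_of_week"] == "1-5"


def fixed_hour(expr):
    """Return the hour as an int when it is a single literal, else None."""
    hour = parse_cron(expr)["hour"]
    return int(hour) if hour.isdigit() else None


# The cron expressions listed in the schedule distribution.
SCHEDULES = [
    "0 13 * * 1-5", "0 14 * * 1-5", "0 11 * * 1-5", "0 10 * * 1-5",
    "0 9 * * 1-5", "0 15 * * 1-5", "0 16 * * 1-5", "0 7 * * 1-5",
    "5 12 * * *", "31 */12 * * *",
]

weekday_only = [s for s in SCHEDULES if is_weekday_only(s)]
hours = sorted(h for s in SCHEDULES if (h := fixed_hour(s)) is not None)
print(f"{len(weekday_only)} of {len(SCHEDULES)} listed schedules are weekday-only")
print("fixed hours (UTC):", hours)
```

Step expressions such as `*/12` have no single hour, so `fixed_hour` skips them rather than guessing.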
Safe Outputs Analysis
Safe outputs enable workflows to create GitHub resources (discussions, issues, comments) in a controlled manner.
noop: 1,243 uses
missing_tool: 1,108 uses
missing_data: 414 uses
add-comment: 90 uses
Key Findings:
Transparency First: The noop safe output is the most common (1,243 uses), indicating workflows frequently log completion status even when no changes are needed. This ensures visibility into workflow execution.
Limitations Reporting: missing_tool (1,108) and missing_data (414) are heavily used to report capability gaps and data unavailability, providing valuable feedback about workflow constraints.
Controlled Interactions: Only 90 uses of add-comment show disciplined use of GitHub API mutations, preventing spam and maintaining clean issue/PR threads.
Discussion Category: When creating discussions, workflows primarily target the "audits" category for reports and analysis results.
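The report does not say how safe output uses were counted. One plausible approach, sketched here as an assumption, is a whole-token text match for each safe output name across the lock file contents. The function name `count_safe_outputs` and the sample text are hypothetical.

```python
import re
from collections import Counter

SAFE_OUTPUTS = ("noop", "missing_tool", "missing_data", "add-comment")


def count_safe_outputs(text, names=SAFE_OUTPUTS):
    """Count whole-token occurrences of each safe output name in text."""
    counts = Counter()
    for name in names:
        # Lookarounds reject matches embedded in longer identifiers,
        # e.g. "noop" inside "noop_extra".
        pattern = r"(?<![\w-])" + re.escape(name) + r"(?![\w-])"
        counts[name] = len(re.findall(pattern, text))
    return counts


# Hypothetical snippet standing in for lock file contents.
sample_text = "noop: {}\nmissing_tool: {}\nadd-comment:\n  max: 1\nnoop\nnoop_extra"
print(count_safe_outputs(sample_text))
```

A plain text match also counts mentions in comments and prompts, so totals from this approach describe textual occurrences rather than configured outputs.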
Structural Characteristics
Job Complexity
firewall-escape.lock.yml)
Typical Job Structure:
Step Complexity
daily-copilot-token-report.lock.yml)
Step Distribution Pattern:
Permission Patterns
Most Common Permissions
Permission Distribution Insights:
Read-Heavy Contents Access: 652 read vs. 74 write for contents permission indicates workflows primarily analyze code rather than modify it. Write access is reserved for specific workflows that need to commit changes.
Issue Management: 314 write permissions for issues (vs. 131 read) shows active issue creation and management, likely for reporting, triage, and automation.
Pull Request Engagement: 240 write permissions for pull-requests indicates workflows actively comment on, review, or create PRs.
Discussion Creation: 270 write permissions for discussions aligns with the repository's emphasis on using discussions for audit reports and analysis results.
Minimal Permissions: All workflows follow the principle of least privilege, requesting only necessary permissions for their specific job steps.
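The read/write balance quoted above can be expressed as a write share per permission scope. This sketch uses only the two scopes for which the report states both read and write counts; the helper name `write_share` is an assumption for illustration.

```python
# Counts quoted in the report; only scopes with both read and write figures.
PERMISSION_COUNTS = {
    "contents": {"read": 652, "write": 74},
    "issues": {"read": 131, "write": 314},
}


def write_share(counts):
    """Return write / (read + write), or 0.0 when there are no grants."""
    total = counts["read"] + counts["write"]
    return counts["write"] / total if total else 0.0


for scope, counts in PERMISSION_COUNTS.items():
    print(f"{scope}: {write_share(counts):.1%} write")
```

The contrast between the two scopes is the point: `contents` is overwhelmingly read-only, while `issues` is mostly write.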
MCP Server Usage
MCP (Model Context Protocol) servers provide specialized capabilities to agentic workflows.
Findings:
GitHub-Centric: The github MCP server dominates with 35 uses, reflecting workflows' primary focus on repository analysis, code review, and GitHub resource management.
Web Automation: Playwright MCP server (5 uses) enables workflows to interact with web UIs, test web applications, or gather data from web sources.
Specialized Research: Arxiv and deepwiki MCP servers show experimental use of specialized knowledge sources, potentially for research-oriented workflows.
Interesting Findings
High Manual Trigger Adoption (88.3%)
workflow_dispatch, enabling on-demand execution
Scheduled Workflow Dominance (71.7%)
Consistent File Size (69% in 50-70 KB range)
High Step Count (avg 71.6 steps)
Safe Output Discipline
noop (1,243) demonstrates commitment to transparency
missing_tool/missing_data counts (1,522 combined) show workflows gracefully handle limitations
add-comment count (90) prevents notification spam
Multi-Job Architecture (avg 6 jobs)
Weekday-Only Schedules
1-5 in cron)
Minimal Write Permissions
contents: 652 vs 74)
Statistical Profile: The "Typical" Agentic Workflow
Based on median and average values, a typical .lock.yml file in this repository has:
schedule + workflow_dispatch (65.5% use this combo)
contents: read
issues: write
pull-requests: read
discussions: write
noop for transparency, occasional add-comment
Recommendations
Based on the analysis, here are actionable recommendations:
Template Standardization
Schedule Distribution
Safe Output Expansion
create-pull-request and create-issue safe outputs if not already available
add-comment count suggests conservative use - maintain this discipline
MCP Server Adoption
Permission Optimization
contents) is ideal
Documentation
Monitoring
Historical Analysis
Methodology
Analysis Tools: Bash scripts and Python 3 with regex-based YAML parsing (no external dependencies)
Lock Files Analyzed: 145
Cache Memory: Used /tmp/gh-aw/cache-memory/ for script persistence and historical data tracking
Data Sources: All .lock.yml files in .github/workflows/ directory
Analysis Scripts:
/tmp/gh-aw/cache-memory/scripts/analyze_lockfiles.sh - Bash-based extraction
/tmp/gh-aw/cache-memory/scripts/comprehensive_analysis_v2.py - Python statistical aggregation
Verification: Cross-referenced multiple data extraction methods to ensure accuracy
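The analysis scripts themselves are not shown. As an illustration of what regex-based YAML parsing with no external dependencies might look like, the sketch below extracts trigger names from a lock file's top-level `on:` block and tallies trigger combinations. It assumes the mapping form of `on:` with two-space indentation; the function names and the sample workflow are hypothetical, not taken from the actual scripts.

```python
import re
from collections import Counter

ON_KEY = re.compile(r'^["\']?on["\']?:\s*$')   # matches on: or "on":
TRIGGER_KEY = re.compile(r"^ {2}([A-Za-z_]+):")  # key indented by two spaces


def extract_triggers(yaml_text):
    """Return sorted trigger names from the top-level on: mapping."""
    triggers = set()
    in_on_block = False
    for line in yaml_text.splitlines():
        if ON_KEY.match(line):
            in_on_block = True
            continue
        if in_on_block:
            # A non-indented, non-empty line starts the next top-level key.
            if line.strip() and not line.startswith((" ", "\t")):
                break
            match = TRIGGER_KEY.match(line)
            if match:
                triggers.add(match.group(1))
    return sorted(triggers)


def combination_counts(yaml_texts):
    """Tally how often each trigger combination occurs across files."""
    return Counter(tuple(extract_triggers(text)) for text in yaml_texts)


# Hypothetical lock file excerpt.
SAMPLE = '''name: demo
"on":
  schedule:
  - cron: "0 13 * * 1-5"
  workflow_dispatch:
permissions: {}
jobs:
  agent:
    runs-on: ubuntu-latest
'''

print(extract_triggers(SAMPLE))
print(combination_counts([SAMPLE, SAMPLE]))
```

A regex pass like this is brittle against other YAML layouts (inline lists, different indentation), which is one reason cross-referencing several extraction methods, as the methodology describes, is worthwhile.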