[prompt-analysis] Copilot PR Prompt Analysis - February 11, 2026 #14945
Executive Summary
Analysis of 1,000 Copilot-generated PRs from the last 30 days reveals a 65.9% merge rate, with distinct patterns emerging between successful and unsuccessful prompts. Bug fixes and feature additions show the highest success rates (~66%), while refactoring PRs merge slightly less often (62%). Prompts behind merged PRs are notably more concise (3,314 characters on average) than prompts behind closed PRs (3,554 characters), suggesting that clarity and focus matter more than verbosity.
Key Finding: The action verbs used in PR titles strongly correlate with outcomes—"add" and "fix" dominate successful PRs, while "verify" appears more frequently in closed PRs, potentially indicating investigative or exploratory work that doesn't always result in mergeable changes.
Key Metrics
Overall Success Rate: 65.9% (merged PRs / completed PRs)
Category Analysis and Success Rates
Insight: Bug fixes, feature additions, and documentation updates have the highest success rates (66%+), while refactoring has a moderately lower rate (62%). This suggests that well-defined, concrete changes are more likely to merge than architectural improvements.
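The report doesn't spell out how PRs were binned into categories, but a simple title-cue classifier is enough to reproduce this kind of breakdown. A minimal sketch follows; the cue words are illustrative guesses, not the report's actual rules:

```python
# Hypothetical title-based categorizer; cue words are illustrative only,
# not the rules the report actually used.
CATEGORY_RULES = [
    ("bug_fix",  ("fix", "resolve", "correct")),
    ("feature",  ("add", "implement", "support")),
    ("refactor", ("refactor", "extract", "simplify")),
    ("docs",     ("document", "docs", "readme")),
    ("testing",  ("test", "verify", "validate")),
]

def categorize(title: str) -> str:
    lowered = title.lower()
    for category, cues in CATEGORY_RULES:
        if any(cue in lowered for cue in cues):
            return category
    return "other"

assert categorize("Fix error messages not shown in gh aw compile output") == "bug_fix"
```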
Detailed Prompt Characteristics Analysis
Prompt Length Analysis
Finding: Successful prompts are 7% shorter on average. This suggests that concise, focused prompts perform better than lengthy, verbose ones. Excessive detail may indicate uncertainty or scope creep.
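As a rough sketch of how this comparison can be computed (the record layout with `prompt` and `state` fields is an assumption, not the report's actual schema):

```python
from statistics import mean

# Toy records standing in for the real dataset of 1,000 PRs.
prs = [
    {"prompt": "Fix error messages not shown in gh aw compile output. ...", "state": "merged"},
    {"prompt": "Verify roles: all is valid and compiles correctly. ...", "state": "closed"},
]

def mean_prompt_length(prs, state):
    lengths = [len(pr["prompt"]) for pr in prs if pr["state"] == state]
    return mean(lengths) if lengths else 0.0

merged_avg = mean_prompt_length(prs, "merged")
closed_avg = mean_prompt_length(prs, "closed")
print(f"merged avg: {merged_avg:.0f} chars, closed avg: {closed_avg:.0f} chars")
# With the report's figures: (3554 - 3314) / 3554 ≈ 6.8%, i.e. ~7% shorter.
```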
Action Verb Analysis
Most Common Verbs in MERGED PRs:
Most Common Verbs in CLOSED PRs:
Key Difference: "Verify" appears in the top 5 for closed PRs but not merged ones. This suggests that exploratory or investigative PRs (those that verify/validate assumptions) are less likely to result in merged code changes.
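One simple way to tally leading verbs is to take the first word of each title, which matches the imperative convention of PR titles; whether the report normalizes titles exactly this way is an assumption:

```python
from collections import Counter

def leading_verb(title: str) -> str:
    # Skip bracketed prefixes like "[WIP]" before taking the first word.
    words = [w for w in title.split() if not w.startswith("[")]
    return words[0].lower().strip(":") if words else ""

merged_titles = [
    "Fix error messages not shown in gh aw compile output",
    "Extract duplicate expires preprocessing logic into shared helper",
]
closed_titles = ["Verify roles: all is valid and compiles correctly"]

print(Counter(leading_verb(t) for t in merged_titles).most_common(5))
print(Counter(leading_verb(t) for t in closed_titles).most_common(5))
```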
Top Keywords Comparison
MERGED PR Titles: mcp, cli, validation, tool, agent, add, fix, remove, workflow, test
CLOSED PR Titles: verify, failure, project
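A keyword comparison like the one above can be sketched with a stopword-filtered token count; the stopword list here is ad hoc, and the set difference only approximates whatever ranking the report used:

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "and", "is", "in", "not", "all", "for", "into", "to"}

def keywords(titles):
    counts = Counter()
    for title in titles:
        counts.update(t for t in re.findall(r"[a-z]+", title.lower())
                      if t not in STOPWORDS and len(t) > 2)
    return counts

merged_kw = keywords(["Fix error messages not shown in gh aw compile output"])
closed_kw = keywords(["Verify roles: all is valid and compiles correctly"])

# Keywords frequent in closed PRs but absent from merged ones ("verify" here).
print([w for w, _ in closed_kw.most_common(10) if w not in merged_kw])
```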
✅ Successful Prompt Patterns
Characteristics of High-Success PRs:
Concrete, Actionable Tasks
Problem-Solution Structure
Shorter, Focused Prompts
Action Verbs that Signal Clear Intent
❌ Unsuccessful Prompt Patterns
Characteristics of Closed PRs:
Exploratory or WIP Work
Longer Prompts with Excessive Detail
Vague "Update" Operations
Investigative Verbs
Key Insights and Patterns
1. Specificity Wins
Successful prompts reference specific files, errors, versions, or components:
2. Problem-Solution Over Investigation
Prompts that clearly state a problem and propose a solution have higher success rates than those that investigate or verify:
3. Conciseness Signals Clarity
7% shorter prompts correlate with higher merge rates. This suggests:
4. Documentation and Testing Are Highly Mergeable
Documentation (66.1% success) and testing (65.7% success) PRs have similar success rates to bug fixes, indicating that:
Recommendations
Based on the analysis, here are best practices for writing Copilot prompts that lead to successful PR merges:
✅ DO:
Be Specific and Concrete
Use Action-Oriented Verbs
Keep Prompts Focused
Structure as Problem → Solution
Choose High-Success Categories
❌ AVOID:
Exploratory or WIP Prompts
Vague "Update" Requests
Over-Detailed Prompts
Unfocused Refactoring
Example Prompts: Good vs Poor
✅ Example: Excellent Prompt (Merged)
PR #14901: "Fix error messages not shown in gh aw compile output"
Prompt Preview:
Why It Succeeded:
References the exact command involved (gh aw compile)
✅ Example: Good Prompt (Merged)
PR #14899: "Extract duplicate expires preprocessing logic into shared helper"
Prompt Preview:
Why It Succeeded:
❌ Example: Poor Prompt (Closed)
PR #14900: "Verify roles: all is valid and compiles correctly"
Prompt Preview:
Why It Failed:
❌ Example: Poor Prompt (Closed)
PR #14905: "[WIP] Fix MCP configuration to enforce tool allowlist"
Prompt Preview:
Why It Failed:
Historical Trends
Note: This is the first comprehensive analysis with the new tracking system. Future reports will show week-over-week trends and pattern changes.
Methodology
Data Collection:
PRs authored by app/copilot-swe-agent (see the collection sketch below)
Categorization:
Analysis Techniques:
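For reference, a minimal data-collection sketch against the GitHub Search API. The exact query, date window, and pagination are assumptions about how this report gathered its data; note that the search endpoint caps results at 1,000, which happens to match the sample size here:

```python
import requests

API = "https://api.github.com/search/issues"
# Assumed query: Copilot-agent-authored PRs created in the last 30 days.
QUERY = "is:pr author:app/copilot-swe-agent created:>=2026-01-12"

def fetch_prs(max_pages: int = 10) -> list[dict]:
    prs = []
    for page in range(1, max_pages + 1):
        resp = requests.get(
            API,
            params={"q": QUERY, "per_page": 100, "page": page},
            headers={"Accept": "application/vnd.github+json"},
        )
        resp.raise_for_status()
        items = resp.json()["items"]
        if not items:
            break
        for item in items:
            # Search results flag merged PRs via pull_request.merged_at.
            merged = (item.get("pull_request") or {}).get("merged_at") is not None
            prs.append({"title": item["title"],
                        "state": "merged" if merged else item["state"]})
    return prs
```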
Next Steps
References: