Flakiness is the product.
A flakiness-first test reporting system that aggregates Playwright reporter output and produces metrics for all test runs. Built to answer: "What is broken in our test suite?" rather than just "What failed on this run?"
It identifies and prioritizes flaky tests, running on Cloudflare Workers with raw data stored in R2 and metrics in Durable Objects (SQLite).
The easiest way to get started is by deploying to Cloudflare:
- Click the Deploy to Cloudflare Workers button above.
- Follow the prompts to connect your GitHub account and deploy.
- Once deployed, you'll have a URL for your metrics dashboard (e.g., https://playwright-metrics.your-subdomain.workers.dev).
To track metrics, you need to upload Playwright JSON reports to your server.
- Reporter: Use the built-in JSON reporter in your Playwright project.
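    // playwright.config.ts
    import { defineConfig } from '@playwright/test';

    export default defineConfig({
      // Write the JSON report to a fixed path so the upload step can find it.
      reporter: [['json', { outputFile: 'playwright-report/report.json' }]],
    });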
- Upload Script: Use a script to POST the JSON report to the /upload/ endpoint.
We provide a specialized upload script in upload-playwright-report.tsx.
To use it in your project, check the example script for usage instructions and required environment variables.
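For orientation, here is a minimal sketch of what such an upload could look like. It is not the bundled script: the helper names (`buildUploadRequest`, `uploadReport`) are hypothetical, and it assumes the server accepts the raw report JSON as the request body. Check upload-playwright-report.tsx for the actual request format and environment variables.

```typescript
// Hypothetical request shape used by this sketch only.
interface UploadRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the POST request for the /upload/ endpoint.
// Assumption: the report is sent as a raw JSON body.
function buildUploadRequest(baseUrl: string, reportJson: string): UploadRequest {
  // Strip trailing slashes so the path is joined exactly once.
  const url = baseUrl.replace(/\/+$/, "") + "/upload/";
  return {
    url,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: reportJson,
  };
}

// Send the report and fail loudly on a non-2xx response.
async function uploadReport(baseUrl: string, reportJson: string): Promise<void> {
  const req = buildUploadRequest(baseUrl, reportJson);
  const response = await fetch(req.url, {
    method: req.method,
    headers: req.headers,
    body: req.body,
  });
  if (!response.ok) {
    throw new Error(`Upload failed: ${response.status} ${response.statusText}`);
  }
}
```

In CI, you would read playwright-report/report.json after the test step and pass its contents to `uploadReport` along with your dashboard URL.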
We recommend protecting your metrics dashboard using Cloudflare Zero Trust.
- Enable Cloudflare Access for your Worker's domain.
- Create an Access Application for your dashboard URL.
- Define Policies to restrict access (e.g., only allow users from your organization).
The /upload/ endpoint is the exception: your CI/CD pipeline must still be able to reach it. Create an Access Service Token for your CI runner and add a policy that allows requests carrying that token.
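A CI job authenticates with a service token by sending the `CF-Access-Client-Id` and `CF-Access-Client-Secret` request headers. The sketch below shows one way to build them; the function name `accessHeaders` is hypothetical, and where the two values come from (for example, CI secrets exposed as environment variables) is left to your setup.

```typescript
// Build the Cloudflare Access service token headers for a CI request.
// Throws early if either value is missing, so a misconfigured CI job
// fails with a clear message instead of an Access login redirect.
function accessHeaders(
  clientId: string | undefined,
  clientSecret: string | undefined,
): Record<string, string> {
  if (!clientId || !clientSecret) {
    throw new Error("Missing Cloudflare Access service token credentials");
  }
  return {
    "CF-Access-Client-Id": clientId,
    "CF-Access-Client-Secret": clientSecret,
  };
}
```

Merge these headers into the upload request alongside `Content-Type`. Keep both values in your CI provider's secret store rather than in the repository.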
flaky_rate = flaky_runs / total_runs
A flaky_run is a run in which the test failed at least once but passed on retry within the same run.
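As an illustration of the definition, the sketch below classifies runs and computes the rate. It models each run as the ordered list of attempt statuses for one test (using Playwright's result status names). The function names are hypothetical, and treating `timedOut` as a failure is an assumption, not necessarily what this tool does.

```typescript
type AttemptStatus = "passed" | "failed" | "timedOut" | "skipped" | "interrupted";

// A run is flaky when an earlier attempt failed and the final attempt passed.
function isFlakyRun(attempts: AttemptStatus[]): boolean {
  if (attempts.length < 2) return false; // no retry happened
  const last = attempts[attempts.length - 1];
  const failedEarlier = attempts
    .slice(0, -1)
    .some((s) => s === "failed" || s === "timedOut"); // assumption: timeouts count as failures
  return last === "passed" && failedEarlier;
}

// flaky_rate = flaky_runs / total_runs, with 0 for a test that has no runs yet.
function flakyRate(runs: AttemptStatus[][]): number {
  if (runs.length === 0) return 0;
  const flakyRuns = runs.filter(isFlakyRun).length;
  return flakyRuns / runs.length;
}
```

For example, a test with four runs, two of which failed first and then passed on retry, has a flaky_rate of 2 / 4 = 0.5. A run that fails on every attempt is broken rather than flaky and does not count toward flaky_runs.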
Tests are ranked by:
- Highest flaky_rate
- Highest flaky_runs
- Highest total_runs
This surfaces the most unstable tests first.
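One way to read the ranking is as a sort with tie-breakers, sketched below. This assumes the three criteria are applied in order (flaky_rate first, then flaky_runs, then total_runs); the `TestMetrics` shape and `rankByFlakiness` name are hypothetical and do not reflect the tool's actual schema.

```typescript
// Hypothetical per-test aggregate used by this sketch only.
interface TestMetrics {
  name: string;
  flakyRuns: number;
  totalRuns: number;
}

function rateOf(m: TestMetrics): number {
  return m.totalRuns === 0 ? 0 : m.flakyRuns / m.totalRuns;
}

// Sort descending by flaky_rate, then flaky_runs, then total_runs.
// Returns a new array and leaves the input untouched.
function rankByFlakiness(tests: TestMetrics[]): TestMetrics[] {
  return [...tests].sort(
    (a, b) =>
      rateOf(b) - rateOf(a) ||
      b.flakyRuns - a.flakyRuns ||
      b.totalRuns - a.totalRuns,
  );
}
```

Under this reading, a test that flaked in 5 of 10 runs ranks above one that flaked in 1 of 2: the rates are equal, but the first has more observed flaky runs behind it.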