Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scripts: add script to measure percentage of commits with failed status #13175

Merged
merged 2 commits into from
Apr 5, 2022

Conversation

karuppiah7890
Copy link
Contributor

@karuppiah7890 karuppiah7890 commented Jul 2, 2021

This is to start measuring the test flakiness and see the numbers improving once we improve and deflake flaky tests
This PR is a part of the fix for the issue #13167

What this PR does -

It tries to get the last 100 etcd commits and their commit status data from the GitHub GraphQL API and count the number of failures and shows the failure percentage

It uses the following tools

  • bash
  • curl
  • jq

What this PR does not do -

  • This PR does not introduce any GitHub Actions Workflow to run the script automatically weekly once as mentioned in Deflake etcd tests #13167 . Let me know if we need to do it as part of this PR, I could add a GitHub Action Workflow config Yaml for it with cron schedule - This has been added
  • This PR does not analyze PRs of the commits with failure status and identify flaky tests

Secrets

This PR introduces a script which needs a GitHub Token to access the GitHub API and get etcd repo commit data

Documentation

This script does not have any documentation as of now. Let me know if I need to add any details in some docs. Like what it does, it's purpose in some README and any other details

Questions

  • Should we consider commits that don't have any status (multiple commits in a PR)? That is they have null status. Should we count them as part of the last 100 commits? The 100 commits currently have three statuses - success, null, failure. We count failures and then do calculation and say that's the percentage, so it assumes that the ones with null are all successes
  • Should we add documentation for this script? If yes, where should we add it?
  • Should we parameterize the number of commits we want to choose from the history? Currently it's hard coded as 100 last commits
  • Does the script name look okay? Let me know if I need to change it
  • The script creates a JSON file to dump all the GitHub API data about the repo commits and then does processing on it. Is that okay? Also, I have git ignored that file too
  • Does the message Commit status failure percentage is 31 % make sense? Or we want to give some other message talking about test flayness directly?
  • Is the commit message okay?
  • I put Fixes #13167 in the commit message but this is only part of fixing Deflake etcd tests #13167. I think it might close Deflake etcd tests #13167 with the merge of this PR. If it does, we can reopen the issue, yeah?

@karuppiah7890
Copy link
Contributor Author

cc @serathius

@karuppiah7890 karuppiah7890 force-pushed the issue-13167-measure-flakyness branch from f1e8f27 to b447aaf Compare July 3, 2021 05:24
@karuppiah7890
Copy link
Contributor Author

There was a failure in one of the GitHub Actions Workflows - https://github.com/etcd-io/etcd/runs/2977019406 for the shellcheck and I fixed it

% 'shellcheck' '-fgcc' 'build' 'test' 'scripts/build-release.sh' 'scripts/codecov_upload.sh' 'scripts/fix.sh' 'scripts/genproto.sh' 'scripts/install-marker.sh' 'scripts/measure-test-flakyness.sh' 'scripts/release_mod.sh' 'scripts/test_lib.sh' 'scripts/update_dep.sh' 'scripts/updatebom.sh' './build.sh' './test.sh'
FAIL: (code:1):
  % 'shellcheck' '-fgcc' 'build' 'test' 'scripts/build-release.sh' 'scripts/codecov_upload.sh' 'scripts/fix.sh' 'scripts/genproto.sh' 'scripts/install-marker.sh' 'scripts/measure-test-flakyness.sh' 'scripts/release_mod.sh' 'scripts/test_lib.sh' 'scripts/update_dep.sh' 'scripts/updatebom.sh' './build.sh' './test.sh'
FAIL: 'run shellcheck -fgcc build test scripts/build-release.sh scripts/codecov_upload.sh scripts/fix.sh scripts/genproto.sh scripts/install-marker.sh scripts/measure-test-flakyness.sh scripts/release_mod.sh scripts/test_lib.sh scripts/update_dep.sh scripts/updatebom.sh ./build.sh ./test.sh' checking failed (!=0 return code)
scripts/measure-test-flakyness.sh:19:26: note: Useless cat. Consider 'cmd < file | ..' or 'cmd file | ..' instead. [SC2002]
FAIL: 'shellcheck' failed at Sat Jul  3 01:55:04 UTC 2021

Error - scripts/measure-test-flakyness.sh:19:26: note: Useless cat. Consider 'cmd < file | ..' or 'cmd file | ..' instead. [SC2002]

I removed the usage of cat now

@serathius
Copy link
Member

Looks good, but I would reconsider writing to local file. Leaving trash in current directory is a bad practice as it can be easily avoided. Could you try writing a file? Some ideas:

  • Store in environment variable - response is limited to 100 commits so I don't expect that storing it on disk is required
  • Create a tmp file using mktemp

@karuppiah7890
Copy link
Contributor Author

Makes sense 👍 I'll write to a temp file and I'll also remove the git ignore too in that case

@karuppiah7890 karuppiah7890 force-pushed the issue-13167-measure-flakyness branch from b447aaf to 354de1b Compare July 6, 2021 15:28
@karuppiah7890
Copy link
Contributor Author

@serathius I made the changes and wrote to a /tmp directory JSON file

Unrelated question - By the way, is trash the word that's normally used in this context of files that one wouldn't want to keep in the repo? I have noticed other folks use it too, like - trash, garbage. I was thinking words like unnecessary or unwanted in this context

@karuppiah7890 karuppiah7890 force-pushed the issue-13167-measure-flakyness branch from 354de1b to 2d47ff7 Compare July 6, 2021 19:13
@karuppiah7890
Copy link
Contributor Author

Oops. Another shellcheck issue that I didn't check locally 🙈

https://github.com/etcd-io/etcd/pull/13175/checks?check_run_id=3001649934

Fixed it

Copy link
Member

@serathius serathius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

We could consider cleaning up the temp directory on bash exit.

trap '{ rm -f -- "$temp_dir"; }' EXIT

@karuppiah7890
Copy link
Contributor Author

karuppiah7890 commented Jul 8, 2021

Makes sense @serathius ! I was thinking about that too, since it's a temporary thing, but then later felt that it's a small file and is in /tmp and would get deleted automatically at some point? But sure, I'll try to put that in to ensure that the temp file and directory is deleted. I just tried out trap for the first time! Interesting! :D Takes care of both success and failure cases to ensure that the temp file is removed regardless. Didn't know about it before

@karuppiah7890 karuppiah7890 force-pushed the issue-13167-measure-flakyness branch from 2d47ff7 to d09f928 Compare July 8, 2021 17:19
@karuppiah7890
Copy link
Contributor Author

@karuppiah7890 karuppiah7890 force-pushed the issue-13167-measure-flakyness branch 2 times, most recently from d0855fa to 75a27fb Compare July 8, 2021 17:23
@karuppiah7890
Copy link
Contributor Author

@serathius As part of this PR, shall I also create a GitHub Action Workflow config to run the script once a week automatically? Based on a cron schedule. Or should we do that as a separate PR?

Copy link
Contributor

@lilic lilic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks!

Lets add the github job here so we can solve any possible issues with the script.

This is to start measuring the test flakiness and see the numbers improving once we improve and deflake flaky tests

Fixes etcd-io#13167
…ommits with failed status

The workflow runs on a cron schedule on a weekly basis - once every week

Fixes etcd-io#13167
@karuppiah7890
Copy link
Contributor Author

I changed the script file's name to fix the typo - flakyness vs flakiness

The script output currently looks like this -

$ ./scripts/measure-test-flakiness.sh 
Commit status failure percentage is - 30 %

I have also added the GitHub Actions Workflow config now. It uses the built in GITHUB_TOKEN from GitHub Actions Workflow. Usually it's injected based on some conditions - I think admins have to enabled tokens on PRs from forks? Something to look out for in case that's something we haven't discussed or needed before

Also, should we add a make target in the Makefile for running this? And use that in the GitHub Action Workflow config?

Looking forward to the review of the workflow config! :) cc @lilic @serathius

@karuppiah7890
Copy link
Contributor Author

I run the workflow every week once using a cron schedule - https://crontab.guru/#0_0_*_*_0

Copy link
Member

@serathius serathius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@serathius
Copy link
Member

@ptabor @gyuho @hexfusion Can an maintainer take a look?

@lilic
Copy link
Contributor

lilic commented Aug 5, 2021

@karuppiah7890 there seems to be a failing test, it looks like a flake only but do you mind just having a quick look. Thanks!

Otherwise agreed lgtm :)

@karuppiah7890
Copy link
Contributor Author

@lilic makes sense! I'll take a look!

@karuppiah7890
Copy link
Contributor Author

@lilic The errors don't seem related to this PR. I was checking this - https://github.com/etcd-io/etcd/pull/13175/checks?check_run_id=3191066005 and https://semaphoreci.com/etcd-io/etcd/branches/pull-request-13175/builds/8 . I don't have much context, and it was a lot of logs. I just know that https://github.com/etcd-io/etcd/pull/13175/checks?check_run_id=3191066005 is a Golang Test failure for some test(s) and not related to this PR , same for https://semaphoreci.com/etcd-io/etcd/branches/pull-request-13175/builds/8 . But I'll try to dig more into it to contribute to the tests including flaky tests in my upcoming PRs :)

@stale
Copy link

stale bot commented Dec 13, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Dec 13, 2021
@stale stale bot closed this Jan 5, 2022
@karuppiah7890
Copy link
Contributor Author

cc @lilic @serathius , some help here? Or should I reopen the PR?

@serathius serathius reopened this Jan 9, 2022
@stale stale bot removed the stale label Jan 9, 2022
@serathius
Copy link
Member

@ptabor

@serathius serathius merged commit 8d8271f into etcd-io:main Apr 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Deflake etcd tests
4 participants