Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate tests to GitHub Actions #3430

Draft
wants to merge 47 commits into
base: master
Choose a base branch
from

Conversation

mwiencek
Copy link
Member

Problem

  • Our Selenium tests (on Jenkins) are slow. They used to take up to an hour, though since moving to a newer server, they take under 30 minutes.
  • Spreading our test workflows across different services (CircleCI and Jenkins) isn't ideal, but CircleCI doesn't seem to have enough or as much free resources as GitHub Actions to run our Selenium tests there.
  • All of our Jenkins configuration is stored inside Jenkins as opposed to version control. We never looked into converting things to use Jenkinsfiles/pipelines.
  • Jenkins requires additional maintenance (of our repository, containers, upgrades to plugins and Jenkins itself).
  • Having to manually build and push new musicbrainz-tests image versions whenever changes are made to Dockerfile.tests is annoying.
  • Our JavaScript code coverage reports are incomplete: they only include coverage from the Selenium tests.

Solution

  • Splits the Selenium tests into four parallel jobs which can be re-run individually.
  • Moves all of our tests away from CircleCI/Jenkins and onto GitHub Actions.
  • Combines the JS, Perl, pgTAP, and Selenium tests into a single CI workflow stored in git (.github/workflows/ci.yml).
  • Allows us to move all other jobs running on Jenkins to GitHub Actions in the future, so that we can retire Jenkins for good.
  • Builds the tests image as part of the CI workflow (we no longer have to manually build and push tests images).
  • Extracts JS coverage from our t/web.js and Perl tests, and merges it with the coverage from our Selenium tests.

Other changes:

  • Converts Dockerfile.tests to a multi-stage build.

One downside I'll note is that we can no longer SSH into running jobs (like on CircleCI) by default. There might be a way to emulate this feature if we need it.

Testing

Just the new GitHub Actions workflow!

@mwiencek
Copy link
Member Author

This was (finally) passing at https://github.com/metabrainz/musicbrainz-server/actions/runs/12457761240 but there are probably (definitely) some issues with running them as part of a pull request that I'll need to address. 😛

@mwiencek mwiencek marked this pull request as draft December 22, 2024 22:28
HACKING-PROD.md Outdated
`/home/musicbrainz/musicbrainz-server/`, and then you can run any test you want to check like this:

$ sudo -E -H -u musicbrainz carton exec -- prove -lv t/tests.t :: --tests Failing::Test
Finally, you will need to update the `musicbrainz-tests` image version in
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Reminder to myself that this documentation needs to be updated again.)

@mwiencek mwiencek force-pushed the github-actions branch 14 times, most recently from 96379b6 to 1cfc071 Compare February 11, 2025 20:27
@mwiencek mwiencek force-pushed the github-actions branch 3 times, most recently from 8e37f1a to cdb05db Compare February 28, 2025 06:45
mwiencek added 9 commits March 4, 2025 15:32
Recent Selenium builds have been failing with

  WebDriverError: disconnected: not connected to DevTools
    (failed to check if window was closed: disconnected: not connected to DevTools)

even though we haven't changed anything that I'm aware of. Trying to update our
Chrome dependencies here to see if that helps in any way.

Also bump selenium-webdriver to 4.27.0; I've combined it with this commit
because the changelog suggests it's somewhat tied to specific Chrome versions:
https://github.com/SeleniumHQ/selenium/blob/trunk/javascript/node/selenium-webdriver/CHANGES.md
This started failing during CI runs recently, and I have no clue why. (If you
run the tests manually in your browser, it will also fail if you click on the
page before it executes.)

I'm just removing the test because the old autocomplete code is on the way out
anyway.
We'd like to move Selenium tests away from Jenkins (which we moved to in
machine with heavy MBS traffic, which can cause test slowdowns and flakiness.
Maintaining Jenkins also isn't free and requires us to store some CI
configuration outside of git.

GitHub Actions looks like a suitable alternative for us, as its default runners
provide more vCPUs and RAM (4 + 16GB) than CircleCI's (2 + 4GB), while also
providing unlimited build minutes. Since we already use GitHub, that's one less
service for us to rely on, and GHA integrates better with GitHub.

The main disadvantage I could find is that there's no built-in way to SSH into
a running tests container on GHA.
See previous commit, "Migrate from CircleCI to GitHub Actions."
It's now symlinked from the GitHub Actions workspace.
The metabrainz/musicbrainz-tests image will be built and cached prior to the
two test jobs running. It no longer need to be built and pushed by hand in
advance.
mwiencek added 4 commits March 5, 2025 16:42
This comment wasn't entirely correct, as generating cpanfile.snapshot had
nothing to do with Chrome updating. That was a side effect of rebuilding
Dockerfile.tests, not cpanfile.snapshot. And it's no longer an issue since
da2d499 anyway.
Pull requests don't have permission to push to ghcr.io, so we have to cache the
image ourselves between jobs.
@mwiencek mwiencek force-pushed the github-actions branch 5 times, most recently from 8875622 to 7dc48de Compare March 6, 2025 02:09
mwiencek added 6 commits March 5, 2025 20:22
This may or may not help with debugging issues related to elements not being
findable on the page.

The screenshot is uploaded to the build artifacts.
The previous nyc_download step does not fail if no artifacts are found.
It seems that Chrome's `--headless=new` is now just `--headless`, with the old
`--headless` becoming `--headless=old` [1]. But `--headless=new` still works,
too.

However, Firefox doesn't start in headless mode if you use `--headless=new`.
Plain old `--headless` works for both browsers.

[1] https://developer.chrome.com/docs/chromium/headless
mwiencek added 12 commits March 5, 2025 20:50
These can take up quite a bit of space in the (limited) GitHub artifacts
storage, and are only really useful if a particular test fails.
This seems to bypass some "element not clickable" issues in Firefox.
It made sense to use false prior to ce43326
when we had the old `handleAlert` commands verifying that these were shown, but
now we must have the Firefox behavior align with Chrome.
Fixes some issues running the tests in Firefox.
While start_server's output is seen by svlogd fine, the plackup output appears
to be buffered when not attached to a tty.
I'm seeing these kinds of failures in the Selenium tests:

  [browser console log] [SEVERE]
  http://mbtest:5000/ws/js/artist/?q=gr%C3%B6up%20member&page=1&direct=false
  - Failed to load resource: the server responded with a status of 500
  (Internal Server Error)

The plackup logs show the request being dispatched, but the Solr logs don't
show any indication that it was received. I'm curious what LWP receives as the
response.

This logging should also be useful in production for debugging search errors.
While we'll still no longer need to rebuild and push test images by hand, we'll
still have to bump `TEST_IMAGE_TAG`. Which should shave several minutes off
each build using a cached image.
Trigger an input event to ensure the validation code responsible for toggling
the disabled state is run.
The logging added in 647114c did not provide
any further information, so this will allow us to see the response returned to
the browser.
 * Logs all console entries at the time they occur.
 * Works in Firefox.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant