Support for parallelization by test suite #154

Open
jameskraus opened this issue Aug 5, 2024 · 2 comments

Comments

jameskraus commented Aug 5, 2024

Right now it seems like test-splitter is designed to run a lot of tests for a single test suite. It would be nice if it could also parallelize across test suites. This feedback is directed at future support for non-RSpec test suites.

For example, in our JavaScript monorepo pipeline we have many (100+) test suites which we dynamically look up and run. Our pipeline currently looks something like this:

  - label: 'Test Suites Batch %n'
    parallelism: 6
    command:
      - run-unit-tests.js

Where run-unit-tests.js will find a list of all test suites to run and deterministically run a random subset of those suites based on its batch # out of the total batches. In our particular example, running a suite looks like running a command nx run {suite}:test.
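To make the batching idea concrete, here is a minimal sketch of how a deterministic random partition could work: every agent shuffles the same sorted suite list with the same seed, then takes every Nth suite. The function names, the mulberry32 PRNG, and the round-robin dealing are all assumptions for illustration, not the actual contents of run-unit-tests.js.

```typescript
// Hypothetical sketch; not the real run-unit-tests.js implementation.

// Small seeded PRNG (mulberry32) so every agent produces the same shuffle.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Shuffle a sorted copy with the seeded PRNG, then deal suites round-robin.
function partitionSuites(
  suites: string[],
  seed: number,
  partitionIndex: number,
  totalPartitions: number,
): string[] {
  // Sort first so the result does not depend on discovery order.
  const shuffled = [...suites].sort();
  const rand = mulberry32(seed);
  // Fisher-Yates shuffle driven by the seeded PRNG.
  for (let i = shuffled.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    const tmp = shuffled[i];
    shuffled[i] = shuffled[j];
    shuffled[j] = tmp;
  }
  // Agent k takes the suites at positions k, k + N, k + 2N, ...
  return shuffled.filter((_, i) => i % totalPartitions === partitionIndex);
}
```

On an agent, the index and count would come from BUILDKITE_PARALLEL_JOB and BUILDKITE_PARALLEL_JOB_COUNT, which arrive as strings and need converting with Number() first.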

Maybe the main change in this proposed approach is to drop the assumption that test-splitter operates on a per-file basis? Another way to look at it is running groups of tests independently of the underlying individual files.

@niceking
Collaborator

Hi @jameskraus, thanks for raising this issue! I'd like to understand a bit more about your use case. It sounds like you want to select files for splitting across a number of different test suites?

What is the output of your run-unit-tests.js script? Does it specify the paths of all the test suites that have been selected?

I'm assuming all test files within a test suite live under that suite's directory? If so, would it be possible to construct a file glob across multiple directories that selects all tests from all selected test suites? We could then use that as an input to the test splitter.
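As a rough illustration of the glob idea, the selected suite directories could be folded into one brace-expansion pattern. The helper name, the example directories, and the `**/*.test.ts` file pattern are assumptions, and whether the splitter's glob matcher accepts brace expansion would need to be confirmed.

```typescript
// Hypothetical helper: combine suite root directories into a single glob.
// Assumes the consumer of the glob understands {a,b} brace expansion.
function buildSuiteGlob(
  suiteRoots: string[],
  testPattern: string = "**/*.test.ts", // assumed naming convention
): string {
  if (suiteRoots.length === 0) {
    throw new Error("no suites selected");
  }
  // Drop trailing slashes so the joined pattern has exactly one separator.
  const roots = suiteRoots.map((root) => root.replace(/\/+$/, ""));
  // A single root needs no braces; several roots are comma-joined.
  const prefix = roots.length === 1 ? roots[0] : `{${roots.join(",")}}`;
  return `${prefix}/${testPattern}`;
}
```

For example, `buildSuiteGlob(["apps/web/", "libs/ui"])` yields `{apps/web,libs/ui}/**/*.test.ts`.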

@jameskraus
Author

> It sounds like you want to select files for splitting across a number of different test suites?

Yes, sort of! We have a lot of different unit test suites and they all take different amounts of time. It would be very helpful to be able to parallelize over not just individual files, but also suites.
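Since suite durations vary, one way suite-level splitting could balance agents is a greedy longest-first assignment: sort suites by expected duration and always hand the next one to the least-loaded agent. This is only a sketch of the general technique, not how test-splitter works today; the `Suite` shape and the source of the timings are assumptions.

```typescript
// Hypothetical shape: a suite name plus its expected duration in seconds.
interface Suite {
  name: string;
  seconds: number;
}

// Greedy longest-processing-time-first packing of suites onto agents.
function packSuites(suites: Suite[], agents: number): Suite[][] {
  const bins: { total: number; suites: Suite[] }[] = [];
  for (let i = 0; i < agents; i++) {
    bins.push({ total: 0, suites: [] });
  }
  // Longest suites first; ties broken by name so the result is stable.
  const byDuration = [...suites].sort(
    (a, b) => b.seconds - a.seconds || a.name.localeCompare(b.name),
  );
  for (const suite of byDuration) {
    // Pick the agent with the smallest accumulated time so far.
    let lightest = bins[0];
    for (const bin of bins) {
      if (bin.total < lightest.total) lightest = bin;
    }
    lightest.total += suite.seconds;
    lightest.suites.push(suite);
  }
  return bins.map((bin) => bin.suites);
}
```

Unlike a random partition, this needs historical timing data per suite, but it keeps one slow suite from dominating a batch that also received several other slow suites.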

> What is the output of your run-unit-tests.js script? Does it specify the paths of all the test suites that have been selected?

The output of the run-unit-tests.js file is the side effect of running the command nx run {suite}:test for a partition of the test suites. The logic in the script is roughly:

import { execSync } from 'child_process'

// Get all the suites to run for a given PR
const allSuites: Array<string> = await getAllSuitesToRun()

// Figure out which subset of suites to run for the current agent
const suitesBin: Array<string> = deterministicRandomPartition({
  seed: process.env['BUILDKITE_BUILD_NUMBER'],
  partitionIndex: process.env['BUILDKITE_PARALLEL_JOB'],
  totalPartitions: process.env['BUILDKITE_PARALLEL_JOB_COUNT'],
  xs: allSuites
})

// Run the suites via nx, equivalent to a for loop over `nx run {suite}:test`
try {
  execSync(`nx run-many --target=test --projects=${suitesBin.join(',')}`)
} catch (e) {
  // failure handling
}

> I'm assuming all test files within a test suite are subpaths of the test suite? Like would it be possible to construct a file glob across multiple directories that select all tests from all selected test suites?

Hmm, we're basically wrapping the jest CLI, and so we could pass --listTests and get jest to list out all the files. I do wonder if we would need some type of reverse mapping from suite to file, since we likely would want to run each file with a command like nx run {suite}:test {file}.
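A sketch of what that mapping could look like, assuming each suite has a known root directory and that the listed file paths are expressed relative to the same base as those roots. The function name and the `suiteRoots` input are hypothetical; in practice the roots would have to come from the workspace configuration.

```typescript
// Hypothetical: group test files under the suite whose root directory contains them.
// suiteRoots maps suite name -> root directory, e.g. { ui: "libs/ui" }.
function groupFilesBySuite(
  files: string[],
  suiteRoots: Record<string, string>,
): Map<string, string[]> {
  // Check longer roots first so a nested suite wins over its parent directory.
  const names = Object.keys(suiteRoots).sort(
    (a, b) => suiteRoots[b].length - suiteRoots[a].length,
  );
  const grouped = new Map<string, string[]>();
  for (const file of files) {
    const owner = names.find((name) => {
      const root = suiteRoots[name];
      // Require a directory boundary so "libs/ui" does not match "libs/ui-kit".
      return file.startsWith(root.endsWith("/") ? root : root + "/");
    });
    if (owner === undefined) continue; // file is outside every known suite
    const list = grouped.get(owner) || [];
    list.push(file);
    grouped.set(owner, list);
  }
  return grouped;
}
```

Each entry of the resulting map pairs a suite with the files that belong to it, which is the information a per-file command of the form nx run {suite}:test {file} would need.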

Given that jest has startup/shutdown costs, it may be more efficient to run an entire suite at once, e.g. nx run {suite}:test and use that as the unit of parallelization.


2 participants