Performance of SRI Testing harness with increased loading of many KPs/ARAs with large test data sets (or number of KPs) may require optimization #68

Open
5 of 15 tasks
RichardBruskiewich opened this issue Oct 27, 2022 · 0 comments
RichardBruskiewich commented Oct 27, 2022

This issue captures a broad-spectrum performance concern: as more KPs and ARAs are added, each with substantial test parameters (# of edges or # of KPs), test runs are slowing significantly. Some thoughts on this matter:

  • Break down the currently monolithic test runs initiated by Registry entries into separate KP and ARA test runs, then avoid repeating test runs on KPs and ARAs that have already been run, unless a rerun is "required".
    • Need to define the rules for deciding whether a test run for a given KP is 'required': is a rerun explicitly triggered by a user/owner of the resource, driven by a policy of re-running after a certain date, or based on some other system metric?
    • Test runs need to be constrained to run only specific KPs and/or ARAs (Issue Allow test_runs with specific KP(s) or ARA(s) specified #66 - resolved!).
    • Need a /registry web API endpoint to return the list of 'testable' results (DONE in commit cbf76fc)
    • The web dashboard will move away from a 'test run-centric' organization to a 'resource-centric with timestamps' one, with the latest validation run of a given KP (or ARA) as the primary focus.
      • Some of the existing web API endpoints may need to be modified, or (a) new endpoint(s) added, to facilitate access to the 'latest' validation run for a given resource... maybe 'latest' is the default when the test_run_id is omitted from the API call? (see the first sketch below)
        • Add an endpoint (or modify an existing endpoint?) that gives the list of test_run_ids ('timestamps') for a given KP or ARA resource - filters were added to the /test_runs endpoint (commit bc80bb4).
      • Access to 'latest' might be achievable using some kind of TestDatabase indexing of KPs and ARAs against their test runs (timestamps), perhaps in a separate document cross-cutting the various test run document sets (implemented as part of /test_runs filtering, in commit bc80bb4).
  • Even with a strategy in place to manage and retrieve the 'latest' test run for a given KP (or ARA), there may be a need to streamline the execution of a larger number of such test runs (validating only a subset of KPs and ARAs, run either sequentially or in an embarrassingly parallel manner).
    • Investigate the use of Python multiprocessing Pools to provide parallel execution of validation on sufficiently endowed hardware (i.e. multi-CPU machines), or perhaps methods to delegate processing across a cluster of dynamically configured machines (e.g. a network of interacting Docker containers, perhaps running on other machines under Kubernetes) (see the second sketch below).
    • To support the above "embarrassingly parallel" validation work, perhaps implement some kind of 'job queue' under the hood (either within a single server or across multiple, dynamically provisioned servers, e.g. a Kubernetes cluster?).
  • Review and repair the progress monitoring (i.e. the current /status endpoint implementation) for more informative and more granular operation, perhaps using a (better) messaging queue.
    • See if the progress percentage can be made credibly more granular (usefully done in commit 9475cc9).
    • It is probably also the case that, at some point in the PyTest conftest setup, we would actually know how many 'collected' unit tests are going to be run (that number is visible when running PyTest directly inside an IDE...). It may also help to make that number visible to the /status API endpoint (see the third sketch below).
    • Expect that the /status implementation may be affected by whatever mechanism is put in place for multiprocessing.
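
First, a minimal sketch of what the 'latest run' default could look like, assuming test_run_ids are timestamp strings that sort chronologically and that the TestDatabase maintains some resource-to-runs index. The `_test_run_index` dictionary and both function names below are hypothetical stand-ins for illustration, not the harness's actual API:

```python
from typing import Dict, List, Optional

# Hypothetical in-memory stand-in for a TestDatabase index of resources
# (infores identifiers of KPs/ARAs) against their test_run_ids, which are
# assumed to be timestamp strings and therefore to sort chronologically.
_test_run_index: Dict[str, List[str]] = {
    "infores:example-kp": ["2022-10-27_14-01-33", "2022-10-30_09-12-05"],
}

def get_test_runs(resource_id: str) -> List[str]:
    """Return all known test_run_ids for a given KP or ARA, oldest first."""
    return sorted(_test_run_index.get(resource_id, []))

def resolve_test_run(resource_id: str, test_run_id: Optional[str] = None) -> Optional[str]:
    """Resolve a test_run_id for a resource, defaulting to the 'latest'
    run when the caller omits test_run_id (the proposed API behaviour)."""
    runs = get_test_runs(resource_id)
    if test_run_id is not None:
        return test_run_id if test_run_id in runs else None
    return runs[-1] if runs else None
```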
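Second, on multiprocessing Pools: since per-resource validations are independent of one another, the stock `multiprocessing.Pool` from the Python standard library could fan them out over multiple CPUs on a single machine. `validate_resource` here is a hypothetical placeholder for whatever the harness actually runs against one KP or ARA:

```python
from multiprocessing import Pool
from typing import Dict, List, Tuple

def validate_resource(resource_id: str) -> Tuple[str, Dict]:
    """Hypothetical per-resource validation: a stand-in for launching the
    harness's test suite against one KP or ARA. Each call is independent
    of the others, which is what makes the work embarrassingly parallel."""
    # ... invoke the actual validation / PyTest run here ...
    return resource_id, {"status": "completed"}

def validate_in_parallel(resource_ids: List[str], processes: int = 4) -> Dict[str, Dict]:
    """Fan the per-resource validations out over a pool of worker
    processes, one task per KP/ARA."""
    with Pool(processes=processes) as pool:
        return dict(pool.map(validate_resource, resource_ids))

if __name__ == "__main__":
    print(validate_in_parallel(["infores:kp-one", "infores:kp-two", "infores:ara-one"]))
```

Delegating across a cluster (Docker/Kubernetes) would need a distributed job queue instead, but the same per-resource task boundary should carry over.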
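Third, on the 'collected tests' count: PyTest's standard `pytest_collection_finish(session)` hook fires once collection is complete, at which point `len(session.items)` is exactly the number an IDE test runner displays. A conftest.py sketch, where `report_total_test_count` is a hypothetical bridge into the /status progress state:

```python
# conftest.py (sketch)

def report_total_test_count(count: int) -> None:
    """Hypothetical bridge into the /status progress state: in the real
    harness this might write to the test run's database record so the
    endpoint can report 'N of <count> tests completed'."""
    print(f"collected {count} unit tests")

def pytest_collection_finish(session):
    """Standard PyTest hook, called once test collection is complete.
    At this point session.items holds every collected test, so its
    length is the denominator for a credible progress percentage."""
    report_total_test_count(len(session.items))
```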
@RichardBruskiewich RichardBruskiewich self-assigned this Oct 30, 2022