test: Implement end-to-end testing with Playwright #420
Conversation
This is an in-progress cleaned version of the final PR (numberscope#361) of the Delft student user interface project, which it supersedes.

Comments from the original PR that remain relevant:

* The end-to-end tests run using Firefox and Chromium.
* New tests are in the e2e folder.
* The tests often depend on specific classes and IDs, so they may need to be updated upon changes to Numberscope.
* The tests can be executed as the npm script `npm run test:e2e`.
* An interactive testing UI and debugger can be executed as the npm script `npm run test:e2e:ui`.

Caveats concerning trying this cleaned PR and its status:

* Make certain to run `npm install` after pulling this PR.
* Many of the tests do not yet pass, perhaps because of the "specific classes and IDs" point mentioned above and the fact that ui2 has diverged significantly from the Delft PR series.
* Tests are not yet run automatically prior to commit.
* I do not think there are any image tests yet; we need to try to add them.
* Tests are not yet performed in the continuous-integration checks run on GitHub; they should be.
Wow, the testing is actually doing what it is supposed to! It uncovered #419 as part of its testing. That is to say, the first test I looked at was simply stale with respect to our diverged code. But the second test was operating exactly as designed: it detected that the caption on the default specimen (modfill with random) is not correct if it is saved and then the gallery is viewed. So I will have to fix #419 in this PR to get it to a point at which it can be merged.

This experience makes me very positive on this PR, and makes me feel that after this PR, every PR should be required to update testing: if it is a bugfix PR, it should include a test that fails with the bug and passes with the fix. And if it is a feature PR, it should include either unit or end-to-end tests of the feature, preferably both. Anyhow, I will continue by investigating #419 as soon as I have a chance.
OK, I have resolved #419 in this branch, and now all of the end-to-end tests that the Delft team implemented are passing! As you can see from the above lists, there is still a lot to do before this PR is ready to merge, but @katestange @Vectornaut I would be happy for some initial review: code review, does pulling and […]
Currently, […] and […] and the scope appears to work.
Wow, looks like you actually ran the tests! Great. For all of those warnings at the beginning, note that version 16 of […]
The strong desire for image tests led to a cascade of changes in this commit, mostly driven by the need to have reproducible images:

- Removes all use of `sketch.noLoop()` and `sketch.loop()` in favor of the previously existing `stop()` and `continue()` visualizer methods, to allow:
- Adds a `frames=NNN` query parameter to URLs to set the maximum number of frames a visualization may draw.
- Switches from the "static" instance of mathjs to a "dynamic" one, to allow its random-number configuration to be controlled. In conjunction with this, moves all math functions into a single math module, as extensions of mathjs.
- Removes all use of `Math.random()` in favor of the mathjs random generator.
- Adds a `randomSeed=AAAA` query parameter to URLs to make the mathjs random generator reproducible.
- Documents all of the above changes.
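The reproducibility idea behind `randomSeed` can be illustrated with a tiny seeded generator. This is a stand-in sketch only: `mulberry32` is an illustrative helper, not part of the codebase, which uses the seeded mathjs generator instead.

```typescript
// Illustrative seeded PRNG (mulberry32), standing in for the seeded
// mathjs generator: the same seed always yields the same sequence,
// which is what makes snapshot images comparable across test runs.
function mulberry32(seed: number): () => number {
    let state = seed >>> 0
    return () => {
        state = (state + 0x6d2b79f5) >>> 0
        let t = state
        t = Math.imul(t ^ (t >>> 15), t | 1)
        t ^= t + Math.imul(t ^ (t >>> 7), t | 61)
        return ((t ^ (t >>> 14)) >>> 0) / 4294967296
    }
}

// Two "test runs" with the same seed (playing the role of
// randomSeed=AAAA) draw identical values:
const runA = mulberry32(0xaaaa)
const runB = mulberry32(0xaaaa)
console.log(runA() === runB()) // true
```

With `Math.random()` there is no way to reseed, which is exactly why the commit above removes it in favor of the configurable mathjs generator.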
OK, I have pushed the code that facilitates image tests and institutes them for all featured gallery specimens. Presuming it is reproducible on other machines, this approach gives us the benefit that any time another featured specimen is added, another test is automatically added. (When in the future we remove featured specimens, we may want to move them to a dedicated test-specimen list, if they are still of interest for testing purposes.)

@katestange @Vectornaut it would be awesome to see if the newly added tests pass on either of your machines. Note they will definitely fail on macOS or Windows, as Playwright image tests are enforced to be OS-specific. When we need to enable development on other platforms, I'll have to create a Windows PR (I have a dual-boot machine kicking around somewhere), and someone with access to a Mac will have to do likewise. Fingers crossed, and looking forward to hearing your results on these new tests.
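For context on the OS-specificity: Playwright keys golden screenshots by platform, an idea that can be sketched as follows. This is an illustrative reimplementation of the naming scheme, not Playwright's actual code.

```typescript
// Illustrative sketch of why Playwright image snapshots are
// OS-specific: golden files are keyed by test, browser, and platform,
// so a snapshot recorded on Linux cannot match one recorded on
// macOS (darwin) or Windows (win32).
function snapshotName(
    testName: string,
    browser: string,
    platform: string
): string {
    return `${testName}-${browser}-${platform}.png`
}

console.log(snapshotName('modfill-random', 'chromium', 'linux'))
// → modfill-random-chromium-linux.png
```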
Also adds new tests for `src/shared/defineFeatured.ts` and corrects the documentation extraction facility for the package manager scripts. Resolves numberscope#25. Resolves numberscope#73. Resolves numberscope#246.
In other words, it should never call calculate twice for the same index. This is tested by 10K random accesses to indices less than 1M, followed by accessing the first 10K entries, followed by accessing the last 10K entries. Hopefully that should suffice. Resolves numberscope#54.
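The invariant being tested can be sketched as follows; the class and method names here are hypothetical, and the real sequence implementation differs.

```typescript
// Hypothetical memoizing wrapper illustrating the invariant under
// test: the underlying calculate() runs at most once per index.
class CachedSequence {
    private cache = new Map<number, bigint>()
    calculateCalls = 0
    constructor(private calculate: (n: number) => bigint) {}
    getElement(n: number): bigint {
        let value = this.cache.get(n)
        if (value === undefined) {
            this.calculateCalls += 1
            value = this.calculate(n)
            this.cache.set(n, value)
        }
        return value
    }
}

const seq = new CachedSequence(n => BigInt(n) * BigInt(n))
for (const i of [5, 7, 5, 7, 5]) seq.getElement(i)
console.log(seq.calculateCalls) // 2: only two distinct indices were computed
```

The test strategy described above (10K random accesses, then the first 10K, then the last 10K) amounts to checking that `calculateCalls` never exceeds the number of distinct indices accessed.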
This is an initial pass at addressing numberscope#113. Note, however, that ModFill is not reporting to the person doing the visualization that it is running with different parameter values than shown. So that still must be done, but for that part we will need a resolution to numberscope#112, which will be a sufficiently involved change that we should leave it to a separate PR from this numberscope#420.
Resolves numberscope#174.
Just to make sure I understand correctly: this PR is going to mean that every time I try to commit some new code, even just along the way in something I'm working on that isn't polished yet, we're running alllll these tests, right, and I can't commit if anything fails?
Some more thoughts.
Indeed, this PR adopts a model that a commit is a checkpoint at which the code works. That's not an uncommon code-hygiene practice. (But it is by no means universal.) Your working version can be as broken as you like, but it should work at each commit. We could switch to only enforcing tests on PRs, but in my experience that leads to people making PRs that don't actually pass the tests, since they end up not always bothering to run all the tests before pushing. Since the typical PR is just one or two commits, tests on every commit seems reasonable to me. I suppose we could look into tests as pre-push rather than pre-commit actions, if you feel testing on every commit is too onerous.
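For reference, one common way to wire tests into commits is a Git pre-commit hook along these lines. This is a sketch only; the mechanism this PR actually uses may differ, and the `.husky/pre-commit` path is just one popular convention.

```shell
#!/bin/sh
# Sketch of a pre-commit hook (e.g. .husky/pre-commit): abort the
# commit when the end-to-end tests fail. A pre-push variant would
# live in .husky/pre-push instead.
npm run test:e2e || exit 1
```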
Nope, I'm happy, I just wanted to make sure I understand.
OK, I think I've addressed your latest batch of comments (thanks for them!). Looking forward to your further review when you get a chance.
In histogram, because you can pan and zoom, the mouseover isn't working right anymore (it doesn't adjust properly to the panned and zoomed setup).
Question: how do the tests decide how many workers to use?
Ok, at the moment I can't think of too much more to test or ask. I've tried out all the things I can think of trying. Tests are passing. So just the mouseover in histogram.
I literally have no idea. Is this something you'd like me to investigate?
Yeah, I knew that. The original code has no contemplation that the display could have changed at all. I was sort of hoping that could be left for the future, but that was fatigue on my part. (Frankly I am not quite sure how the mouseover code can tell which box the mouse is over if there's been a combination of panning and zooming, but I am sure it is possible somehow...) I will try to fix it tonight. If not, it will likely be Friday at the soonest, because I am heading to Seattle tomorrow for a SUMM meeting.
No, was just curious.
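For the record, Playwright's documented default is to use half of the machine's logical CPU cores as parallel workers, and the count can be pinned explicitly. A sketch, assuming a standard `playwright.config.ts` (the specific numbers here are illustrative, not this repository's actual settings):

```typescript
// playwright.config.ts (sketch): pin the worker count instead of
// relying on Playwright's default of half the logical CPU cores.
import {defineConfig} from '@playwright/test'

export default defineConfig({
    // A single worker is a common choice on CI for reproducibility.
    workers: process.env.CI ? 1 : 4,
})
```

The same setting is available on the command line as `npx playwright test --workers=N`.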
Well, it could be added to the Histogram overhaul issue if you'd prefer, that would be fine with me. You could also turn off mouse controls for now, and add turning them back on, with mouseover working again, to the Histogram overhaul. I don't have anything else for this PR to ask for at the moment, so maybe we should just go for merging? It is an unusually large PR. :)
No, it's OK. Should be done now. There were other related aspects to fix, such as not having the text get unreadably tiny when zoomed out, and keeping dragging proportional when zoomed in or out. Take a look and let me know what you think; it seems from your comments that this is the last thing on your list for merging.

Sorry I was kvetching; it was just me being lazy and not wanting to figure out the coordinate transform from mouse coordinates to plot coordinates depending on the pan and zoom. But fortunately the situation is simple enough (no rotations, for example) and I've done this sort of thing enough that I could just do it by trial and error: get it working for arbitrary pan, no zoom; then get it working for no pan, arbitrary zoom; and then finally combine those to get the general case...
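The transform in question can be sketched as follows, assuming the simple scale-then-translate view model with no rotation described above; the names here are hypothetical, not the actual Histogram code.

```typescript
// Assumed view model (hypothetical names): screen = plot * zoom + offset,
// with uniform zoom and no rotation.
interface View {
    zoom: number // uniform scale factor
    offsetX: number // pan, in screen pixels
    offsetY: number
}

// Invert the view transform to find the plot point under the mouse,
// which is what lets the mouseover identify the box being hovered.
function mouseToPlot(mouseX: number, mouseY: number, view: View) {
    return {
        x: (mouseX - view.offsetX) / view.zoom,
        y: (mouseY - view.offsetY) / view.zoom,
    }
}

// With zoom 2 and a pan of (10, 0), screen point (30, 8) lies over
// plot point (10, 4):
const p = mouseToPlot(30, 8, {zoom: 2, offsetX: 10, offsetY: 0})
console.log(p.x, p.y) // 10 4
```

The trial-and-error sequence described above (pan only, then zoom only, then both) corresponds to checking this formula with `zoom = 1`, then with `offsetX = offsetY = 0`, then in general.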
I just put it in a fixed position on screen, regardless of zoom or pan. How's that?
As far as I can tell, that only occurs on the last frame, or maybe the last two frames, before zooming all the way through the graph so you can't see anything (that is literally what happens). In other words, the formulae are working just right except in the case of an extremely close-up view. Mathematically, there should be no difference for those last two frames, as far as I can see. I am supposing that due to some kind of quantization in possible text sizes, or some kind of integer rounding phenomenon, at this extreme close-up range the text height is actually being reported inaccurately by p5.

So I couldn't figure out how to get that last frame or two to look perfect, and I let well enough alone, as an edge case that's not of too much practical import. It's certainly readable all the way up to the last frame but one, looking at the graph at that very last zoom level is unlikely to be useful, and anyway you would already know what that caption says by the time you get there. You are welcome to: […]

Sorry if that was a cranky response ;-)
Haha, you're entitled to a cranky response! I think we should merge this and celebrate! |
An additional note on `npm run test:e2e:ui`: these interactive tests run on your local machine natively, rather than in Docker, and so some image-comparison tests will fail that will succeed when run non-interactively.

In addition, this PR corrects a number of accumulated issues and issues that were surfaced as a result of implementing the testing. Namely:

* Resolves #25.
* Resolves #54.
* Resolves #73.
* Resolves #174.
* Resolves #217.
* Resolves #219.
* Resolves #225.
* Resolves #226.
* Resolves #246.
* Resolves #292.
* Resolves #294.
* Resolves #311.
* Resolves #419.
* Resolves #458.

Other changes made along the way:

* Replaced the `noLoop()` and `loop()` methods with the visualizer `stop()` and `continue()` methods, and documented the need to do this (necessary to ensure that the play/pause control is updated properly, as the testing watches this control to know when the visualization is done and so it is safe to take a snapshot).
* Used the `stop()` method to control the number of frames that a visualizer will run (again, needed to make the images produced by a visualizer deterministic).
* Used `make` for building the app and testing it, to avoid expensive steps like generating a Docker image and so on from being re-run when they don't need to be.

By submitting this PR, I am indicating to the Numberscope maintainers that I have read and understood the contributing guidelines and that this PR follows those guidelines to the best of my knowledge. I have also read the pull request checklist and followed the instructions therein.
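The way `make` avoids re-running expensive steps is timestamp-based dependency tracking, which can be sketched with a stamp-file rule like the following. Target and file names here are hypothetical, not the repository's actual Makefile.

```makefile
# Rebuild the Docker image only when its inputs change; repeated test
# runs then skip the expensive docker build step because the stamp
# file is newer than Dockerfile and package.json.
.docker-stamp: Dockerfile package.json
	docker build -t numberscope-test .
	touch .docker-stamp

e2e: .docker-stamp
	npm run test:e2e

.PHONY: e2e
```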