Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Snapshot testing #90

Closed
2 tasks done
flange-ipb opened this issue Oct 25, 2022 · 8 comments · Fixed by #91
Closed
2 tasks done

Snapshot testing #90

flange-ipb opened this issue Oct 25, 2022 · 8 comments · Fixed by #91

Comments

@flange-ipb
Copy link
Collaborator

flange-ipb commented Oct 25, 2022

Motivation:
At the moment we manually curate TUCAN strings only for a small number of the test molfiles with test_serialization.py's test_regression. This should be extended to the whole set of test molfiles - with automation. It will help us to detect changes in the outcome of the algorithm.

About snapshot testing:
See Jest's snapshot testing manual (for React components).

The general principle should be:

  • First test run: run the serialization with all molfiles, save all TUCAN strings in a snapshot file; the snapshot file is committed to the git repository.
  • All further test runs: compare results with the existing snapshot file and fail on changes.

Tasks:

  • find a suitable snapshot library for Python
  • implement snapshot test
@flange-ipb
Copy link
Collaborator Author

SnapshotTest

Works with pytest and offers a test fixture called snapshot. test_regression becomes

def test_regression(m, snapshot):
    m_serialized = serialize_molecule(canonicalize_molecule(m))
    snapshot.assert_match(m_serialized)

The snapshot file is tests/snapshots/snap_test_serialization.py and looks like this:

...
snapshots = Snapshot()  # a dict

snapshots['test_regression[10tertButyl10isopropyltridecanoic acid] 1'] = 'C20H40O2/(1-41)(2-41)(3-41)(4-42)(5-42)(6-42)(7-43)(8-43)(9-43)(10-44)(11-44)(12-44)(13-45)(14-45)(15-45)(16-46)(17-46)(18-46)(19-47)(20-47)(21-48)(22-48)(23-49)(24-49)(25-50)(26-50)(27-51)(28-51)(29-52)(30-52)(31-53)(32-53)(33-54)(34-54)(35-55)(36-55)(37-56)(38-56)(39-57)(40-62)(41-47)(42-57)(43-57)(44-58)(45-58)(46-58)(47-54)(48-49)(48-50)(49-51)(50-52)(51-53)(52-55)(53-56)(54-59)(55-59)(56-60)(57-59)(58-59)(60-61)(60-62)'

snapshots['test_regression[14Ethyl14methylheptadecanoic acid] 1'] = 'C20H40O2/(1-41)(2-41)(3-41)(4-42)(5-42)(6-42)(7-43)(8-43)(9-43)(10-44)(11-44)(12-45)(13-45)(14-46)(15-46)(16-47)(17-47)(18-48)(19-48)(20-49)(21-49)(22-50)(23-50)(24-51)(25-51)(26-52)(27-52)(28-53)(29-53)(30-54)(31-54)(32-55)(33-55)(34-56)(35-56)(36-57)(37-57)(38-58)(39-58)(40-62)(41-44)(42-55)(43-59)(44-56)(45-46)(45-47)(46-48)(47-49)(48-50)(49-51)(50-52)(51-53)(52-54)(53-57)(54-58)(55-59)(56-59)(57-59)(58-60)(60-61)(60-62)'
...
  • When test failures happen (i.e. calculated TUCAN does not match the snapshot TUCAN), the snapshot can be updated by running pytest with the --snapshot-update parameter.
  • Additions (= add a new molfile and run the test) will add a new snapshot entry without test failure.
  • Molfile deletions will not change the snapshot entries. This can be achived via the --snapshot-update parameter.

@flange-ipb
Copy link
Collaborator Author

pytest-snapshot

Similar principles like in SnapshotTest. It offers more customization, but we probably won't need that.

test_regression becomes

def test_regression(m, snapshot):
    m_serialized = serialize_molecule(canonicalize_molecule(m))
    snapshot.assert_match(m_serialized, "tucan.txt")

There is one snapshot file for each molfile, e.g.
tests/snapshots/test_something/test_serialization/1_2_dimethylnaphthalene/tucan.txt,
tests/snapshots/test_something/test_serialization/1_3_dimethylnaphthalene/tucan.txt
...

It's a mess!

@flange-ipb
Copy link
Collaborator Author

flange-ipb commented Oct 25, 2022

syrupy

test_regression becomes

def test_regression(m, snapshot):
    m_serialized = serialize_molecule(canonicalize_molecule(m))
    assert m_serialized == snapshot

Snapshot file tests/__snapshots__/test_serialization.ambr:

# name: test_regression[10tertButyl10isopropyltridecanoic acid]
  'C20H40O2/(1-41)(2-41)(3-41)(4-42)(5-42)(6-42)(7-43)(8-43)(9-43)(10-44)(11-44)(12-44)(13-45)(14-45)(15-45)(16-46)(17-46)(18-46)(19-47)(20-47)(21-48)(22-48)(23-49)(24-49)(25-50)(26-50)(27-51)(28-51)(29-52)(30-52)(31-53)(32-53)(33-54)(34-54)(35-55)(36-55)(37-56)(38-56)(39-57)(40-62)(41-47)(42-57)(43-57)(44-58)(45-58)(46-58)(47-54)(48-49)(48-50)(49-51)(50-52)(51-53)(52-55)(53-56)(54-59)(55-59)(56-60)(57-59)(58-59)(60-61)(60-62)'
# ---
# name: test_regression[14Ethyl14methylheptadecanoic acid]
  'C20H40O2/(1-41)(2-41)(3-41)(4-42)(5-42)(6-42)(7-43)(8-43)(9-43)(10-44)(11-44)(12-45)(13-45)(14-46)(15-46)(16-47)(17-47)(18-48)(19-48)(20-49)(21-49)(22-50)(23-50)(24-51)(25-51)(26-52)(27-52)(28-53)(29-53)(30-54)(31-54)(32-55)(33-55)(34-56)(35-56)(36-57)(37-57)(38-58)(39-58)(40-62)(41-44)(42-55)(43-59)(44-56)(45-46)(45-47)(46-48)(47-49)(48-50)(49-51)(50-52)(51-53)(52-54)(53-57)(54-58)(55-59)(56-59)(57-59)(58-60)(60-61)(60-62)'
# ---
...

More or less same mechanics like in SnapshotTest, except molfile additions always require pytest --snapshot-update.

@schatzsc
Copy link
Collaborator

That would indeed be super-interesting but is again "way beyond my paygrade" - maybe Jan can comment?

The only thing I would find important is that auto-generated strings should be checked by human for some critical cases.

@JanCBrammer
Copy link
Collaborator

I really like the idea of running regression tests on our entire testset. Snapshot testing looks like a great solution for that!
Regarding the choice of library for snapshot testing, I don't have a strong opinion. https://github.com/syrusakbary/snapshottest looks like a solid option. I'd favor it because of the output file. It's less well maintained than https://github.com/tophat/syrupy though.

@flange-ipb
Copy link
Collaborator Author

Indeed, I'd also favour SnapshotTest for its simplicity and for its pythonic snapshot file.

@JanCBrammer
Copy link
Collaborator

@flange-ipb
Copy link
Collaborator Author

@JanCBrammer and I discussed the impact on the CI pipeline:
Let's assume a new test molfile was added and test_regression was not executed to update the snapshot file. With SnapshotTest it is difficult to detect this situation in the CI and fail. This is essentially what snapshottest#73 requests.

For this reason we changed to syrupy. The CI run will fail in the situation described above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants