Snapshot testing #90

flange-ipb · 2022-10-25T09:09:36Z

Motivation:
At the moment we manually curate TUCAN strings only for a small number of the test molfiles with test_serialization.py's test_regression. This should be extended to the whole set of test molfiles - with automation. It will help us to detect changes in the outcome of the algorithm.

About snapshot testing:
See Jest's snapshot testing manual (for React components).

The general principle should be:

First test run: run the serialization with all molfiles, save all TUCAN strings in a snapshot file; the snapshot file is committed to the git repository.
All further test runs: compare results with the existing snapshot file and fail on changes.

Tasks:

find a suitable snapshot library for Python
implement snapshot test

The text was updated successfully, but these errors were encountered:

flange-ipb · 2022-10-25T11:13:24Z

SnapshotTest

Works with pytest and offers a test fixture called snapshot. test_regression becomes

def test_regression(m, snapshot):
    m_serialized = serialize_molecule(canonicalize_molecule(m))
    snapshot.assert_match(m_serialized)

The snapshot file is tests/snapshots/snap_test_serialization.py and looks like this:

...
snapshots = Snapshot()  # a dict

snapshots['test_regression[10tertButyl10isopropyltridecanoic acid] 1'] = 'C20H40O2/(1-41)(2-41)(3-41)(4-42)(5-42)(6-42)(7-43)(8-43)(9-43)(10-44)(11-44)(12-44)(13-45)(14-45)(15-45)(16-46)(17-46)(18-46)(19-47)(20-47)(21-48)(22-48)(23-49)(24-49)(25-50)(26-50)(27-51)(28-51)(29-52)(30-52)(31-53)(32-53)(33-54)(34-54)(35-55)(36-55)(37-56)(38-56)(39-57)(40-62)(41-47)(42-57)(43-57)(44-58)(45-58)(46-58)(47-54)(48-49)(48-50)(49-51)(50-52)(51-53)(52-55)(53-56)(54-59)(55-59)(56-60)(57-59)(58-59)(60-61)(60-62)'

snapshots['test_regression[14Ethyl14methylheptadecanoic acid] 1'] = 'C20H40O2/(1-41)(2-41)(3-41)(4-42)(5-42)(6-42)(7-43)(8-43)(9-43)(10-44)(11-44)(12-45)(13-45)(14-46)(15-46)(16-47)(17-47)(18-48)(19-48)(20-49)(21-49)(22-50)(23-50)(24-51)(25-51)(26-52)(27-52)(28-53)(29-53)(30-54)(31-54)(32-55)(33-55)(34-56)(35-56)(36-57)(37-57)(38-58)(39-58)(40-62)(41-44)(42-55)(43-59)(44-56)(45-46)(45-47)(46-48)(47-49)(48-50)(49-51)(50-52)(51-53)(52-54)(53-57)(54-58)(55-59)(56-59)(57-59)(58-60)(60-61)(60-62)'
...

When test failures happen (i.e. calculated TUCAN does not match the snapshot TUCAN), the snapshot can be updated by running pytest with the --snapshot-update parameter.
Additions (= add a new molfile and run the test) will add a new snapshot entry without test failure.
Molfile deletions will not change the snapshot entries. This can be achived via the --snapshot-update parameter.

flange-ipb · 2022-10-25T11:42:59Z

pytest-snapshot

Similar principles like in SnapshotTest. It offers more customization, but we probably won't need that.

test_regression becomes

def test_regression(m, snapshot):
    m_serialized = serialize_molecule(canonicalize_molecule(m))
    snapshot.assert_match(m_serialized, "tucan.txt")

There is one snapshot file for each molfile, e.g.
tests/snapshots/test_something/test_serialization/1_2_dimethylnaphthalene/tucan.txt,
tests/snapshots/test_something/test_serialization/1_3_dimethylnaphthalene/tucan.txt
...

It's a mess!

flange-ipb · 2022-10-25T12:07:59Z

syrupy

test_regression becomes

def test_regression(m, snapshot):
    m_serialized = serialize_molecule(canonicalize_molecule(m))
    assert m_serialized == snapshot

Snapshot file tests/__snapshots__/test_serialization.ambr:

# name: test_regression[10tertButyl10isopropyltridecanoic acid]
  'C20H40O2/(1-41)(2-41)(3-41)(4-42)(5-42)(6-42)(7-43)(8-43)(9-43)(10-44)(11-44)(12-44)(13-45)(14-45)(15-45)(16-46)(17-46)(18-46)(19-47)(20-47)(21-48)(22-48)(23-49)(24-49)(25-50)(26-50)(27-51)(28-51)(29-52)(30-52)(31-53)(32-53)(33-54)(34-54)(35-55)(36-55)(37-56)(38-56)(39-57)(40-62)(41-47)(42-57)(43-57)(44-58)(45-58)(46-58)(47-54)(48-49)(48-50)(49-51)(50-52)(51-53)(52-55)(53-56)(54-59)(55-59)(56-60)(57-59)(58-59)(60-61)(60-62)'
# ---
# name: test_regression[14Ethyl14methylheptadecanoic acid]
  'C20H40O2/(1-41)(2-41)(3-41)(4-42)(5-42)(6-42)(7-43)(8-43)(9-43)(10-44)(11-44)(12-45)(13-45)(14-46)(15-46)(16-47)(17-47)(18-48)(19-48)(20-49)(21-49)(22-50)(23-50)(24-51)(25-51)(26-52)(27-52)(28-53)(29-53)(30-54)(31-54)(32-55)(33-55)(34-56)(35-56)(36-57)(37-57)(38-58)(39-58)(40-62)(41-44)(42-55)(43-59)(44-56)(45-46)(45-47)(46-48)(47-49)(48-50)(49-51)(50-52)(51-53)(52-54)(53-57)(54-58)(55-59)(56-59)(57-59)(58-60)(60-61)(60-62)'
# ---
...

More or less same mechanics like in SnapshotTest, except molfile additions always require pytest --snapshot-update.

schatzsc · 2022-10-26T22:33:09Z

That would indeed be super-interesting but is again "way beyond my paygrade" - maybe Jan can comment?

The only thing I would find important is that auto-generated strings should be checked by human for some critical cases.

JanCBrammer · 2022-10-28T04:29:15Z

I really like the idea of running regression tests on our entire testset. Snapshot testing looks like a great solution for that!
Regarding the choice of library for snapshot testing, I don't have a strong opinion. https://github.com/syrusakbary/snapshottest looks like a solid option. I'd favor it because of the output file. It's less well maintained than https://github.com/tophat/syrupy though.

flange-ipb · 2022-10-28T07:50:29Z

Indeed, I'd also favour SnapshotTest for its simplicity and for its pythonic snapshot file.

JanCBrammer · 2022-10-28T14:13:02Z

@schatzsc, you can find the auto-generated TUCAN strings here: https://github.com/TUCAN-nest/TUCAN/blob/c69eb5cfdbf36bd0d53772fca3b12beac5712496/tests/__snapshots__/test_serialization.ambr

flange-ipb · 2022-10-28T14:19:36Z

@JanCBrammer and I discussed the impact on the CI pipeline:
Let's assume a new test molfile was added and test_regression was not executed to update the snapshot file. With SnapshotTest it is difficult to detect this situation in the CI and fail. This is essentially what snapshottest#73 requests.

For this reason we changed to syrupy. The CI run will fail in the situation described above.

flange-ipb mentioned this issue Oct 28, 2022

Snapshot testing for test_regression #91

Merged

JanCBrammer closed this as completed in #91 Oct 28, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Snapshot testing #90

Snapshot testing #90

flange-ipb commented Oct 25, 2022 •

edited

Loading

flange-ipb commented Oct 25, 2022

flange-ipb commented Oct 25, 2022

flange-ipb commented Oct 25, 2022 •

edited

Loading

schatzsc commented Oct 26, 2022

JanCBrammer commented Oct 28, 2022

flange-ipb commented Oct 28, 2022

JanCBrammer commented Oct 28, 2022

flange-ipb commented Oct 28, 2022

Snapshot testing #90

Snapshot testing #90

Comments

flange-ipb commented Oct 25, 2022 • edited Loading

flange-ipb commented Oct 25, 2022

flange-ipb commented Oct 25, 2022

flange-ipb commented Oct 25, 2022 • edited Loading

schatzsc commented Oct 26, 2022

JanCBrammer commented Oct 28, 2022

flange-ipb commented Oct 28, 2022

JanCBrammer commented Oct 28, 2022

flange-ipb commented Oct 28, 2022

flange-ipb commented Oct 25, 2022 •

edited

Loading

flange-ipb commented Oct 25, 2022 •

edited

Loading