dapp mutate
#829
This adds a simple mutation testing framework to dapptools, allowing users to gain insight into which parts of their code are not well specified by their test suite.
wowwww very cool I'm wondering if it might be a good idea to break |
ah yes true that would probably be more convenient. We're a little constrained by the interface provided by universalmutator unfortunately (which doesn't e.g. allow for the generation of a specific number of mutants), but I'll have a play around and see if I can make it work...
@gakonst something something rust re-write candidate?
Not sure if you misunderstood me or not, but to clarify: with |
my guess is filtering is expensive purely because it goes over a large number of mutants and runs a bunch of tests, so the only way to seriously speed it up would be to make every test run faster -- which probably isn't a rust rewrite candidate? (this is the author of universalmutator chiming in)
hmm. one thing to do, that might be better than total random selection, is to 1) generate mutants then 2) prioritize them and cut off at N? analyze_mutants doesn't take an N argument to run N mutants, but it can take a file, and prioritize takes an optional N and can return only the top N mutants, and dump those into a file
I am admittedly out of my league here but my interpretation of this thread/demo was that tests already ~are running significantly faster in the [incomplete] rust version? happy to be corrected & have a better understanding of the interplay going forward !
In that case, yes, speeding up tests really really helps, since you basically get N mutants * (all (relevant) tests runtime). If you break on test failure, of course, that's a maximum and gets better the sooner you detect failures!
The main issue for generating only N mutants in universalmutator is that right now they would be for the first <N lines only. I think I could add a mode to randomize the line processing order, if that would be useful?
Note that if you're willing to call the mutator many times, one undocumented (and not well tested!) trick is to call the mutator with |
e.g., to get (maybe) one mutant of a dumb C program, hello.c:
it'll make one mutant, and if that mutant is invalid, you won't get anything, but if it is valid, you'll get it in |
Hi @agroce! Thanks so much for the input (and for building this very nice tool in the first place) 💖
This seems like exactly what we need, I'll play around with this and see if I can make it work 🙏
I hadn't noticed the |
Complicated -- a mix of location, nature of the change, and the before/after code. It's basically, right now, an ad hoc mess! But it seems somewhat useful, and I have an active NSF grant with @clegoues to make it more principled.
@MrChico Not sure I quite understand what you mean here. What exactly would happen on each iteration?
I was thinking one iteration is one round of mutate, filter, and that one may do multiple of those. But maybe it's not more useful to do mutate, filter, mutate, filter... than to just create more mutations in the first place
Description
This adds a simple mutation testing framework to dapptools based on universalmutator.
Mutation testing can be thought of as a kind of reverse fuzzing, where instead of generating random inputs to our test functions, we make random mutations (i.e. introduce bugs) to the source code of the system under test, and then run the test suite to see if the mutation is detected.
The idea is that mutations that are undetected should highlight behaviours that are not well specified by the test suite and hopefully provide some insight to drive improvements to the test suite (or to uncover areas that should be targeted for special attention during audit).
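To make the idea concrete, here is a minimal, self-contained sketch in Python. It is not how `dapp mutate` works internally; the names `make_safe_sub`, `weak_suite`, `strong_suite`, and `suite_passes` are hypothetical and exist only for illustration. The "mutation" flips a `<=` guard to `<`, similar in spirit to the altered require conditions mentioned in this description.

```python
import operator


def make_safe_sub(guard):
    """Build a checked subtraction; `guard` stands in for the comparison a mutator might alter."""
    def safe_sub(x, y):
        if not guard(y, x):  # original guard is y <= x
            raise ValueError("underflow")
        return x - y
    return safe_sub


def check(condition):
    """Fail the current test when the condition does not hold."""
    if not condition:
        raise AssertionError("test failed")


def weak_suite(sub):
    # Only exercises the happy path, so the guard is never really tested.
    check(sub(5, 3) == 2)


def strong_suite(sub):
    check(sub(5, 3) == 2)
    check(sub(4, 4) == 0)  # boundary case: y == x must be allowed
    try:
        sub(3, 5)
    except ValueError:
        return
    raise AssertionError("expected underflow to be rejected")


def suite_passes(suite, sub):
    """A suite passes when it finishes without raising."""
    try:
        suite(sub)
        return True
    except Exception:
        return False


original = make_safe_sub(operator.le)
mutant = make_safe_sub(operator.lt)  # hypothetical mutation: <= becomes <

for suite in (weak_suite, strong_suite):
    detected = suite_passes(suite, original) and not suite_passes(suite, mutant)
    print(suite.__name__, "detects the mutant:", detected)
```

The weak suite passes on both versions, so the mutant goes undetected and points at an underspecified guard; the strong suite fails on the mutant at the boundary case.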
I have been using dss as a testbed while developing, and despite the generally high coverage of the test suite there, `dapp mutate` was able to uncover many bugs that would have slipped through the test suite (e.g. subtle changes to behaviour in math functions, flipped / altered conditions in important require statements).
The workflow currently looks like this:
1. `dapp mutate gen`. This will iterate over every non-test file under `DAPP_SRC` and generate mutated versions. Solc is invoked as a part of this process to ensure that the mutated version compiles.
2. `dapp mutate filter`. This will iterate over every generated mutant and check to see if applying the mutation would be detected by the test suite.
While `dapp mutate filter` is running, users can call `dapp mutate status` to get an overview of the current progress, or `dapp mutate show-diffs` to see the mutations that were not detected by the test suite.
Filter in particular is especially time consuming, and it's probably worth running it overnight or perhaps disabling some particularly time-consuming tests. This seems acceptable since users will probably not want to run `dapp mutate` on every commit / change, but would rather run the analysis periodically to check the status of their test suite.
Still needs docs / changelogs and maybe some tests before merge.
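As a rough illustration of the filter step, the sketch below models mutants and tests as plain Python callables. This is an assumption made for the example only: the actual command operates on mutated Solidity sources and runs the project's test suite, and `filter_mutants`, the mutant names, and the tests here are all hypothetical. It also shows why stopping at the first failing test keeps the cost below the worst case of (number of mutants) × (number of tests) discussed in the conversation.

```python
def filter_mutants(mutants, tests):
    """Return (survivors, test_runs) for a mapping of mutant name -> implementation.

    A mutant is "killed" as soon as one test fails, so later tests are skipped.
    Survivors are the mutants that no test detected.
    """
    survivors = []
    test_runs = 0
    for name, impl in mutants.items():
        killed = False
        for test in tests:
            test_runs += 1
            if not test(impl):
                killed = True
                break  # early exit: no need to run the remaining tests
        if not killed:
            survivors.append(name)
    return survivors, test_runs


# Hypothetical mutants of an `add(x, y)` function.
mutants = {
    "add->sub": lambda x, y: x - y,
    "add->mul": lambda x, y: x * y,
    "add->or": lambda x, y: x | y,
}

# A deliberately thin test suite: both inputs happen to agree for + and *.
tests = [
    lambda add: add(2, 2) == 4,
    lambda add: add(0, 0) == 0,
]

survivors, test_runs = filter_mutants(mutants, tests)
print("undetected mutants:", survivors)
print("test runs:", test_runs, "of at most", len(mutants) * len(tests))
```

Here the multiplication mutant survives because the suite only checks inputs where `x + y` and `x * y` coincide, which is the kind of gap the undetected-mutation diffs are meant to surface.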
Checklist