Handle validation tasks inside CPAchecker #1078

marian-lingsch · 2024-09-16T16:32:08Z

The goal of this PR is to handle validation tasks that follow the format proposed in SV-Benchmarks. A short description of the new option is given in the SV-Benchmarks README.

The goal is to leave the previous behavior of CPAchecker untouched, i.e., it can still validate witnesses given explicitly through extra parameters in a benchmark definition file. If witnesses are passed both ways, through the task definition and through the benchmark definition file, then CPAchecker fails with an error, as should be expected.

…e, with an indication that it is a witness inside the options field. See https://gitlab.com/sosy-lab/benchmarking/sv-benchmarks/-/blob/36ba9aa5c3f5d7b52c7f1431a2fff287ebc7e5d9/c/loop-invariants/linear-inequality-inv-a.yml for an example task.
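To make the intended behavior concrete, here is a minimal Python sketch of how a tool-info module might separate the witness from the other input files and reject a witness that is given both ways. It is not the code from this PR: the function name `split_witness`, the stand-in exception class, the assumption that the task definition names the witness under `options["witness"]`, and the check for a `-witness` argument among the benchmark options are all illustrative assumptions.

```python
class UnsupportedFeatureError(Exception):
    """Stand-in for the error a tool-info module would report (hypothetical)."""


def split_witness(input_files, task_options, benchmark_options):
    """Return (other_input_files, witness_or_None).

    Assumption: the task definition names the witness under options["witness"],
    and a witness from the benchmark definition shows up as "-witness".
    """
    witness = (task_options or {}).get("witness")
    if witness is None:
        # Old behavior: nothing to separate, options are passed on untouched.
        return list(input_files), None
    if "-witness" in benchmark_options:
        raise UnsupportedFeatureError(
            "Witness given both in the task definition and in the benchmark definition"
        )
    # Separate the witness from the other input files exactly once.
    others = [f for f in input_files if f != witness]
    return others, witness
```

With a task that lists `prog.c` and `witness.yml` as input files and marks `witness.yml` as the witness, this would return `(["prog.c"], "witness.yml")`; without the option the input files come back unchanged.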

PhilippWendler

I have marked the PR as a draft for as long as the witness option is still a proposal and not yet agreed upon by the community.

The PR should link to the documentation that has the definition of that option.

benchexec/tools/cpachecker.py


marian-lingsch · 2024-09-17T08:40:22Z

Thanks a lot for your comments!

They have now been addressed.

PhilippWendler · 2024-09-18T06:20:34Z

There are several tool-info modules that build on top of the tool-info module of CPAchecker. Most of them do not override cmdline, so should be safe, but I think we should go through each of grep cmdline $(grep -Rl cpachecker benchexec/tools/) and check whether they need adjustments. In particular cpa-witness2test.py even calls _get_additional_options().
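For reference, the shell one-liner above can be approximated in Python. This is only a rough sketch: the function name `find_cmdline_mentions` is made up, it looks only at `*.py` files, and for every file under the given directory that mentions `cpachecker` it reports the lines that mention `cmdline`.

```python
from pathlib import Path


def find_cmdline_mentions(tools_dir):
    """Map each file that mentions 'cpachecker' to its lines containing 'cmdline'."""
    root = Path(tools_dir)
    results = {}
    for path in sorted(root.rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if "cpachecker" not in text:
            continue  # corresponds to the inner `grep -Rl cpachecker`
        lines = [line.strip() for line in text.splitlines() if "cmdline" in line]
        if lines:
            results[path.relative_to(root).as_posix()] = lines
    return results
```

Calling `find_cmdline_mentions("benchexec/tools")` from a BenchExec checkout would list the candidate modules to review by hand; the module of CPAchecker itself will naturally show up as well.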


marian-lingsch · 2024-09-18T09:15:54Z

> There are several tool-info modules that build on top of the tool-info module of CPAchecker. Most of them do not override cmdline, so should be safe, but I think we should go through each of grep cmdline $(grep -Rl cpachecker benchexec/tools/) and check whether they need adjustments. In particular cpa-witness2test.py even calls _get_additional_options().

Thanks a lot for pointing this out!

Just went through all the different tools which use CPAchecker's tool-info module. Only CPA-witness2test had to be adjusted; the others should not be affected. MetaVal also needs to be adapted, but that will be a problem for another PR.

PhilippWendler · 2024-09-18T09:21:38Z

>> There are several tool-info modules that build on top of the tool-info module of CPAchecker. Most of them do not override cmdline, so should be safe, but I think we should go through each of grep cmdline $(grep -Rl cpachecker benchexec/tools/) and check whether they need adjustments. In particular cpa-witness2test.py even calls _get_additional_options().

> Thanks a lot for pointing this out!

> Just went through all the different tools which use CPAchecker's tool-info module. Only CPA-witness2test had to be adjusted; the others should not be affected.

Hm, but now cpa-witness2test.py does not fail anymore if there is more than one (non-witness) input file. And is -witness the correct way to pass witnesses to cpa-witness2test? Did you test this tool-info module with your new tasks as well?

> MetaVal also needs to be adapted, but that will be a problem for another PR.

Hm, what does that mean? How will the tool-info module behave in the meantime? It should still continue to work for existing users.

marian-lingsch · 2024-09-18T10:47:18Z

>>> There are several tool-info modules that build on top of the tool-info module of CPAchecker. Most of them do not override cmdline, so should be safe, but I think we should go through each of grep cmdline $(grep -Rl cpachecker benchexec/tools/) and check whether they need adjustments. In particular cpa-witness2test.py even calls _get_additional_options().

>> Thanks a lot for pointing this out!
>> Just went through all the different tools which use CPAchecker's tool-info module. Only CPA-witness2test had to be adjusted; the others should not be affected.

> Hm, but now cpa-witness2test.py does not fail anymore if there is more than one (non-witness) input file. And is -witness the correct way to pass witnesses to cpa-witness2test? Did you test this tool-info module with your new tasks as well?

With the latest commit it fails again if there is more than one non-witness file.
-witness is also the correct way to pass the parameter as seen here. It also works as expected when running it on the benchmark definition files.
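A check of the kind described here could look roughly like the following sketch. It is an illustration under assumptions, not the code from this PR: `require_single_program` and the stand-in exception class are hypothetical names, and the witness is assumed to be identified by comparing file names.

```python
class UnsupportedFeatureError(Exception):
    """Stand-in for the error a tool-info module would report (hypothetical)."""


def require_single_program(input_files, witness):
    """Return the only non-witness input file, or fail if there is not exactly one."""
    programs = [f for f in input_files if f != witness]
    if len(programs) != 1:
        raise UnsupportedFeatureError(
            f"Expected exactly one non-witness input file, got {len(programs)}"
        )
    return programs[0]
```

The witness itself would then be handed to the tool separately, for example as the argument of -witness, rather than as one of the input files.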

>> MetaVal also needs to be adapted, but that will be a problem for another PR.

> Hm, what does that mean? How will the tool-info module behave in the meantime? It should still continue to work for existing users.

Currently it will simply not work for the proposed validation task files. It will continue working as intended with the current way of validating witnesses.

PhilippWendler · 2024-09-18T11:07:49Z

>>> MetaVal also needs to be adapted, but that will be a problem for another PR.

>> Hm, what does that mean? How will the tool-info module behave in the meantime? It should still continue to work for existing users.

> Currently it will simply not work for the proposed validation task files. It will continue working as intended with the current way of validating witnesses.

Ok, this is good to hear. What happens if someone passes a new task-definition file with the witness option? Will it just be ignored (like all other tool-info modules would do), or is the behavior worse than ignoring the additional option, i.e., would it fail or do something meaningless if a user passes a new task-definition file?

I think it would be good if we keep it such that every tool-info module either fully supports some option, or ignores its existence completely. That is the least confusing for users. Otherwise we would have three different behaviors of the Metaval tool-info module for validation tasks: in old versions of BenchExec it just ignores the witness option, in future versions it will eventually support it, and in the meantime yet something else.

marian-lingsch · 2024-09-18T12:06:06Z

> Ok, this is good to hear. What happens if someone passes a new task-definition file with the witness option? Will it just be ignored (like all other tool-info modules would do), or is the behavior worse than ignoring the additional option, i.e., would it fail or do something meaningless if a user passes a new task-definition file?

Currently MetaVal will just not work for the validation task definition format, since it ignores the witness option. It will be missing the expected witness option, which is passed to it using --metavalWitness, and will therefore just print its --help section when run.

> I think it would be good if we keep it such that every tool-info module either fully supports some option, or ignores its existence completely. That is the least confusing for users. Otherwise we would have three different behaviors of the Metaval tool-info module for validation tasks: in old versions of BenchExec it just ignores the witness option, in future versions it will eventually support it, and in the meantime yet something else.

I agree that keeping the behavior of validators consistent is important, but IMO this issue is orthogonal to this PR, since all validators will need to be adjusted in order to work with the validation format and handle the new option correctly.

If I understood your proposal correctly, for the next release of BenchExec all validators should have adapted their tool-info module such that there is a specific version of BenchExec from which the validation task definition files start working and previously the witness option will just be ignored.

PhilippWendler · 2024-09-18T12:24:01Z

>> Ok, this is good to hear. What happens if someone passes a new task-definition file with the witness option? Will it just be ignored (like all other tool-info modules would do), or is the behavior worse than ignoring the additional option, i.e., would it fail or do something meaningless if a user passes a new task-definition file?

> Currently MetaVal will just not work for the validation task definition format, since it ignores the witness option. It will be missing the expected witness option, which is passed to it using --metavalWitness, and will therefore just print its --help section when run.

But it would still work if I use an appropriate benchmark definition that includes the necessary options for witnesses (just as in previous years), wouldn't it?

>> I think it would be good if we keep it such that every tool-info module either fully supports some option, or ignores its existence completely. That is the least confusing for users. Otherwise we would have three different behaviors of the Metaval tool-info module for validation tasks: in old versions of BenchExec it just ignores the witness option, in future versions it will eventually support it, and in the meantime yet something else.

> I agree that keeping the behavior of validators consistent is important, but IMO this issue is orthogonal to this PR, since all validators will need to be adjusted in order to work with the validation format and handle the new option correctly.

> If I understood your proposal correctly, for the next release of BenchExec all validators should have adapted their tool-info module such that there is a specific version of BenchExec from which the validation task definition files start working and previously the witness option will just be ignored.

No, I never said this. I said that every tool-info module should either support the option or ignore it completely. But I don't want tool-info module X to ignore option Y in versions A, B, C, do something broken in versions D and E, and fully support option Y in versions F and beyond.

Of course it would be nice for users if "there is a specific version of BenchExec from which the validation task definition files start working and previously the witness option will just be ignored", but I would not require that.

marian-lingsch · 2024-09-18T13:17:28Z

>>> Ok, this is good to hear. What happens if someone passes a new task-definition file with the witness option? Will it just be ignored (like all other tool-info modules would do), or is the behavior worse than ignoring the additional option, i.e., would it fail or do something meaningless if a user passes a new task-definition file?

>> Currently MetaVal will just not work for the validation task definition format, since it ignores the witness option. It will be missing the expected witness option, which is passed to it using --metavalWitness, and will therefore just print its --help section when run.

> But it would still work if I use an appropriate benchmark definition that includes the necessary options for witnesses (just as in previous years), wouldn't it?

Exactly

>>> I think it would be good if we keep it such that every tool-info module either fully supports some option, or ignores its existence completely. That is the least confusing for users. Otherwise we would have three different behaviors of the Metaval tool-info module for validation tasks: in old versions of BenchExec it just ignores the witness option, in future versions it will eventually support it, and in the meantime yet something else.

>> I agree that keeping the behavior of validators consistent is important, but IMO this issue is orthogonal to this PR, since all validators will need to be adjusted in order to work with the validation format and handle the new option correctly.
>> If I understood your proposal correctly, for the next release of BenchExec all validators should have adapted their tool-info module such that there is a specific version of BenchExec from which the validation task definition files start working and previously the witness option will just be ignored.

> No, I never said this. I said that every tool-info module should either support the option or ignore it completely. But I don't want tool-info module X to ignore option Y in versions A, B, C, do something broken in versions D and E, and fully support option Y in versions F and beyond.

> Of course it would be nice for users if "there is a specific version of BenchExec from which the validation task definition files start working and previously the witness option will just be ignored", but I would not require that.

I see, then I somewhat misunderstood your comment. But I completely agree with you on this.

PhilippWendler · 2024-09-18T13:46:03Z

>>>> Ok, this is good to hear. What happens if someone passes a new task-definition file with the witness option? Will it just be ignored (like all other tool-info modules would do), or is the behavior worse than ignoring the additional option, i.e., would it fail or do something meaningless if a user passes a new task-definition file?

>>> Currently MetaVal will just not work for the validation task definition format, since it ignores the witness option. It will be missing the expected witness option, which is passed to it using --metavalWitness, and will therefore just print its --help section when run.

>> But it would still work if I use an appropriate benchmark definition that includes the necessary options for witnesses (just as in previous years), wouldn't it?

> Exactly

Then we should ensure that this PR does not cause a regression for this situation (new task definition with appropriate benchmark definition with manually defined options for Metaval).

marian-lingsch · 2024-09-18T14:49:34Z

>>>>> Ok, this is good to hear. What happens if someone passes a new task-definition file with the witness option? Will it just be ignored (like all other tool-info modules would do), or is the behavior worse than ignoring the additional option, i.e., would it fail or do something meaningless if a user passes a new task-definition file?

>>>> Currently MetaVal will just not work for the validation task definition format, since it ignores the witness option. It will be missing the expected witness option, which is passed to it using --metavalWitness, and will therefore just print its --help section when run.

>>> But it would still work if I use an appropriate benchmark definition that includes the necessary options for witnesses (just as in previous years), wouldn't it?

>> Exactly

> Then we should ensure that this PR does not cause a regression for this situation (new task definition with appropriate benchmark definition with manually defined options for Metaval).

Looking at the tool-info module of MetaVal, a regression should not occur due to this PR.

PhilippWendler

Can be merged once the community has agreed on and finalized the specification of the new option in the task-definition files. Would be nice to add a link to the spec to the code once this happens.


Marian Lingsch-Rosenfeld added 2 commits September 16, 2024 14:35

Fix warnings from ruff

f8a98b9

marian-lingsch requested a review from PhilippWendler September 16, 2024 16:32

marian-lingsch self-assigned this Sep 16, 2024

PhilippWendler marked this pull request as draft September 17, 2024 04:44

PhilippWendler reviewed Sep 17, 2024


Marian Lingsch-Rosenfeld added 2 commits September 17, 2024 10:22

Change assertion to proper error message

872c632

Addresses: #1078 (comment)

Only sepparate the witness files from other input files once

658418a

Addresses: - #1078 (comment) - #1078 (comment)

Marian Lingsch-Rosenfeld added 2 commits September 18, 2024 11:05

We also want to filter for witness tasks when not using a task definition file

fd369ba

Filter the input files in CPA-witness2test such that the witness is not passed through it, but instead through the additional options

6b2349b

Enforce that there is only a single non-witness file in CPAWitness2Test

0b0a650

PhilippWendler approved these changes Sep 18, 2024


marian-lingsch pushed a commit that referenced this pull request Oct 24, 2024

Change assertion to proper error message

b66c78a

Addresses: #1078 (comment)

marian-lingsch pushed a commit that referenced this pull request Oct 24, 2024

Only sepparate the witness files from other input files once

247b842

Addresses: - #1078 (comment) - #1078 (comment)

marian-lingsch pushed a commit that referenced this pull request Oct 24, 2024

Change assertion to proper error message

43888c1

Addresses: #1078 (comment)

marian-lingsch pushed a commit that referenced this pull request Oct 24, 2024

Only sepparate the witness files from other input files once

9746433

Addresses: - #1078 (comment) - #1078 (comment)
