feat: Add discovery p0 test statistic #1232
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1232 +/- ##
==========================================
+ Coverage 97.47% 97.48% +0.01%
==========================================
Files 63 63
Lines 3716 3733 +17
Branches 525 530 +5
==========================================
+ Hits 3622 3639 +17
Misses 55 55
Partials 39 39
Force-pushed from 02b8cbc to fa85220.
@kratsg overall this looks really good, but let me think a bit more about the questions you raise in the PR body. Also, for adding new binaries like validation/multibin_histfactory_p0/data/data.root for validation: do we want to add more binaries to the repo, or should we have these added to scikit-hep-testdata?
Force-pushed from 2b67a8a to c9140c3.
I think we should add them to this repo for now. If we then decide to migrate later, we should. But we should start considering using
Force-pushed from 34c903d to 7dda900.
@kratsg Thanks a lot for this pretty critical PR. 👍
Technically, everything looks good, I think. Though I agree that we should probably take this opportunity to reconsider the terminology and the APIs, to try to make things as clear as possible. So tagging @lukasheinrich and @alexander-held here for additional comments and thoughts.
Yeah, all told they're only a few kB, so I think it is fine. Though at the same time, I wouldn't be against doing on-the-fly generation here.
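If on-the-fly generation were ever pursued, a minimal stdlib-only sketch could look like the following. This is a hypothetical illustration, not code from this PR: the `make_pseudodata` helper and the expected bin yields are invented for the example, and a real implementation would presumably use NumPy's Poisson sampler instead.

```python
import random


def poisson_sample(lam, rng):
    """Draw one Poisson-distributed count with mean lam (Knuth's algorithm)."""
    threshold = 2.718281828459045 ** (-lam)  # e^{-lam}
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def make_pseudodata(expected_counts, seed=0):
    """Generate Poisson-fluctuated observed counts, one per expected bin yield."""
    rng = random.Random(seed)  # fixed seed keeps validation runs reproducible
    return [poisson_sample(lam, rng) for lam in expected_counts]


# Hypothetical two-bin signal-plus-background expectation
print(make_pseudodata([52.0, 63.0]))
```

With a pinned seed this would give deterministic validation inputs without committing any binary files to the repo.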
Thanks for all the work on this! I tried out the API yesterday; it was straightforward to adopt and in agreement with a ROOT reference for the examples tested.
I do not have any immediate suggestions for naming, but I think the naming inside hypotest may in the long term be difficult to read. @kratsg's idea of alternative/null sounds like it could work (maybe with extra comments calling things by their name in the respective branches of the function?).
validation/multibin_histfactory_p0/results/example_results.table (review thread, resolved)
@kratsg Once jax-ml/jax#5374 is resolved and @lukasheinrich reviews, then I think this can go in, given that you've made dedicated followup PRs or Issues for most of my revision requests. Can you take care of the "Summarize commit messages into a comprehensive review of the PR" item, though?
The complexity of the fixtures in
LGTM now
I have ideas on how to simplify it. Validation is very tricky to do, but I have a good idea of how to refactor it into something slightly more maintainable.
Co-authored-by: Matthew Feickert <[email protected]>
Force-pushed from 41eda33 to 1f64782.
Description
Supersedes #520. Particularly, it's just a full reimplementation of @nikoladze's work, but on top of more recent pyhf master. I initially tried to rebase, but there have been so many changes in the infer API that I felt it was better to pick out all the pieces and re-implement it accordingly.

To-Do:
- how to handle poi_test gracefully? Should q0 simply warn and set poi_test, or should we fail harder? hypotest should handle this more gracefully, but the calculators should not -- see feat: Cleanup internal names for hypotest #1247
- how to handle asimov_mu gracefully? Should hypotest or AsymptoticCalculator check if asimov_mu is consistent with the test statistic being used? Should the test statistic functions have a func.default_asimov_mu attribute? -- see feat: Cleanup internal names for hypotest #1247
- what to name all the various "p-values" returned via hypotest? (can't reasonably call it CLs or w/e) [related discussion: feat: customizable metrics in hypotest #966] -- see feat: Cleanup internal names for hypotest #1247
- should AsymptoticCalculator return a b_only_distribution for q0? (what is this? we're very stuck on signal/background namings here) -- see feat: Cleanup internal names for hypotest #1247

Idea: change poi_test to alternative_mu (alt_mu) and asimov_mu to null_mu. This perhaps clarifies the meaning of the testing, at least. Then we need to have toys to support these distributions. Then maybe change from signal_plus_b to alternative_distribution and b_only to null_distribution?

ReadTheDocs build: https://pyhf.readthedocs.io/en/feat-discoveryteststat/_generated/pyhf.infer.test_statistics.q0.html

Closes #520
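As background for reviewers: in the asymptotic approximation, the discovery p-value follows from the observed q0 as p0 = 1 − Φ(√q0), where Φ is the standard normal CDF (equivalently, the significance is Z = √q0). A stdlib-only sketch of that conversion; the function name p0_asymptotic is illustrative and is not the pyhf API:

```python
import math


def p0_asymptotic(q0_obs):
    """Asymptotic discovery p-value: p0 = 1 - Phi(sqrt(q0)).

    Phi is the standard normal CDF; valid for q0_obs >= 0.
    """
    z = math.sqrt(q0_obs)  # significance Z = sqrt(q0)
    # Standard normal CDF expressed via the error function
    phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return 1.0 - phi


# e.g. q0 = 9 corresponds to a 3-sigma significance
print(p0_asymptotic(9.0))
```

This is only the final conversion step; the PR itself concerns computing q0 from the profile-likelihood-ratio fits inside pyhf's calculators.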
Checklist Before Requesting Reviewer
Before Merging
For the PR Assignees: