Add API to calculate the median expected discovery significance #1810

Open

matthewfeickert opened this issue Mar 11, 2022 · 3 comments

Assignees: matthewfeickert
Labels: API (Changes the public API), feat/enhancement (New feature or request)

@matthewfeickert (Member)

matthewfeickert commented Mar 11, 2022

Summary

In PR #1232 we added the ability to calculate the discovery test statistic. 👍 That's all fine. But if you want to calculate the median expected significance in the presence of a signal you need to evaluate q0(mu_hat = 1), that is, force the value of the best-fit parameter mu_hat to be the Asimov value mu_Asimov. This is currently not possible with the v0.6.3 pyhf.infer.test_statistics.q0 API, though in the v0.7.0 API we will have the return_fitted_pars=True option

if return_fitted_pars:
    return q0_stat, (mubhathat, muhatbhat)

to return the conditional best-fit parameters (mu, theta_hat_hat) from the fixed-POI fit and the unconditional best-fit parameters (mu_hat, theta_hat) from the free fit.
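For reference, a rough sketch of what using that v0.7.0 API could look like (the toy model here is just for illustration):

import pyhf

model = pyhf.simplemodels.uncorrelated_background([6], [9], [3])
data = [12] + model.config.auxdata

# v0.7.0: return_fitted_pars=True also returns the conditional and
# unconditional best-fit parameters alongside the test statistic
q0_stat, (mubhathat, muhatbhat) = pyhf.infer.test_statistics.q0(
    0.0,
    data,
    model,
    model.config.suggested_init(),
    model.config.suggested_bounds(),
    model.config.suggested_fixed(),
    return_fitted_pars=True,
)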

It would be good if we provided a way to calculate the median expected significance.
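To make the ask concrete, here's a hypothetical sketch of such an API. The name median_expected_significance is made up for illustration (it is not an existing pyhf API); it uses the asymptotic result that the median expected discovery significance is sqrt(q0_Asimov) (c.f. Cowan, Cranmer, Gross, Vitells), with a pre-fit Asimov dataset built from the suggested initial parameters:

import numpy as np
import pyhf

def median_expected_significance(model, test_poi=1.0):
    # Hypothetical helper, not an existing pyhf API
    init_pars = model.config.suggested_init()
    par_bounds = model.config.suggested_bounds()
    fixed_params = model.config.suggested_fixed()

    # Build the Asimov dataset at mu = test_poi using the initial
    # parameter values as nominal values (pre-fit Asimov)
    asimov_pars = init_pars.copy()
    asimov_pars[model.config.poi_index] = test_poi
    asimov_data = model.expected_data(asimov_pars)

    # Median expected discovery significance: sqrt of q0 evaluated on Asimov data
    q0_asimov = pyhf.infer.test_statistics.q0(
        0.0, asimov_data, model, init_pars, par_bounds, fixed_params
    )
    return np.sqrt(q0_asimov)

model = pyhf.simplemodels.uncorrelated_background([6], [9], [3])
print(median_expected_significance(model))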

Additional Information

I was reminded of this today when reading through Eilam Gross's lectures from the 2017 European School of High-Energy Physics, in particular Slide 79 "The Median Sensitivity (via ASIMOV)".

This is motivated by trying to work on a translation recipe for WSMaker in Issue #1341.

Code of Conduct

  • I agree to follow the Code of Conduct
@matthewfeickert added the bug (Something isn't working) and feat/enhancement (New feature or request) labels on Mar 11, 2022
@matthewfeickert (Member, Author)

matthewfeickert commented Mar 11, 2022

I should note, of course, that you can do something similar to what @alexander-held is doing now in cabinetry with cabinetry.fit.significance:

import pyhf
from scipy.stats import norm

# Single-bin model: signal = 6, background = 9 +/- 3
model = pyhf.simplemodels.uncorrelated_background([6], [9], [3])
data = [12] + model.config.auxdata
init_pars = model.config.suggested_init()
par_bounds = model.config.suggested_bounds()
fixed_params = model.config.suggested_fixed()

# Discovery test: p-values under the background-only hypothesis (mu = 0)
obs_p_val, exp_p_val = pyhf.infer.hypotest(
    0.0,
    data,
    model,
    init_pars=init_pars,
    fixed_params=fixed_params,
    par_bounds=par_bounds,
    test_stat="q0",
    return_expected=True,
)
obs_p_val = float(obs_p_val)
exp_p_val = float(exp_p_val)

# Convert p-values to significances: Z = Phi^-1(1 - p)
obs_significance = norm.isf(obs_p_val, 0, 1)
exp_significance = norm.isf(exp_p_val, 0, 1)

print(f"Observed significance: {obs_significance}")
print(f"Expected significance: {exp_significance}")

which gives

Observed significance: 0.6557754926523803
Expected significance: 1.2926190691157395

but this still faces the same problems if someone has a workspace in which they have fixed the POI (e.g. WSMaker intentionally fixes the POI to 1 or something to blind the workspace). This is a more general problem with workspace conversion, though (c.f. Issue #1341), and isn't really part of this Issue.
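(For illustration, a quick sketch of overriding a fixed POI at inference time, if one is willing to ignore the workspace settings; shown with the toy model from above, where the POI is floating anyway:)

import pyhf

model = pyhf.simplemodels.uncorrelated_background([6], [9], [3])
data = [12] + model.config.auxdata

# Start from the model's suggested settings, then force the POI to float
fixed_params = model.config.suggested_fixed()
fixed_params[model.config.poi_index] = False

obs_p_val, exp_p_val = pyhf.infer.hypotest(
    0.0, data, model, fixed_params=fixed_params, test_stat="q0", return_expected=True
)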

Offering a straightforward API might be better.

@matthewfeickert added the API (Changes the public API) label and removed the bug (Something isn't working) label on Mar 11, 2022
@matthewfeickert (Member, Author) commented Mar 11, 2022

Yeah, I think just stealing what cabinetry is doing and then also implementing Issue #1712 would be the way to go here.

@matthewfeickert changed the title from "Add ability to calculate the median expected discovery significance" to "Add API to calculate the median expected discovery significance" on Mar 11, 2022
@matthewfeickert matthewfeickert self-assigned this Mar 11, 2022
@alexander-held (Member)

alexander-held commented Mar 11, 2022

I'm not sure whether I understand the direction of this issue correctly. Is it specifically about calculating expected discovery significance without any reference to observed data? cabinetry.fit.significance does not do that, it merely uses the existing pyhf API to get both the observed and expected discovery significance. The expected discovery significance is evaluated with Asimov data built from a fit to observed data with mu=1 fixed. This already exists within pyhf.infer.hypotest, where the value of 1 is hard-coded:

asimov_mu = 1.0 if self.test_stat == 'q0' else 0.0

(There may be some value in allowing other values here, but users could also re-scale their signal instead.)
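(To illustrate the re-scaling point with the toy model from the earlier comment: emulating an Asimov signal strength mu' = 2 by scaling the signal yields produces the same Asimov dataset, n_Asimov = mu' * s + b, that allowing asimov_mu = mu' would:)

import pyhf

mu_prime = 2.0  # desired Asimov signal strength
signal = [6]

# Scaling the signal yields by mu_prime while keeping the hard-coded
# asimov_mu = 1 yields the identical Asimov dataset
model = pyhf.simplemodels.uncorrelated_background(
    [mu_prime * s for s in signal], [9], [3]
)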

Is this what

so we need some way to be able to force the value of the mu^ to be the mu_Asimov

refers to?

If this is about doing a fully blind evaluation ("pre-fit Asimov") of the expected discovery significance, then the way to do this in cabinetry would be via cabinetry.model_utils.asimov_data (there are different possibilities for the treatment of free-floating parameters; cabinetry uses the initial parameter settings as nominal values). This also works with pure pyhf (depending on what exactly is desired as a dataset) by passing a suitable dataset to hypotest, as in the sketch below.
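A sketch of that pure-pyhf route, assuming (as cabinetry does) the suggested initial parameters as the nominal values for the Asimov dataset:

import pyhf
from scipy.stats import norm

model = pyhf.simplemodels.uncorrelated_background([6], [9], [3])

# "Pre-fit Asimov": expected data (including auxdata) at the initial
# parameters with the POI set to 1
asimov_pars = model.config.suggested_init()
asimov_pars[model.config.poi_index] = 1.0
asimov_data = model.expected_data(asimov_pars)

# The "observed" p-value on Asimov data is the fully blind expected one
p_val = float(pyhf.infer.hypotest(0.0, asimov_data, model, test_stat="q0"))
print(f"Expected significance: {norm.isf(p_val)}")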

but this still faces the same problems if someone has a workspace in which they have fixed the POI (e.g. WSMaker intentionally fixes the POI to 1 or something to blind the workspace). This is a more general problem with workspace conversion, though (c.f. Issue #1341), and isn't really part of this Issue.

I see two options here: either ignoring what is set in the workspace for convenience (so that the calculation can go forward), or raising an error with a suitable message. I see value in both approaches.
