Add API to calculate the median expected discovery significance #1810

Open

matthewfeickert opened this issue Mar 11, 2022 · 3 comments

Assignees: matthewfeickert
Labels: API (Changes the public API), feat/enhancement (New feature or request)

@matthewfeickert (Member)

matthewfeickert commented Mar 11, 2022

Summary

In PR #1232 we added the ability to calculate the discovery test statistic. 👍 That's all fine. But if you want to calculate the median expected significance in the presence of a signal you need to evaluate q0(mu_hat = 1), that is, force the value of the best-fit parameter mu_hat to be the Asimov value mu_Asimov. This is currently not possible with the v0.6.3 pyhf.infer.test_statistics.q0 API, though in the v0.7.0 API we will have the return_fitted_pars=True option

if return_fitted_pars:
    return q0_stat, (mubhathat, muhatbhat)

to return the conditional best-fit parameters (mu, theta_hat_hat) from the fixed-POI fit and the unconditional best-fit parameters (mu_hat, theta_hat) from the free fit.
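For reference, a rough sketch of what using that v0.7.0 API could look like (the toy model here is just for illustration):

import pyhf

model = pyhf.simplemodels.uncorrelated_background([6], [9], [3])
data = [12] + model.config.auxdata

# v0.7.0: return_fitted_pars=True also returns the conditional and
# unconditional best-fit parameters alongside the test statistic
q0_stat, (mubhathat, muhatbhat) = pyhf.infer.test_statistics.q0(
    0.0,
    data,
    model,
    model.config.suggested_init(),
    model.config.suggested_bounds(),
    model.config.suggested_fixed(),
    return_fitted_pars=True,
)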

It would be good if we provided a way to calculate the median expected significance.
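To make the ask concrete, here's a hypothetical sketch of such an API. The name median_expected_significance is made up for illustration (it is not an existing pyhf API); it uses the asymptotic result that the median expected discovery significance is sqrt(q0_Asimov) (c.f. Cowan, Cranmer, Gross, Vitells), with a pre-fit Asimov dataset built from the suggested initial parameters:

import numpy as np
import pyhf

def median_expected_significance(model, test_poi=1.0):
    # Hypothetical helper, not an existing pyhf API
    init_pars = model.config.suggested_init()
    par_bounds = model.config.suggested_bounds()
    fixed_params = model.config.suggested_fixed()

    # Build the Asimov dataset at mu = test_poi using the initial
    # parameter values as nominal values (pre-fit Asimov)
    asimov_pars = init_pars.copy()
    asimov_pars[model.config.poi_index] = test_poi
    asimov_data = model.expected_data(asimov_pars)

    # Median expected discovery significance: sqrt of q0 evaluated on Asimov data
    q0_asimov = pyhf.infer.test_statistics.q0(
        0.0, asimov_data, model, init_pars, par_bounds, fixed_params
    )
    return np.sqrt(q0_asimov)

model = pyhf.simplemodels.uncorrelated_background([6], [9], [3])
print(median_expected_significance(model))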

Additional Information

I was reminded of this today when reading through Eilam Gross's lectures from the 2017 European School of High-Energy Physics, in particular Slide 79 "The Median Sensitivity (via ASIMOV)".

This is motivated by trying to work on a translation recipe for WSMaker in Issue #1341.

Code of Conduct

  • I agree to follow the Code of Conduct
@matthewfeickert added the bug (Something isn't working) and feat/enhancement (New feature or request) labels on Mar 11, 2022
@matthewfeickert (Member, Author)

matthewfeickert commented Mar 11, 2022

I should note, of course, that you can do something similar to what @alexander-held is doing now in cabinetry with cabinetry.fit.significance:

import pyhf
from scipy.stats import norm

# Single-bin model: signal = 6, background = 9 +/- 3
model = pyhf.simplemodels.uncorrelated_background([6], [9], [3])
data = [12] + model.config.auxdata
init_pars = model.config.suggested_init()
par_bounds = model.config.suggested_bounds()
fixed_params = model.config.suggested_fixed()

# Discovery test: p-values under the background-only hypothesis (mu = 0)
obs_p_val, exp_p_val = pyhf.infer.hypotest(
    0.0,
    data,
    model,
    init_pars=init_pars,
    fixed_params=fixed_params,
    par_bounds=par_bounds,
    test_stat="q0",
    return_expected=True,
)
obs_p_val = float(obs_p_val)
exp_p_val = float(exp_p_val)

# Convert p-values to significances: Z = Phi^-1(1 - p)
obs_significance = norm.isf(obs_p_val, 0, 1)
exp_significance = norm.isf(exp_p_val, 0, 1)

print(f"Observed significance: {obs_significance}")
print(f"Expected significance: {exp_significance}")

which gives

Observed significance: 0.6557754926523803
Expected significance: 1.2926190691157395

but this still faces the same problems if someone has a workspace in which they have fixed the POI (e.g. WSMaker intentionally fixes the POI to 1 or something to blind the workspace). This is a more general problem with workspace conversion, though (c.f. Issue #1341), and isn't really part of this Issue.
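(For illustration, a quick sketch of overriding a fixed POI at inference time, if one is willing to ignore the workspace settings; shown with the toy model from above, where the POI is floating anyway:)

import pyhf

model = pyhf.simplemodels.uncorrelated_background([6], [9], [3])
data = [12] + model.config.auxdata

# Start from the model's suggested settings, then force the POI to float
fixed_params = model.config.suggested_fixed()
fixed_params[model.config.poi_index] = False

obs_p_val, exp_p_val = pyhf.infer.hypotest(
    0.0, data, model, fixed_params=fixed_params, test_stat="q0", return_expected=True
)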

Offering a straightforward API might be better.

@matthewfeickert added the API (Changes the public API) label and removed the bug (Something isn't working) label on Mar 11, 2022
@matthewfeickert (Member, Author) commented Mar 11, 2022

Yeah, I think just stealing what cabinetry is doing and then also implementing Issue #1712 would be the way to go here.

@matthewfeickert changed the title from "Add ability to calculate the median expected discovery significance" to "Add API to calculate the median expected discovery significance" on Mar 11, 2022
@matthewfeickert matthewfeickert self-assigned this Mar 11, 2022
@alexander-held (Member)

alexander-held commented Mar 11, 2022

I'm not sure whether I understand the direction of this issue correctly. Is it specifically about calculating expected discovery significance without any reference to observed data? cabinetry.fit.significance does not do that, it merely uses the existing pyhf API to get both the observed and expected discovery significance. The expected discovery significance is evaluated with Asimov data built from a fit to observed data with mu=1 fixed. This already exists within pyhf.infer.hypotest, where the value of 1 is hard-coded:

asimov_mu = 1.0 if self.test_stat == 'q0' else 0.0

(There may be some value in allowing other values here, but users could also re-scale their signal instead.)
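(To illustrate the re-scaling point with the toy model from the earlier comment: emulating an Asimov signal strength mu' = 2 by scaling the signal yields produces the same Asimov dataset, n_Asimov = mu' * s + b, that allowing asimov_mu = mu' would:)

import pyhf

mu_prime = 2.0  # desired Asimov signal strength
signal = [6]

# Scaling the signal yields by mu_prime while keeping the hard-coded
# asimov_mu = 1 yields the identical Asimov dataset
model = pyhf.simplemodels.uncorrelated_background(
    [mu_prime * s for s in signal], [9], [3]
)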

Is this what

so we need some way to be able to force the value of the mu^ to be the mu_Asimov

refers to?

If this is about doing a fully blind evaluation ("pre-fit Asimov") of the expected discovery significance, then the way to do this in cabinetry would be via cabinetry.model_utils.asimov_data (there are different possibilities for the treatment of free-floating parameters; cabinetry uses the initial parameter settings as nominal values). This also works with pure pyhf (depending on what exactly is desired as a dataset) by passing a suitable dataset to hypotest, as in the sketch below.
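A sketch of that pure-pyhf route, assuming (as cabinetry does) the suggested initial parameters as the nominal values for the Asimov dataset:

import pyhf
from scipy.stats import norm

model = pyhf.simplemodels.uncorrelated_background([6], [9], [3])

# "Pre-fit Asimov": expected data (including auxdata) at the initial
# parameters with the POI set to 1
asimov_pars = model.config.suggested_init()
asimov_pars[model.config.poi_index] = 1.0
asimov_data = model.expected_data(asimov_pars)

# The "observed" p-value on Asimov data is the fully blind expected one
p_val = float(pyhf.infer.hypotest(0.0, asimov_data, model, test_stat="q0"))
print(f"Expected significance: {norm.isf(p_val)}")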

but this still faces the same problems if someone has a workspace in which they have fixed the POI (e.g. WSMaker intentionally fixes the POI to 1 or something to blind the workspace). This is a more general problem with workspace conversion, though (c.f. Issue #1341), and isn't really part of this Issue.

I see two options here: either ignoring what is set in the workspace for convenience (so that the calculation can go forward), or raising an error with a suitable message. I see value in both approaches.
