`pyhf`: pure-Python implementation of HistFactory with tensors and automatic differentiation

.huge[Lukas Heinrich], .huge[Matthew Feickert], .huge.blue[Giordon Stark]

.huge[(SCIPP, UC Santa Cruz)]

gstark@cern.ch

ATLAS Statistics Forum

December 10th, 2020

`pyhf` team

Lukas Heinrich

CERN ] .kol-1-3.center[ .circle.width-80[]

Matthew Feickert

Illinois ] .kol-1-3.center[ .circle.width-75[]

Giordon Stark

UCSC SCIPP ] ]

`pyhf`: HistFactory in pure Python

First non-ROOT implementation of the HistFactory p.d.f. template
- .width-50[]
pure-Python library with Python and CLI API
- $ pip install pyhf
- No dependence on ROOT!
Open source tool for all of HEP
- IRIS-HEP supported Scikit-HEP project
- Used for reinterpretation in a phenomenology paper
  (DOI: 10.1007/JHEP04(2019)144) and by SModelS (arXiv:2009.01809)
- Already in use by ATLAS SUSY groups, HH combination group, and for internal
  pMSSM SUSY large scale reinterpretation ] .kol-1-3.center[ .width-100[] .width-100[] ]

Dependencies

Required dependencies

Core libraries (though all lightweight installs):

SciPy - Scientific Python (optimization routines)
click - Command line interface
tqdm - Progress bars
jsonschema - HistFactory JSON specification
jsonpatch - Signal reinterpretation
PyYAML - Command line niceties ] .kol-1-2[

Optional dependencies

Depending on what users want to do:

TensorFlow - autodiff, GPUs
PyTorch - autodiff, GPUs
JAX - autodiff, GPUs, jit
iminuit - alternative minimizer choice
uproot - ROOT I/O interop ]

$ python -m pip install --upgrade pyhf[xmlio] # Gets uproot
$ python -m pip install --upgrade pyhf[backends] # Gets all backends
$ python -m pip install --upgrade pyhf[jax,xmlio,minuit] # Gets JAX, uproot, and iminuit

]

Open Source Industry Tools for Computation

All numerical operations implemented in .bold[tensor backends] through an API of $n$-dimensional array operations
Using deep learning frameworks as computational backends allows for .bold[exploitation of autodiff and GPU acceleration]
With huge buy-in from industry, we benefit for free as these frameworks are .bold[continually improved] by professional software engineers (physicists are not)

.kol-1-2.center[ .width-90[] ] .kol-1-2[

Hardware acceleration gives .bold[order of magnitude speedup] for some models!
Improvements over traditional
- 10 hrs to 30 min; 20 min to 10 sec ] ] .kol-1-4.center[ .width-85[] .width-85[] .width-85[]

Current Features

Unconstrained and constrained fits
Exclusion fits
Discovery fits (imminent v0.6.0 release)
Conversion between XML+ROOT and JSON
- This works with any HistFactory workspace! (HistFitter, TRExFitter, WSMaker, etc. don't need to do anything special)
Brazil bands
Pull plots^†
Impact/ranking plots^†
Pseudoexperiments ("toys") (imminent v0.6.0 release)

.smaller[^†Note: the pyhf API is meant to allow for higher-level frameworks to build on top, such as cabinetry.

Missing a meta-language (DSL, metadata) that describes the data that can be passed to plotting utilities
cabinetry is meant to help with plotting things "correctly"
All of this work is openly developed with extensive feedback ]

See our roadmap to get an idea of where we're going!

Automatic Differentiation of `pyhf` Models

With tensor library backends we gain access to exact (higher order) derivatives; accuracy is limited only by floating point precision

$$ \frac{\partial L}{\partial \mu}, \frac{\partial L}{\partial \theta_{i}} $$

.grid[ .kol-1-2[ .large[Exploit .bold[full gradient of the likelihood] with .bold[modern optimizers] to help speed up the fit!]

.large[The frameworks provide this by building computational directed acyclic graphs and then applying the chain rule to the operations] ] .kol-1-2[ .center.width-80[] ] ]
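To make the chain-rule idea concrete, here is a minimal sketch of automatic differentiation using dual numbers (forward mode) in plain Python. It is a deliberately simplified stand-in: the tensor backends build and differentiate computational graphs rather than using this class, and the `Dual` type, the one-bin likelihood, and all numbers are assumptions made for illustration.

```python
import math


class Dual:
    """Number of the form value + deriv * eps with eps**2 == 0."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule
        return Dual(
            self.value * other.value,
            self.deriv * other.value + self.value * other.deriv,
        )

    __rmul__ = __mul__


def log(x):
    """Logarithm that applies the chain rule when given a Dual."""
    if isinstance(x, Dual):
        return Dual(math.log(x.value), x.deriv / x.value)
    return math.log(x)


def log_likelihood(mu, n=12.0, s=5.0, b=10.0):
    """One-bin Poisson log-likelihood, dropping the constant log(n!) term."""
    nu = mu * s + b
    return n * log(nu) - nu


mu0 = 1.0

# Seed d(mu)/d(mu) = 1 and propagate derivatives through every operation
autodiff = log_likelihood(Dual(mu0, 1.0)).deriv

# Closed form for comparison: dL/dmu = s * (n / nu - 1)
analytic = 5.0 * (12.0 / (mu0 * 5.0 + 10.0) - 1.0)

# Central finite difference, which carries truncation and round-off error
h = 1e-4
finite_diff = (log_likelihood(mu0 + h) - log_likelihood(mu0 - h)) / (2 * h)

print(autodiff, analytic, finite_diff)
```

The dual-number result agrees with the closed form up to floating point rounding, while the finite-difference estimate depends on the choice of step size `h`.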

HEP Example: Likelihood Gradients

.kol-1-2.center[ .width-90[] ] .kol-1-2.center[ .width-90[] ]

.bold.center[Having access to the gradients makes the fit orders of magnitude faster than with finite differences]
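One way to see why derivative information helps: with exact first and second derivatives an optimizer can take well-informed steps instead of probing the likelihood with extra evaluations. The sketch below fits the signal strength of a hypothetical one-bin counting model with Newton's method; the hand-written derivatives stand in for what an autodiff backend would supply, and the function names and numbers are illustrative assumptions.

```python
def nll_derivatives(mu, n, s, b):
    """First and second derivative of the one-bin Poisson negative log-likelihood.

    NLL(mu) = nu - n * log(nu) with nu = mu * s + b (constant terms dropped).
    """
    nu = mu * s + b
    grad = s * (1.0 - n / nu)
    hess = n * s**2 / nu**2
    return grad, hess


def newton_fit(n, s, b, mu=1.0, tol=1e-10, max_iter=50):
    """Minimize the NLL in mu with Newton steps; returns (mu_hat, iterations)."""
    for iteration in range(1, max_iter + 1):
        grad, hess = nll_derivatives(mu, n, s, b)
        step = grad / hess
        mu -= step
        if abs(step) < tol:
            return mu, iteration
    return mu, max_iter


mu_hat, n_iter = newton_fit(n=25.0, s=5.0, b=10.0)
print(f"mu_hat = {mu_hat:.6f} after {n_iter} Newton iterations")
```

For this model the maximum likelihood estimate is known in closed form, $\hat{\mu} = (n - b)/s$, which gives a direct check on the result.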

Documentation and Development

.grid[ .kol-1-1.center[All documentation can be found at https://scikit-hep.org/pyhf/.] .kol-1-2[ In this documentation you can find a list of:

Presentations
Tutorials
Posters
Media outreach

All of our documentation is tested nightly, against our software, as well as updates to software and tools we depend on. In addition to this, we've made full use of:

Sphinx - main documentation
Jupyter - fundamentals and tutorials

] .kol-1-2[ We most recently gave a successful, in-depth tutorial at the ATLAS SUSY+Exotics workshop.

Why do users choose us?

.grid[ .kol-2-3.push-1-6.center.gray[ Out of all the toolkits, why do you think your users choose to use yours? ] .kol-1-1[

Easy to use and install: PyPI, TestPyPI (bleeding edge), conda-forge, and Docker
Fast code, fast development cycle, fast feedback
Well-documented Python implementations, clear communication channels to devs and community
Command line complements the Pythonic API
- We really love our CLI, it plays nicely with shell "behavior" such as piping
```
$ pyhf prune --sample ttbar BkgOnly.json | pyhf inspect
```
Significant test-driven development (underlies all of our work) with 1000+ tests!
```
$ pytest --collect-only | grep "<Function\|<Class" -c
1306
```
Every commit is tested in CI across Python 3.6, 3.7, and 3.8 on Linux and macOS systems with nightlies

But we believe the biggest reason users choose `pyhf` is because .center.huge[`pyhf` is developed openly and freely] ] ]

Common scripts/macros/functions?

.grid[ .kol-2-3.push-1-6.center.gray[ Is your toolkit using some external packages / common scripts / macros / functions to perform some of the operations like fit, limit setting, significance computation, Asimov-creation, ranking plot? ] .kol-1-1[

Fits, limit setting: SciPy and minuit
Test statistics are implemented in pyhf
Asimov creation: just a fit in pyhf to generate the Asimov dataset ] ]
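As a simplified picture of what an Asimov dataset is, the sketch below sets the observed counts equal to the expected rates of a hypothetical one-bin model and evaluates the discovery test statistic on it. The model has no nuisance parameters, so no fit is needed here, unlike in the general case described above; the function names and numbers are illustrative assumptions and this is not the `pyhf` implementation.

```python
import math


def asimov_data(mu, signal, background):
    """Asimov dataset: observed counts set equal to the expected rates."""
    return [mu * s + b for s, b in zip(signal, background)]


def poisson_log_likelihood(data, mu, signal, background):
    """Binned Poisson log-likelihood; lgamma handles non-integer Asimov counts."""
    total = 0.0
    for n, s, b in zip(data, signal, background):
        nu = mu * s + b
        total += n * math.log(nu) - nu - math.lgamma(n + 1.0)
    return total


signal = [10.0]
background = [100.0]

# Asimov data for the signal-plus-background hypothesis (mu = 1)
data = asimov_data(1.0, signal, background)

# Discovery test statistic q0 = -2 ln [L(mu=0) / L(mu_hat)]; on this Asimov
# dataset the best-fit value is mu_hat = 1 by construction.
q0 = -2.0 * (
    poisson_log_likelihood(data, 0.0, signal, background)
    - poisson_log_likelihood(data, 1.0, signal, background)
)
median_significance = math.sqrt(q0)
print(f"Median expected significance: {median_significance:.3f}")
```

For a single bin this reproduces the familiar asymptotic expression $\sqrt{2\left((s+b)\ln(1+s/b) - s\right)}$.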

Common software for ATLAS?

.grid[ .kol-2-3.push-1-6.center.gray[ Which pieces of your toolkit could be factorized out into a package that would be developed/supported/distributed by ATLAS? ] .kol-1-1[
We don't necessarily believe any particular piece needs to be factorized out into a package maintained by ATLAS.

pure-Python implementation of HistFactory (a mathematical model)
pyhf is a low(er)-level library to interact with the HistFactory JSON workspaces
Higher-level tools are encouraged to build on top of pyhf to extend the functionality into plots, limit setting, and other debugging utilities
- cf. cabinetry as an excellent example ] ]

Additional common software?

.grid[ .kol-2-3.push-1-6.center.gray[ What additional common software could your toolkit take advantage of? ] .kol-1-1[

Not sure
We are willing to try out new ideas all the time
If you have ideas, get in touch with us! ] ]

Contributing to central toolkit?

.grid[ .kol-2-3.push-1-6.center.gray[ Would you be willing to contribute to the development of a centrally distributed toolkit that provides functionality for providing common statistical operations (e.g. calculating a $p$-value)? ] .kol-1-1[

Cannot make any promises at this time
All core developers are very busy with convener roles and contact roles in ATLAS and IRIS-HEP ] ]

The Bigger Picture

Reproducible workflows via RECAST/REANA benefit from JSON HistFactory
- reinterpretation is a breeze
- statistical workspaces can be serialized/preserved
Reinterpretation Forum paper recommends the use of pyhf likelihoods
SModelS provides an interface for pyhf
Native HEPData support (ongoing!)
ATLAS SUSY group has published pyhf JSON HistFactory workspaces for five analyses ] .kol-1-3[

.center.width-100.tiny[ [![cranmer talk](figures/two_tastes.png)](https://indico.cern.ch/event/962997/)
(stolen from Kyle Cranmer) ] ]
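To illustrate why a JSON workspace is easy to serialize and preserve, the sketch below round-trips a small workspace through the standard library `json` module. The layout (channels, observations, measurements, version) follows our understanding of the workspace format; every name and number in it is a made-up example.

```python
import json

# Hypothetical single-channel workspace; all names and numbers are illustrative.
workspace = {
    "channels": [
        {
            "name": "signal_region",
            "samples": [
                {
                    "name": "signal",
                    "data": [5.0, 10.0],
                    "modifiers": [{"name": "mu", "type": "normfactor", "data": None}],
                },
                {
                    "name": "background",
                    "data": [50.0, 60.0],
                    "modifiers": [
                        {"name": "bkg_norm", "type": "normsys", "data": {"hi": 1.1, "lo": 0.9}}
                    ],
                },
            ],
        }
    ],
    "observations": [{"name": "signal_region", "data": [55.0, 62.0]}],
    "measurements": [
        {"name": "measurement", "config": {"poi": "mu", "parameters": []}}
    ],
    "version": "1.0.0",
}

# Plain text that can be versioned, archived, or attached to a publication
serialized = json.dumps(workspace, indent=2, sort_keys=True)

# Reading it back recovers exactly the same structure
restored = json.loads(serialized)
print(restored == workspace)
```

Because the workspace is plain text, it can be diffed and version controlled like any other source file.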

Summary

.large[.bold[Accelerated] fitting library]
- reducing time to insight/inference!
- Hardware acceleration on GPUs and vectorized operations
- Backend agnostic Python API and CLI
.large[Flexible .bold[declarative] schema]
- JSON: ubiquitous, universal support, versionable
.large[Enabling technology for .bold[reinterpretation]]
- JSON Patch files for efficient computation of new signal models
- Unifying tool for theoretical and experimental physicists
.large[Project in growing .bold[Pythonic HEP ecosystem]]
- Openly developed on GitHub and welcome contributions
- Comprehensive open tutorials
- Ask us about Scikit-HEP and IRIS-HEP! ] .kol-1-3[

.center.width-100[[![pyhf_logo](https://iris-hep.org/assets/logos/pyhf-logo.png)](https://github.com/scikit-hep/pyhf)] ]
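The reinterpretation idea can be sketched as a background-only workspace plus a small patch that adds a signal sample. The code below hand-rolls a small subset of JSON Patch (only `add`, `replace`, and `remove` on non-root paths) purely for illustration; it is not the `jsonpatch` library, and the workspace contents, sample names, and helper functions are hypothetical.

```python
import copy


def resolve_parent(document, path):
    """Walk a JSON Pointer such as '/channels/0/samples/0' to its parent container."""
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in path[1:].split("/")]
    parent = document
    for token in tokens[:-1]:
        parent = parent[int(token)] if isinstance(parent, list) else parent[token]
    return parent, tokens[-1]


def apply_patch(document, patch):
    """Apply 'add', 'replace', and 'remove' operations to a deep copy of document."""
    result = copy.deepcopy(document)
    for operation in patch:
        op = operation["op"]
        parent, key = resolve_parent(result, operation["path"])
        if isinstance(parent, list):
            # "-" refers to the position after the last list element
            index = len(parent) if key == "-" else int(key)
            if op == "add":
                parent.insert(index, operation["value"])
            elif op == "replace":
                parent[index] = operation["value"]
            elif op == "remove":
                del parent[index]
            else:
                raise ValueError(f"unsupported operation: {op}")
        else:
            if op in ("add", "replace"):
                parent[key] = operation["value"]
            elif op == "remove":
                del parent[key]
            else:
                raise ValueError(f"unsupported operation: {op}")
    return result


# Hypothetical background-only model, as an experiment might publish it
background_only = {
    "channels": [
        {
            "name": "signal_region",
            "samples": [{"name": "background", "data": [50.0, 60.0], "modifiers": []}],
        }
    ]
}

# Hypothetical patch describing one new signal model
signal_patch = [
    {
        "op": "add",
        "path": "/channels/0/samples/0",
        "value": {
            "name": "new_signal",
            "data": [5.0, 10.0],
            "modifiers": [{"name": "mu", "type": "normfactor", "data": None}],
        },
    }
]

patched = apply_patch(background_only, signal_patch)
print([sample["name"] for sample in patched["channels"][0]["samples"]])
```

Each new signal hypothesis then needs only its own small patch, while the background-only workspace is shared and left untouched.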

Thanks for listening!

Come talk with us!

.large[www.scikit-hep.org/pyhf] ] .grid[ .kol-1-3.center[ .width-90[] ] .kol-1-3.center[
.width-90[] ] .kol-1-3.center[

.width-100[] ] ]

External dependencies

Required dependencies from our setup.cfg:

install_requires =
    scipy>=1.4.0
    click>=6.0
    tqdm
    jsonschema>=3.2.0
    jsonpatch
    pyyaml

SciPy - Scientific Python (optimization routines)
click - Command line interface
tqdm - Progress bars
jsonschema - HistFactory JSON specification
jsonpatch - Signal reinterpretation
pyyaml - Command line niceties ] .kol-1-3.center[ .width-50[]

.width-50[]

.width-25[] ] ]

Optional dependencies

We have lots of optional dependencies depending on what users want to do:

TensorFlow - autodiff
PyTorch - autodiff
JAX - autodiff, jit
iminuit - minuit interface (MIGRAD/HESSE/MINOS available)
uproot - ROOT I/O interop

HistFactory Model

A flexible probability density function (p.d.f.) template to build statistical models in high energy physics
Developed in 2011 during work that led to the Higgs discovery [CERN-OPEN-2012-016]
Widely used by the HEP community for .bold[measurements of known physics] (Standard Model) and
.bold[searches for new physics] (beyond the Standard Model)

.kol-2-5.center[ .width-90[] .bold[Standard Model] ] .kol-3-5.center[ .width-100[] .bold[Beyond the Standard Model] ]

HistFactory Template

$$ f\left(\mathrm{data}\middle|\mathrm{parameters}\right) = f\left(\vec{n}, \vec{a}\middle|\vec{\eta}, \vec{\chi}\right) = \color{blue}{\prod_{c \,\in\, \textrm{channels}} \prod_{b \,\in\, \textrm{bins}_c} \textrm{Pois} \left(n_{cb} \middle| \nu_{cb}\left(\vec{\eta}, \vec{\chi}\right)\right)} \,\color{red}{\prod_{\chi \,\in\, \vec{\chi}} c_{\chi} \left(a_{\chi}\middle|\chi\right)} $$

.bold[Use:] Multiple disjoint channels (or regions) of binned distributions with multiple samples contributing to each with additional (possibly shared) systematics between sample estimates

.blue[Main Poisson p.d.f. for simultaneous measurement of multiple channels]
.katex[Event rates] $\nu_{cb}$ (nominal rate $\nu_{scb}^{0}$ with rate modifiers)
.red[Constraint p.d.f. (+ data) for "auxiliary measurements"]
- encode systematic uncertainties (e.g. normalization, shape)
$\vec{n}$: events, $\vec{a}$: auxiliary data, $\vec{\eta}$: unconstrained pars, $\vec{\chi}$: constrained pars ] .kol-1-2[ .center.width-100[] .center[Example: .bold[Each bin] is separate (1-bin) channel,
each .bold[histogram] (color) is a sample and share
a .bold[normalization systematic] uncertainty] ]
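The structure of the template, a product of Poisson terms times a constraint term, can be sketched in a few lines of plain Python. This is a toy under stated assumptions: a single channel, one normalization nuisance parameter `alpha` that scales the background linearly, and a unit-width Gaussian constraint; the real HistFactory interpolation and constraint choices are richer than this, and all names and numbers are illustrative.

```python
import math


def poisson_logpmf(n, rate):
    """Logarithm of Pois(n | rate)."""
    return n * math.log(rate) - rate - math.lgamma(n + 1.0)


def normal_logpdf(x, mean, sigma=1.0):
    """Logarithm of a Gaussian density, used here as the constraint term."""
    return -0.5 * ((x - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def log_likelihood(mu, alpha, observed, signal, background, rel_uncertainty=0.1, aux=0.0):
    """log f(n, a | mu, alpha): sum of Poisson terms plus one constraint term."""
    # Assumed linear response of the background normalization to alpha
    scale = 1.0 + rel_uncertainty * alpha
    main = sum(
        poisson_logpmf(n, mu * s + scale * b)
        for n, s, b in zip(observed, signal, background)
    )
    constraint = normal_logpdf(aux, alpha)
    return main + constraint


observed = [55.0, 62.0]
signal = [5.0, 10.0]
background = [50.0, 60.0]

nominal = log_likelihood(1.0, 0.0, observed, signal, background)
pulled = log_likelihood(1.0, 5.0, observed, signal, background)
print(f"log L at alpha = 0: {nominal:.3f}, at alpha = 5: {pulled:.3f}")
```

Pulling `alpha` far from its auxiliary measurement is penalized twice here: the predicted rates move away from the observed counts and the constraint term decreases.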

HistFactory Template

$$ f\left(\vec{n}, \vec{a}\middle|\vec{\eta}, \vec{\chi}\right) = \color{blue}{\prod_{c \,\in\, \textrm{channels}} \prod_{b \,\in\, \textrm{bins}_c} \textrm{Pois} \left(n_{cb} \middle| \nu_{cb}\left(\vec{\eta}, \vec{\chi}\right)\right)} \,\color{red}{\prod_{\chi \,\in\, \vec{\chi}} c_{\chi} \left(a_{\chi}\middle|\chi\right)} $$

Mathematical grammar for a simultaneous fit with

.blue[multiple "channels"] (analysis regions, (stacks of) histograms)
each region can have .blue[multiple bins]
coupled to a set of .red[constraint terms]

.center[.bold[This is a _mathematical_ representation!] Nowhere is any software spec defined] .center[.bold[Until recently] (2018), the only implementation of HistFactory has been in [`ROOT`](https://root.cern.ch/)]

HistFactory Template (in more detail)

$$ f\left(\vec{n}, \vec{a}\middle|\vec{\eta}, \vec{\chi}\right) = \color{blue}{\prod_{c \,\in\, \textrm{channels}} \prod_{b \,\in\, \textrm{bins}_c} \textrm{Pois} \left(n_{cb} \middle| \nu_{cb}\left(\vec{\eta}, \vec{\chi}\right)\right)} \,\color{red}{\prod_{\chi \,\in\, \vec{\chi}} c_{\chi} \left(a_{\chi}\middle|\chi\right)} $$

$$ \nu_{cb}(\vec{\eta}, \vec{\chi}) = \sum_{s \,\in\, \textrm{samples}} \underbrace{\left(\prod_{\kappa \,\in\, \vec{\kappa}} \kappa_{scb}(\vec{\eta}, \vec{\chi})\right)}_{\textrm{multiplicative}} \Bigg(\nu_{scb}^{0}(\vec{\eta}, \vec{\chi}) + \underbrace{\sum_{\Delta \,\in\, \vec{\Delta}} \Delta_{scb}(\vec{\eta}, \vec{\chi})}_{\textrm{additive}}\Bigg) $$

.bold[Use:] Multiple disjoint channels (or regions) of binned distributions with multiple samples contributing to each with additional (possibly shared) systematics between sample estimates

.blue[Main Poisson p.d.f. for simultaneous measurement of multiple channels]
.katex[Event rates] $\nu_{cb}$ from nominal rate $\nu_{scb}^{0}$ and rate modifiers $\kappa$ and $\Delta$
.red[Constraint p.d.f. (+ data) for "auxiliary measurements"]
- encoding systematic uncertainties (normalization, shape, etc)
$\vec{n}$: events, $\vec{a}$: auxiliary data, $\vec{\eta}$: unconstrained pars, $\vec{\chi}$: constrained pars
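As a small worked example of how the event rate in one bin is assembled, the sketch below combines each sample's nominal rate with its additive modifiers and then scales by the product of its multiplicative modifiers. The dictionary layout, the function name, and the numbers are assumptions chosen only to illustrate the formula.

```python
from math import prod


def expected_rate(samples):
    """Expected rate in one bin.

    For each sample: (product of multiplicative modifiers) times
    (nominal rate + sum of additive modifiers), summed over samples.
    """
    total = 0.0
    for sample in samples:
        multiplicative = prod(sample["kappas"])  # an empty list gives 1
        additive = sum(sample["deltas"])  # an empty list gives 0
        total += multiplicative * (sample["nominal"] + additive)
    return total


samples = [
    # Signal: nominal rate 5, scaled by a signal strength of 2
    {"nominal": 5.0, "kappas": [2.0], "deltas": []},
    # Background: nominal rate 50, two normalization factors, one shape shift
    {"nominal": 50.0, "kappas": [1.1, 0.95], "deltas": [-2.0]},
]

nu = expected_rate(samples)
print(f"Expected rate: {nu:.2f}")
```

Here the signal contributes $2 \times 5 = 10$ and the background contributes $1.1 \times 0.95 \times (50 - 2) = 50.16$.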

Why is the likelihood important?

.kol-1-2.width-90[

High information-density summary of analysis
Almost everything we do in the analysis ultimately affects the likelihood and is encapsulated in it
- Trigger
- Detector
- Combined Performance / Physics Object Groups
- Systematic Uncertainties
- Event Selection
Unique representation of the analysis to reuse and preserve ] .kol-1-2.width-100[

]

References

F. James, Y. Perrin, L. Lyons, .italic[Workshop on confidence limits: Proceedings], 2000.
ROOT collaboration, K. Cranmer, G. Lewis, L. Moneta, A. Shibata and W. Verkerke, .italic[HistFactory: A tool for creating statistical models for use with RooFit and RooStats], 2012.
L. Heinrich, H. Schulz, J. Turner and Y. Zhou, .italic[Constraining $A_{4}$ Leptonic Flavour Model Parameters at Colliders and Beyond], 2018.
A. Read, .italic[Modified frequentist analysis of search results (the $\mathrm{CL}_{s}$ method)], 2000.
K. Cranmer, .italic[CERN Latin-American School of High-Energy Physics: Statistics for Particle Physicists], 2013.
ATLAS collaboration, .italic[Search for bottom-squark pair production with the ATLAS detector in final states containing Higgs bosons, b-jets and missing transverse momentum], 2019.
ATLAS collaboration, .italic[Reproducing searches for new physics with the ATLAS experiment through publication of full statistical likelihoods], 2019.
ATLAS collaboration, .italic[Search for bottom-squark pair production with the ATLAS detector in final states containing Higgs bosons, b-jets and missing transverse momentum: HEPData entry], 2019.

The end.
