Fitting tools, combined fits, partial wave analysis, and machine learning #5
Comments
Interested in this topic! I'm mostly interested in investigating interoperability between Combine (the tool still used in pretty much every CMS analysis to perform combinations and fits) and modern packages (pyhf, cabinetry, zfit, etc.).
Also interested in discussing Python-based alternatives to Combine!
jaxfit @nsmith- |
I am interested in adopting the work we did on RooFit and AD into Combine.
I'd like to discuss with experts of the different fitting libraries the benefits and potential drawbacks of using JAX PyTrees. I made a small (proof-of-concept for now) package.
+1
+1
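To make the PyTree idea concrete, here is a minimal sketch (not taken from the package above; the toy model and all names are illustrative): fit parameters stored in a nested dict are transparently handled by JAX as a PyTree, so gradients of a likelihood come back in the same nested structure.

```python
import jax
import jax.numpy as jnp

# Hypothetical model: parameters as a nested dict, which JAX treats as a
# PyTree, so transformations like grad work on the whole structure at once.
params = {
    "signal": {"mu": jnp.array(1.0), "sigma": jnp.array(0.5)},
    "background": {"lam": jnp.array(2.0)},
}

def nll(params, x):
    # Toy negative log-likelihood: Gaussian signal plus exponential background,
    # mixed 50/50 (purely illustrative, not a realistic physics model).
    z = (x - params["signal"]["mu"]) / params["signal"]["sigma"]
    sig = jnp.exp(-0.5 * z**2) / (params["signal"]["sigma"] * jnp.sqrt(2 * jnp.pi))
    bkg = params["background"]["lam"] * jnp.exp(-params["background"]["lam"] * x)
    return -jnp.sum(jnp.log(0.5 * sig + 0.5 * bkg))

x = jnp.linspace(0.1, 3.0, 100)
# The gradient has exactly the same nested (PyTree) shape as `params`.
grads = jax.grad(nll)(params, x)
```

Because the gradient mirrors the parameter structure, nested models compose naturally without flattening parameters by hand.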
I'm interested in this topic, particularly fitting tools and amplitude analysis / PWA! Some thoughts:
Footnotes
Suggestion for a new topical issue
I'm also interested in this. Another aspect of this topic is orchestration of model construction / metadata handling, which ties in with earlier steps in an analysis workflow (and #4).

Regarding AD: also curious to learn more about how, and how much of, the functionality is exposed to users (i.e. can I easily take arbitrary derivatives myself, or are current implementations limited to internally providing derivatives wrt. parameters to the minimizer?).

@redeboer: probably best to open a new issue, as this might get lost here and is not related to the thread. Presumably interesting e.g. for scikit-hep/particle.
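On the question of taking arbitrary derivatives yourself: in a JAX-based model this comes essentially for free. A hedged toy example (the Gaussian NLL and all names here are made up for illustration, not any library's API) of going beyond the gradient the minimizer needs, e.g. to the full Hessian for covariance estimates:

```python
import jax
import jax.numpy as jnp

def nll(pars, data):
    # Toy unbinned Gaussian negative log-likelihood with pars = (mu, sigma).
    mu, sigma = pars
    z = (data - mu) / sigma
    return jnp.sum(0.5 * z**2 + jnp.log(sigma * jnp.sqrt(2 * jnp.pi)))

data = jnp.array([0.9, 1.1, 1.3, 0.7])
pars = jnp.array([1.0, 0.3])

grad = jax.grad(nll)(pars, data)     # first derivatives wrt. (mu, sigma)
hess = jax.hessian(nll)(pars, data)  # full 2x2 Hessian, e.g. for a covariance estimate
```

Higher orders (e.g. `jax.grad(jax.grad(...))` or derivatives wrt. the data) compose the same way, which is exactly the "arbitrary derivatives" capability the comment asks about.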
Hi folks. Thanks for the ping. I'm aware of the new PDG API and in fact in touch with Juerg, the director :-). I do need to find time to have a proper look and comment, but it is not forgotten, and how Particle sits/evolves vis-à-vis the new pdg package is indeed a relevant question.
✅ --> scikit-hep/particle#513
One thing that fits into that box is simulation-based inference à la e.g. MadMiner, or various anomaly detection methods.
Can't attend in person sadly, but would love to be involved in any discussions here if possible (timezones permitting)!
Hi All: I've been working on GooFit (https://github.com/GooFit/GooFit) for a decade now. One of its primary goals is doing time-dependent amplitude analyses with large data sets (think hundreds of thousands of events to millions). While all the underlying code is C++, the package has Python bindings for most methods. In addition, the (Python) DecayLanguage package that lives in Scikit-HEP (https://github.com/scikit-hep/decaylanguage) produces CUDA code for GooFit from AmpGen decay descriptor files (https://github.com/GooFit/AmpGen).

GooFit sounds like RooFit, and its user interface mimics that of RooFit in many ways. It runs on NVIDIA GPUs, under OpenMP on x86 servers, and on single CPUs (the last is useful for debugging). While GooFit has been used primarily for amplitude analyses, it can also be used effectively for coverage tests, fitting simple one-dimensional functions, etc.

I am very interested in using AD within GooFit. From preliminary discussions with experts, GooFit's architecture should allow us to use/adapt Clad (https://compiler-research.org/clad/) in a fairly straightforward way. At the end of the day, we would like to make most of GooFit's functionality available through Python interfaces that do not require developing new C++ code. It will be very interesting to see what a possible user community wants to do.
I'm very interested in a JAX-based statistical inference package, for both binned and unbinned fits.
In my experience attempting a JAX port of the CMS Higgs combination, I found that the many un-vectorized parameters we have become a debilitating JIT-compilation bottleneck in JAX. But this situation may have changed since I last checked, in 2021.
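For illustration, a minimal sketch of what "un-vectorized parameters" means for JIT compilation (toy model, hypothetical names throughout): with one scalar leaf per nuisance parameter, the traced computation graph grows with the parameter count, whereas a single array leaf keeps it constant regardless of how many parameters there are.

```python
import jax
import jax.numpy as jnp

n_params = 500

# Un-vectorized style: one scalar PyTree leaf per nuisance parameter.
# The traced graph contains one subtree per leaf, so JIT compilation
# time grows with n_params (the bottleneck described above).
scalar_params = {f"np_{i}": jnp.array(0.0) for i in range(n_params)}

def nll_scalar(params):
    # Toy constraint term: sum of independent unit-Gaussian penalties.
    return sum(0.5 * p**2 for p in params.values())

# Vectorized style: a single array leaf. The traced graph has a fixed
# number of ops, independent of n_params.
vector_params = jnp.zeros(n_params)

def nll_vector(params):
    return 0.5 * jnp.sum(params**2)

g = jax.jit(jax.grad(nll_vector))(vector_params)
```

Both functions compute the same value; only the tracing cost differs, which is why stacking nuisance parameters into arrays matters at CMS-combination scale.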
@nsmith- Is this better scoped as a statistical modelling package, where one would find the appropriate abstraction that fits both binned/unbinned paradigms? Inference would just be extra layers on minimization, which I've already abstracted in
@phinate yes! I guess your
Oh, I suppose so, in a not-well-tested kind of way :) Just asymptotic calcs though, and it probably needs a quick going-through to be truly agnostic to the model representation, but it is just a thin wrapper around jaxopt with HEP-like quantities/semantics! Would be happy to build this out more to support whatever model abstraction we can come up with!
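As a rough illustration of the "thin wrapper around a JAX minimizer" idea (using JAX's built-in `jax.scipy.optimize.minimize` here rather than jaxopt, purely to keep the sketch self-contained and dependency-free; the toy NLL is made up):

```python
import jax.numpy as jnp
from jax.scipy.optimize import minimize

# Toy data and a Gaussian NLL with fixed unit width: the MLE of mu is the mean.
data = jnp.array([0.9, 1.1, 1.3, 0.7])

def nll(params):
    mu = params[0]
    return jnp.sum(0.5 * (data - mu) ** 2)

# Gradient-based minimization; the gradient comes from AD automatically.
result = minimize(nll, jnp.array([0.0]), method="BFGS")
mu_hat = result.x[0]
```

Because the whole pipeline is JAX, the fit itself is differentiable and jit-able, which is what makes "inference as extra layers on minimization" attractive.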
Hi everyone,

However, given the advancements in today's computational capabilities, I believe it might be beneficial to explore alternative approaches. For instance, we could consider precomputing the integrals and devising an efficient method for accessing these values as necessary. Another potential strategy could be experimenting with a chi-squared (Chi2) fit with reduced granularity.

Beyond these technical aspects, there's another issue I've been considering: the generalization and user-level accessibility of fitting tools. It often feels like we lack a consistent standard across fitting tools. For instance, finding a tool that effectively handles both B and D decays can be challenging. Similarly, analyzing decays of more than three bodies can become complex, often requiring custom or adapted code that can be hard to decipher. We need to address the readability of these codes and work towards creating user-level code that interfaces with the base code.

Again, I bring up GooFit as an example: it does a great job of shielding the user from the intricacies of CUDA code to perform an analysis. Despite this, I find that there's room for improvement in the user experience, and I believe it would be fruitful for us to discuss these issues during the workshop.
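A minimal sketch of the integral-precomputation idea (the intensity function, grids, and names are purely illustrative): tabulate the normalization integral once over a grid of shape-parameter values, then interpolate during the fit instead of re-integrating at every likelihood call.

```python
import numpy as np

def intensity(x, g):
    # Hypothetical 1-D intensity with a single shape parameter g.
    return (1.0 + g * x) ** 2

def trapezoid(y, x):
    # Simple composite trapezoidal rule.
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

x_grid = np.linspace(-1.0, 1.0, 2001)   # phase-space grid
g_grid = np.linspace(0.0, 2.0, 201)     # shape-parameter grid

# Expensive step, done ONCE before the fit: normalization integral per g.
norm_table = np.array([trapezoid(intensity(x_grid, g), x_grid) for g in g_grid])

def norm(g):
    # Cheap lookup used inside the likelihood: interpolate the table.
    return float(np.interp(g, g_grid, norm_table))
```

For this toy intensity the integral is analytically 2 + (2/3) g², so the table can be checked directly; in a real amplitude model the cached integrals would replace numerical integration in the inner fit loop.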
I fully agree! Would it be an idea to organise a dedicated session on amplitude analysis (UX and documentation specifically)? If so, who would be interested? @JMolinaHN @mdsokoloff @jonas-eschle?
@redeboer of course, a discussion on amplitude analysis would be more than interesting! (In view of the latest results, I think we need it.) From my point of view, I refuse to think that a likelihood analysis can't be done in some decays like Dpipipi or Dkpipi. We all know those decays are challenging because of the pipi (in general, pp), but in some sense we should be adequate (sensitive) to problems like that.
+1 |
+1 |
+1 |
This is basically what zfit already solves: it combines binned and unbinned (and mixed) fits. I think it's crucially more than relaxed, which (afaiu) allows using histogram templates as an unbinned PDF, but there is more to it: analytic shapes, numerical integration & sampling methods, arbitrary correlations, etc. I also agree with the others on the three main topics that I see:
In this regard, zfit and RooFit are alone at the moment. What I would like to understand is how their representations of mixed binned/unbinned data compare and contrast. As an aside, Combine can also produce unbinned "pseudo-Asimov" datasets to take advantage of asymptotic methods. Is this something done elsewhere? (I am just ignorant here.)
Curious about this!
@nsmith- I'm curious to learn more about this. Is this in the docs?
There is a brief discussion here: http://cms-analysis.github.io/HiggsAnalysis-CombinedLimit/part3/runningthetool/#asimov-datasets
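For the binned case, the Asimov construction is simple enough to sketch (toy numbers throughout; the unbinned "pseudo-Asimov" variant Combine implements is more involved and not shown here): the "observed" counts are set exactly to the model expectation, so the likelihood is maximized at the assumed parameters by construction.

```python
import numpy as np

# Toy binned model: expected counts per bin are mu * signal + background.
signal = np.array([5.0, 10.0, 3.0])
background = np.array([50.0, 52.0, 49.0])

def expected(mu):
    return mu * signal + background

def nll(mu, observed):
    # Poisson negative log-likelihood, dropping mu-independent terms.
    lam = expected(mu)
    return float(np.sum(lam - observed * np.log(lam)))

# Binned Asimov dataset for the background-only hypothesis (mu = 0):
# set "observed" to the expectation, so the fit recovers mu = 0 exactly.
asimov = expected(0.0)
```

Evaluating test statistics on this single representative dataset replaces ensembles of toys in the asymptotic formulae, which is the point of the Asimov trick.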
This topic seems perhaps too broad, and while I expect that during the week it will organically split out across different areas, the areas I think I'm most likely to spend time discussing are:
Extracting statistical results from data with systematics, correlations, etc. at large scale.
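As a small illustration of what "systematics and correlations" means in practice (toy numbers throughout, not from any real analysis): a chi-square built from a full covariance matrix including a fully correlated systematic, and the resulting generalized-least-squares combination of several measurements of one quantity.

```python
import numpy as np

# Three measurements of the same quantity.
meas = np.array([1.02, 0.98, 1.05])

# Statistical covariance (uncorrelated) plus a fully correlated systematic,
# modelled as a rank-1 outer product of per-measurement shifts.
stat = np.diag([0.02, 0.03, 0.02]) ** 2
sys_shift = np.array([0.01, 0.01, 0.02])
cov = stat + np.outer(sys_shift, sys_shift)

def chi2(theory):
    # Correlated chi-square: r^T C^{-1} r.
    r = meas - theory
    return float(r @ np.linalg.solve(cov, r))

# Generalized least squares: the best combined value uses
# inverse-covariance weights, not naive 1/sigma^2 weights.
w = np.linalg.solve(cov, np.ones(3))
best = float(w @ meas) / float(w.sum())
```

At large scale the same structure appears with thousands of bins and nuisance parameters, which is where the tooling discussed above becomes essential.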