39 commits
1987d0d
feat: Implement VannAbers calibration method for both binary and mult…
dimoibiehg Oct 17, 2025
a9ab37a
fix: type check for self.n_classes_ and self.classes_ in predict func…
dimoibiehg Oct 27, 2025
fabf0a9
fix: add two local variables to resolve type-checking issue for self.…
dimoibiehg Oct 27, 2025
97b55d6
fix: type checking for classes_ in fit function
dimoibiehg Oct 27, 2025
bb8ce97
fix: coverage for a new case of None classes_ in fit function
dimoibiehg Oct 27, 2025
bdb9f0e
Fix: misinformation in calib_size doctring
dimoibiehg Oct 28, 2025
469fb52
fix: improve doc string of calib_size
dimoibiehg Oct 28, 2025
5ffddbd
refactor: remove unnecessary cal_size in the calibrator
dimoibiehg Oct 28, 2025
5466ab6
refactor: apply black reformatter on new files
dimoibiehg Oct 28, 2025
90d4f95
Merge branch 'master' into master
dimoibiehg Oct 28, 2025
bf8c5ac
refactor: reformat some lines (to resolve some "make format" error)
dimoibiehg Oct 28, 2025
5fc834f
refactor: reformat multiple lines to pass "make format" command
dimoibiehg Oct 28, 2025
ef9b8e6
refactor: remove unnecessary import in the middile of the function
dimoibiehg Oct 28, 2025
4d93347
fix: a typo in doc string
dimoibiehg Oct 28, 2025
c8fcece
fix: a typo in doc string
dimoibiehg Oct 28, 2025
2a3956c
fix: a typo in doc string
dimoibiehg Oct 28, 2025
80d2405
fix: a typo in the doc string
dimoibiehg Oct 28, 2025
35f384a
fix: dimension of p_cal was mentioned wrongly in the doc string, in V…
dimoibiehg Oct 29, 2025
ce8d4ca
refactor: change Exceptions to ValueError when makes sense
dimoibiehg Oct 29, 2025
3c64f79
fix: remove setting global config of sklearn
dimoibiehg Oct 29, 2025
8b91f47
fix: handling any types of labels for classes other than indices star…
dimoibiehg Oct 30, 2025
772c79b
fix: format errors
dimoibiehg Oct 30, 2025
4dad2f6
fix: format error
dimoibiehg Oct 30, 2025
5a97e45
fix: format error (it was proposed by "black" formatter command)
dimoibiehg Oct 30, 2025
76d2755
fix: cover cv_ensemble config in the calibrator (missed from the base…
dimoibiehg Nov 13, 2025
793c06f
Merge branch 'master' into master
dimoibiehg Nov 13, 2025
fb8e473
refactor: resolve formatting issues
dimoibiehg Nov 13, 2025
009c629
Merge branch 'master' into master
dimoibiehg Nov 25, 2025
c45f071
refactor: minimize tests for VennAbersCalibrator and merge it into a …
dimoibiehg Dec 8, 2025
5e1d3f9
Merge branch 'master' into master
dimoibiehg Dec 8, 2025
380403b
refactor: improve formatting of the test_calibration.py
dimoibiehg Dec 8, 2025
b888343
fix: a bug in the usage of check_is_fitted and accordingly in `make t…
dimoibiehg Dec 10, 2025
790b7fb
fix: a minor issue in the format of the code
dimoibiehg Dec 10, 2025
9706e42
feat: implemented two examples of the Venn-Abers usage in binrary and…
dimoibiehg Dec 10, 2025
308162a
fix: a format problem in new examples
dimoibiehg Dec 10, 2025
c58008b
fix: resolve an error in making docs
dimoibiehg Dec 10, 2025
da933b7
Merge branch 'master' into master
allglc Dec 11, 2025
388794c
fix: improve examples to have more meaningful usage of the calibration.
dimoibiehg Dec 12, 2025
8f88772
refactor: improve code format of the examples
dimoibiehg Dec 12, 2025
1 change: 1 addition & 0 deletions AUTHORS.rst
@@ -54,6 +54,7 @@ Contributors
* Faustin Pulvéric <[email protected]>
* Chaoqi Zhang <[email protected]>
* Leena Kamran Qidwai
* Omid Gheibi <[email protected]>
* Aman Vishnoi <[email protected]>
* Hannes Körner <HannesMK>
To be continued ...
2 changes: 2 additions & 0 deletions HISTORY.rst
@@ -4,6 +4,8 @@ History

1.x.x (2025-xx-xx)
------------------
* Introduce VennAbers calibrator for both binary and multiclass classification

* Remove dependency of internal classes on sklearn's check_is_fitted
* Add an example of risk control with LLM as a judge
* Add comparison with naive threshold in risk control quick start example
@@ -0,0 +1,98 @@
"""
=================================================
Calibrating a binary classifier with Venn-ABERS
=================================================
This example shows how to calibrate a binary classifier with
:class:`~mapie.calibration.VennAbersCalibrator` and visualize the
impact on predicted probabilities.

We compare an uncalibrated model to its Venn-ABERS calibrated version
using reliability diagrams and Brier scores.
"""

from __future__ import annotations

import matplotlib.pyplot as plt
from sklearn.calibration import CalibrationDisplay
from sklearn.datasets import make_classification
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split

from mapie.calibration import VennAbersCalibrator

####################################################################
# 1. Build a miscalibrated binary classifier
# ---------------------------------------------------
# We generate a toy binary dataset and fit a random forest model,
# which is often miscalibrated out of the box (averaging over many
# trees tends to pull predicted probabilities away from 0 and 1).
# We use a fairly large dataset so that enough samples remain for
# the calibration set.

from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(
    n_samples=5000,
    n_features=20,
    n_informative=10,
    n_redundant=2,
    class_sep=0.8,
    random_state=42,
)

# Split into train, calibration, and test sets
X_temp, X_test, y_temp, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

X_train, X_calib, y_train, y_calib = train_test_split(
    X_temp, y_temp, test_size=0.3, random_state=42, stratify=y_temp
)

# Use Random Forest which tends to be miscalibrated
base_model = RandomForestClassifier(n_estimators=100, max_depth=10, random_state=42)
base_model.fit(X_train, y_train)
probs_raw = base_model.predict_proba(X_test)[:, 1]
raw_brier = brier_score_loss(y_test, probs_raw)

####################################################################
# 2. Calibrate with Venn-ABERS
# ----------------------------
# We wrap the same base model in :class:`~mapie.calibration.VennAbersCalibrator`
# using the inductive mode (default). The calibrator uses the calibration set
# to learn a calibration mapping that will improve probability estimates.

va_calibrator = VennAbersCalibrator(
    estimator=RandomForestClassifier(n_estimators=100, max_depth=10, random_state=42),
    inductive=True,
    random_state=42,
)
va_calibrator.fit(X_train, y_train, X_calib=X_calib, y_calib=y_calib)
probs_va = va_calibrator.predict_proba(X_test)[:, 1]
va_brier = brier_score_loss(y_test, probs_va)

####################################################################
# 3. Reliability diagrams and Brier scores
# ----------------------------------------
# Reliability diagrams show how predicted probabilities compare to
# observed frequencies. Perfect calibration lies on the diagonal.
# We also display Brier scores to quantify the improvement.

fig, axes = plt.subplots(1, 2, figsize=(12, 5))
CalibrationDisplay.from_predictions(
    y_test,
    probs_raw,
    name=f"Uncalibrated (Brier={raw_brier:.3f})",
    n_bins=10,
    ax=axes[0],
)
CalibrationDisplay.from_predictions(
    y_test,
    probs_va,
    name=f"Venn-ABERS (Brier={va_brier:.3f})",
    n_bins=10,
    ax=axes[1],
)
axes[0].set_title("Before calibration")
axes[1].set_title("After Venn-ABERS calibration")
plt.tight_layout()
plt.show()
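
The two diagnostics used in the example above can be sketched in plain Python to show what they compute. This is a minimal illustration, not the scikit-learn or MAPIE implementation: the helper names `brier_score` and `reliability_bins` are hypothetical, the bins are uniform-width, and the handling of values that fall exactly on a bin edge is simplified.

```python
from __future__ import annotations


def brier_score(y_true: list[int], proba: list[float]) -> float:
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, proba)) / len(y_true)


def reliability_bins(
    y_true: list[int], proba: list[float], n_bins: int = 10
) -> list[tuple[float, float]]:
    """Return (mean predicted probability, observed frequency) per non-empty bin."""
    # One accumulator per bin: [sum of probabilities, sum of labels, count]
    acc = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for y, p in zip(y_true, proba):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        acc[idx][0] += p
        acc[idx][1] += y
        acc[idx][2] += 1
    return [(sp / n, sy / n) for sp, sy, n in acc if n > 0]


# Toy data (made up for illustration only)
y_true = [0, 0, 1, 1]
proba = [0.1, 0.4, 0.6, 0.9]
score = brier_score(y_true, proba)
bins = reliability_bins(y_true, proba, n_bins=2)
```

Each pair in `bins` is one point of a reliability diagram: a well-calibrated model has points close to the diagonal, where mean predicted probability equals observed frequency.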
@@ -0,0 +1,125 @@
"""
====================================================
Calibrating a multi-class classifier with Venn-ABERS
====================================================
This example shows how to calibrate a multi-class classifier with
:class:`~mapie.calibration.VennAbersCalibrator` and visualize the
impact on predicted probabilities. We compare an uncalibrated model
against its Venn-ABERS calibrated version using reliability diagrams
and multi-class Brier scores.
"""

from __future__ import annotations

import matplotlib.pyplot as plt
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

from mapie.calibration import VennAbersCalibrator

####################################################################
# 1. Build a miscalibrated multi-class classifier
# -----------------------------------------------
# We generate a 3-class dataset and fit a random forest model,
# which is known to be miscalibrated out of the box.

from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(
    n_samples=5000,
    n_features=20,
    n_informative=12,
    n_redundant=2,
    n_classes=3,
    n_clusters_per_class=1,
    class_sep=0.8,
    random_state=7,
)

classes = np.unique(y)
# Split into train, calibration, and test sets
X_temp, X_test, y_temp, y_test = train_test_split(
    X, y, test_size=0.3, random_state=7, stratify=y
)

X_train, X_calib, y_train, y_calib = train_test_split(
    X_temp, y_temp, test_size=0.3, random_state=7, stratify=y_temp
)

base_model = RandomForestClassifier(n_estimators=100, max_depth=10, random_state=7)
base_model.fit(X_train, y_train)
probs_raw = base_model.predict_proba(X_test)

####################################################################
# 2. Calibrate with Venn-ABERS
# ----------------------------
# The calibrator refits the base model internally and learns a mapping
# from the held-out calibration set. Venn-ABERS natively supports
# multi-class problems.

va_calibrator = VennAbersCalibrator(
    estimator=RandomForestClassifier(n_estimators=100, max_depth=10, random_state=7),
    inductive=True,
    random_state=7,
)
va_calibrator.fit(X_train, y_train, X_calib=X_calib, y_calib=y_calib)
probs_va = va_calibrator.predict_proba(X_test)

####################################################################
# 3. Multi-class Brier score helper
# ---------------------------------
# We compute the squared error between predicted probabilities and
# one-hot encoded labels, summed over classes and averaged over samples.


def multiclass_brier(y_true: np.ndarray, proba: np.ndarray) -> float:
    # One-hot encode the labels using the class order found in the data
    y_onehot = label_binarize(y_true, classes=classes)
    return float(np.mean(np.sum((y_onehot - proba) ** 2, axis=1)))


brier_raw = multiclass_brier(y_test, probs_raw)
brier_va = multiclass_brier(y_test, probs_va)

####################################################################
# 4. Reliability diagrams and Brier scores
# ----------------------------------------
# We plot one-vs-rest reliability curves for each class before and after
# calibration. A lower Brier score indicates better probability estimates.

fig, axes = plt.subplots(1, 2, figsize=(12, 5))
for cls in classes:
    y_true_cls = (y_test == cls).astype(int)
    prob_raw_cls = probs_raw[:, cls]
    prob_va_cls = probs_va[:, cls]

    frac_pos_raw, mean_pred_raw = calibration_curve(
        y_true_cls, prob_raw_cls, n_bins=10, strategy="uniform"
    )
    frac_pos_va, mean_pred_va = calibration_curve(
        y_true_cls, prob_va_cls, n_bins=10, strategy="uniform"
    )

    axes[0].plot(mean_pred_raw, frac_pos_raw, marker="o", label=f"class {cls}")
    axes[1].plot(mean_pred_va, frac_pos_va, marker="o", label=f"class {cls}")

for ax, title in zip(
    axes,
    [
        f"Before calibration (Brier={brier_raw:.3f})",
        f"After Venn-ABERS (Brier={brier_va:.3f})",
    ],
):
    ax.plot([0, 1], [0, 1], "k--", linewidth=1)
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
    ax.set_xlabel("Mean predicted probability")
    ax.set_ylabel("Fraction of positives")
    ax.set_title(title)
    ax.grid(True)
    ax.legend()

plt.tight_layout()
plt.show()
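
The example above indexes probability columns directly with the class label (`probs_raw[:, cls]`), which works because the generated labels are 0, 1 and 2. A minimal plain-Python sketch of the same multi-class Brier score for arbitrary labels follows. It assumes the probability columns follow the order of a `classes` sequence, as with scikit-learn's `classes_` attribute. The function below is a hypothetical stand-alone variant, not the helper from the example and not part of MAPIE.

```python
from __future__ import annotations

from typing import Hashable, Sequence


def multiclass_brier(
    y_true: Sequence[Hashable],
    proba: Sequence[Sequence[float]],
    classes: Sequence[Hashable],
) -> float:
    """Squared error against one-hot labels, summed over classes, averaged over samples."""
    # Map each label to its probability column instead of using the label as an index
    column = {label: j for j, label in enumerate(classes)}
    total = 0.0
    for label, row in zip(y_true, proba):
        target = column[label]
        total += sum(
            (p - (1.0 if j == target else 0.0)) ** 2 for j, p in enumerate(row)
        )
    return total / len(y_true)


# Toy data with string labels (made up for illustration only)
classes = ["bird", "cat", "dog"]
y_true = ["cat", "dog"]
proba = [[0.2, 0.7, 0.1], [0.1, 0.1, 0.8]]
score = multiclass_brier(y_true, proba, classes)
```

With this definition the score ranges from 0 (perfect) to 2 (all probability mass on a wrong class), because the squared errors are summed over classes rather than averaged.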