9. Calibration

Well-calibrated posterior probabilities improve probabilistic quantifiers such as EMQ. The mlquantify.calibration subpackage provides post-hoc calibration of classifier posteriors through the scaling family of methods.

9.1. Classifier calibration

ClassifierCalibrator rescales a classifier’s logits to minimise the negative log-likelihood of a held-out set. Four methods are available via the method argument:

method    Description
'ts'      Temperature Scaling – one shared temperature T (Guo et al., 2017).
'bcts'    Bias-Corrected Temperature Scaling – T plus per-class biases (Alexandari, Kundaje & Shrikumar, 2020). A strong default.
'vs'      Vector Scaling – per-class weights and biases (Guo et al., 2017).
'nbvs'    No-Bias Vector Scaling – per-class weights only (Alexandari et al., 2020).
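The four methods differ only in which parameters of one shared transformation are free. The sketch below writes that transformation in NumPy and fits the simplest case, temperature scaling, by a grid search on held-out negative log-likelihood. It is an illustration under stated assumptions, not mlquantify's implementation: the helper names (softmax, scale, nll, fit_temperature) are hypothetical, the data are synthetic, and a real fit would use a proper optimiser rather than a grid.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def scale(logits, weights=1.0, bias=0.0):
    """Shared form of the scaling family: softmax(weights * logits + bias).

    ts:   weights = 1/T (scalar),      bias = 0
    bcts: weights = 1/T (scalar),      bias = per-class vector
    vs:   weights = per-class vector,  bias = per-class vector
    nbvs: weights = per-class vector,  bias = 0
    """
    return softmax(logits * weights + bias)

def nll(probs, y):
    """Mean negative log-likelihood of integer labels y."""
    return -np.mean(np.log(probs[np.arange(len(y)), y] + 1e-12))

def fit_temperature(logits, y, grid=None):
    """Pick T by grid search on held-out NLL (stand-in for a real optimiser)."""
    grid = np.logspace(-1, 1, 201) if grid is None else grid
    losses = [nll(scale(logits, weights=1.0 / T), y) for T in grid]
    return float(grid[int(np.argmin(losses))])

# Synthetic held-out set: labels follow softmax(z), but the "classifier"
# reports logits 3 * z, so it is overconfident by a factor of 3.
rng = np.random.default_rng(0)
z = rng.normal(size=(4000, 3))
cdf = np.cumsum(softmax(z), axis=1)
y = np.minimum((rng.random((4000, 1)) > cdf).sum(axis=1), 2)
overconfident = 3.0 * z

T_hat = fit_temperature(overconfident, y)
nll_before = nll(softmax(overconfident), y)
nll_after = nll(scale(overconfident, weights=1.0 / T_hat), y)
print(f"T = {T_hat:.2f}, NLL {nll_before:.3f} -> {nll_after:.3f}")
```

Because the synthetic logits are inflated by a known factor, the fitted temperature should land near that factor, and the held-out NLL should drop after rescaling.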

The calibrator exposes a scikit-learn-style fit / predict interface, with fit taking (y_true, y_pred) rather than (X, y). y_pred may be probabilities (input_type='proba', the default) or raw logits (input_type='logits'); predict always returns calibrated probabilities.

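from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from mlquantify.calibration import ClassifierCalibrator

# Synthetic data for illustration only; substitute your own X, y.
X, y = make_classification(n_samples=2000, n_informative=5, n_classes=3, random_state=0)
X_dev, X_test, y_dev, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_tr, X_cal, y_tr, y_cal = train_test_split(X_dev, y_dev, test_size=0.3, random_state=0)
clf = LogisticRegression().fit(X_tr, y_tr)

# Fit the calibrator on held-out predictions, never on the training set.
cal = ClassifierCalibrator(method="bcts").fit(y_cal, clf.predict_proba(X_cal))
calibrated = cal.predict(clf.predict_proba(X_test))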
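Logit-scaling methods can accept probabilities because log-probabilities behave as logits up to a per-row constant. The short NumPy sketch below demonstrates that identity. How mlquantify converts input_type='proba' internally is not stated here, so treat the clipping constant and the conversion itself as assumptions.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

proba = np.array([[0.70, 0.20, 0.10],
                  [0.05, 0.05, 0.90]])

# Clip before taking the log so exact zeros do not become -inf.
pseudo_logits = np.log(np.clip(proba, 1e-12, 1.0))

# Round trip: softmax(log p) recovers p, so log p can stand in for logits.
recovered = softmax(pseudo_logits)

# Dividing the pseudo-logits by a temperature T > 1 softens each row.
softened = softmax(pseudo_logits / 2.0)
```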

Note

Calibration must be fit on predictions held out from classifier training (a validation split or cross-validated predictions); fitting it on the training predictions under-estimates the miscalibration.
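When data are too scarce to set aside a separate calibration split, cross-validated predictions are one way to obtain held-out posteriors for every training example. The sketch below uses scikit-learn's cross_val_predict on synthetic data; the dataset and variable names are illustrative, and passing the result to ClassifierCalibrator.fit is indicated only in a comment.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

# Synthetic three-class problem, for illustration only.
X, y = make_classification(n_samples=600, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)

clf = LogisticRegression(max_iter=1000)

# Each row is predicted by a model that never saw that row during training.
oof_proba = cross_val_predict(clf, X, y, cv=5, method="predict_proba")

# The final classifier is then refit on all the data. The out-of-fold
# posteriors (oof_proba) and labels (y) are what a calibrator would be fit on,
# e.g. ClassifierCalibrator(method="bcts").fit(y, oof_proba).
clf.fit(X, y)
```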

9.2. Use with EMQ

EMQ can apply calibration internally before the EM loop – pass calib_function='bcts' (or 'ts' / 'vs' / 'nbvs'):

from mlquantify.likelihood import EMQ
from sklearn.linear_model import LogisticRegression

# X_train, y_train: labelled training data; X_test: the unlabelled sample to quantify.
emq = EMQ(LogisticRegression(), calib_function="bcts").fit(X_train, y_train)
prevalence = emq.predict(X_test)
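To see why calibrated posteriors matter here, it helps to look at what an EM loop does with them. The sketch below is a minimal NumPy version of the standard EM prior-update iteration. It is not mlquantify's EMQ code, and the function name em_prevalence, the tolerance, and the synthetic data are assumptions made for illustration. Every step reweights the posteriors directly, so any miscalibration feeds straight into the prevalence estimate.

```python
import numpy as np

def em_prevalence(posteriors, train_prev, tol=1e-8, max_iter=1000):
    """Estimate test-set class prevalence from classifier posteriors via EM."""
    train_prev = np.asarray(train_prev, dtype=float)
    prev = train_prev.copy()
    for _ in range(max_iter):
        # E-step: reweight posteriors by the ratio of current to training priors.
        weighted = posteriors * (prev / train_prev)
        adjusted = weighted / weighted.sum(axis=1, keepdims=True)
        # M-step: the new prevalence is the mean adjusted posterior.
        new_prev = adjusted.mean(axis=0)
        converged = np.abs(new_prev - prev).max() < tol
        prev = new_prev
        if converged:
            break
    return prev, adjusted

# Synthetic binary test sample with 80% / 20% prevalence.
rng = np.random.default_rng(0)
n = 5000
true_prev = np.array([0.8, 0.2])
labels = (rng.random(n) < true_prev[1]).astype(int)
x = rng.normal(loc=np.where(labels == 1, 1.0, -1.0), scale=1.0)

# Perfectly calibrated posteriors under balanced training priors:
# for N(-1, 1) vs N(+1, 1) with equal priors, P(y=1 | x) = sigmoid(2x).
train_prev = np.array([0.5, 0.5])
p1 = 1.0 / (1.0 + np.exp(-2.0 * x))
posteriors = np.column_stack([1.0 - p1, p1])

est_prev, adjusted = em_prevalence(posteriors, train_prev)
print(est_prev)
```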

9.3. Quantifier calibration

QuantifierCalibrator is reserved for post-hoc calibration of quantifier outputs and is not implemented yet (its methods raise NotImplementedError). Use ClassifierCalibrator to calibrate the posteriors a quantifier consumes.

9.4. Custom calibrators

Subclass Calibrator to implement your own strategy:

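from mlquantify.calibration import Calibrator

class MyCalibrator(Calibrator):
    def fit(self, y_true, y_pred):
        # Learn calibration parameters from held-out labels and predictions here.
        return self

    def predict(self, y_pred):
        # Identity placeholder: replace with the learned transformation.
        return y_pred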
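As a more concrete illustration, the sketch below fills in fit and predict with a bias-only scaling rule: one additive bias per class on the log-probabilities, fitted by gradient descent on the held-out negative log-likelihood. The class BiasOnlyCalibrator, its lr and n_iter parameters, and the synthetic data are hypothetical. It is written standalone so that it runs without mlquantify; in practice it would subclass Calibrator as in the skeleton above, which may impose requirements not shown here.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

class BiasOnlyCalibrator:
    """Hypothetical example: one learned bias per class on log-probabilities."""

    def __init__(self, lr=1.0, n_iter=500):
        self.lr = lr
        self.n_iter = n_iter

    def fit(self, y_true, y_pred):
        y_true = np.asarray(y_true)
        log_p = np.log(np.clip(y_pred, 1e-12, 1.0))
        n_classes = log_p.shape[1]
        freq = np.bincount(y_true, minlength=n_classes) / len(y_true)
        self.bias_ = np.zeros(n_classes)
        for _ in range(self.n_iter):
            # Gradient of the mean NLL with respect to the biases:
            # average calibrated probability minus observed class frequency.
            grad = softmax(log_p + self.bias_).mean(axis=0) - freq
            self.bias_ -= self.lr * grad
        return self

    def predict(self, y_pred):
        log_p = np.log(np.clip(y_pred, 1e-12, 1.0))
        return softmax(log_p + self.bias_)

# Synthetic held-out predictions and labels, for illustration only.
rng = np.random.default_rng(0)
y_pred = rng.dirichlet(np.ones(3), size=2000)
y_true = rng.choice(3, size=2000, p=[0.6, 0.3, 0.1])

cal = BiasOnlyCalibrator().fit(y_true, y_pred)
calibrated = cal.predict(y_pred)
```

At the optimum the gradient vanishes, so the average calibrated probability of each class matches its frequency in the held-out labels.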

9.5. References

  • Guo, C., Pleiss, G., Sun, Y., & Weinberger, K. Q. (2017). On Calibration of Modern Neural Networks. ICML.

  • Alexandari, A., Kundaje, A., & Shrikumar, A. (2020). Maximum Likelihood with Bias-Corrected Calibration is Hard-To-Beat at Label Shift Adaptation. ICML.