.. _sphx_error_by_shift:

=====================================
Robustness to prior-probability shift
=====================================

The diagonal plot in :ref:`sphx_method_comparison` shows *where* a quantifier
errs; this example collapses that into a single, comparable curve:
quantification **error as a function of the amount of prior-probability
shift** between the test sample and the training set. A flat, low curve is the
goal: it means the method is insensitive to how far the test prevalence has
drifted.

We use :class:`~mlquantify.visualization.ErrorByShiftDisplay`, which bins the
protocol samples by their shift and draws the mean error with a ``±std`` band.

.. plot::

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    from mlquantify.counting import CC, ACC
    from mlquantify.likelihood import EMQ
    from mlquantify.model_selection import apply_protocol
    from mlquantify.visualization import ErrorByShiftDisplay

    X, y = make_classification(
        n_samples=4000, n_features=20, weights=[0.5, 0.5], random_state=0,
    )
    _, counts = np.unique(y, return_counts=True)
    train_prevalence = counts / counts.sum()

    methods = {
        "CC": (CC(LogisticRegression(max_iter=1000)), "#e76f51"),
        "ACC": (ACC(LogisticRegression(max_iter=1000)), "#2a9d8f"),
        "EMQ": (EMQ(LogisticRegression(max_iter=1000)), "#264653"),
    }

    fig, ax = plt.subplots(figsize=(7, 4.5))
    for name, (q, color) in methods.items():
        results = apply_protocol(
            q, X, y, protocol="upp", n_prevalences=300, batch_size=100,
            random_state=0,
        )
        ErrorByShiftDisplay.from_predictions(
            results["true_prevalences"],
            results["predicted_prevalences"],
            train_prevalence=train_prevalence,
            error_metric="ae",
            n_bins=10,
            name=name,
            ax=ax,
            color=color,
        )
    ax.set_title("Absolute error vs. prior-probability shift")
    fig.tight_layout()

CC's error grows steadily as the shift increases (exactly the bias from
:ref:`sphx_cc_under_shift`, now quantified), while ACC and EMQ stay low and
flat across the whole range. This is the plot to reach for when you need to
*defend* a method choice: it summarises hundreds of test samples into one
honest picture of robustness.

.. seealso::

   - :ref:`sphx_method_comparison`, the per-sample scatter behind these curves.
   - :class:`~mlquantify.visualization.ErrorByShiftDisplay`, options and metrics.
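
For readers who want to see what "binning the protocol samples by their shift"
amounts to, the idea can be sketched in plain NumPy. This is only an
illustration, not the library's implementation: the helper name
``error_by_shift`` is hypothetical, and it assumes that both the shift and the
error are measured as the mean absolute difference across classes and that the
bins are equal-width. :class:`~mlquantify.visualization.ErrorByShiftDisplay`
may define any of these differently.

```python
import numpy as np


def error_by_shift(true_prev, pred_prev, train_prev, n_bins=10):
    """Bin samples by shift; return bin centers, mean error and std per bin.

    Assumption: shift and error are both mean absolute differences across
    classes, and bins are equal-width. The library may differ on either point.
    """
    true_prev = np.asarray(true_prev, dtype=float)
    pred_prev = np.asarray(pred_prev, dtype=float)
    train_prev = np.asarray(train_prev, dtype=float)

    # One shift value and one error value per test sample.
    shift = np.abs(true_prev - train_prev).mean(axis=1)
    error = np.abs(true_prev - pred_prev).mean(axis=1)

    edges = np.linspace(shift.min(), shift.max(), n_bins + 1)
    # Use interior edges only, so the largest shift lands in the last bin.
    bin_index = np.digitize(shift, edges[1:-1])

    centers = 0.5 * (edges[:-1] + edges[1:])
    mean = np.full(n_bins, np.nan)  # NaN marks bins that received no sample
    std = np.full(n_bins, np.nan)
    for b in range(n_bins):
        in_bin = bin_index == b
        if in_bin.any():
            mean[b] = error[in_bin].mean()
            std[b] = error[in_bin].std()
    return centers, mean, std


# Tiny hand-made example (illustrative values, not results from the plot above).
true_prev = np.array([[0.5, 0.5], [0.6, 0.4], [0.8, 0.2], [0.9, 0.1]])
pred_prev = np.array([[0.5, 0.5], [0.58, 0.42], [0.7, 0.3], [0.75, 0.25]])
centers, mean, std = error_by_shift(true_prev, pred_prev, [0.5, 0.5], n_bins=2)
```

Plotting ``mean`` against ``centers`` and shading ``mean ± std`` gives a curve
of the same kind as the one drawn above; a method whose ``mean`` rises with
the bin index is sensitive to shift.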