.. _sphx_confidence_regions:

================================
Confidence intervals and regions
================================

A prevalence estimate is more useful with an honest measure of its uncertainty.
Bootstrapping the test sample — predicting on many resamples — turns a single
point estimate into a *distribution* of estimates, from which we can read a
confidence interval (binary) or a confidence region on the simplex
(multiclass).

Binary: a bootstrap confidence interval
---------------------------------------

We resample the test set, re-predict each time, and summarise the spread with a
percentile interval via
:func:`~mlquantify.confidence.construct_confidence_region`.

.. plot::

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    from mlquantify.likelihood import EMQ
    from mlquantify.confidence import construct_confidence_region

    X, y = make_classification(
        n_samples=4000, n_features=20, weights=[0.5, 0.5], random_state=0,
    )
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

    q = EMQ(LogisticRegression(max_iter=1000)).fit(X_tr, y_tr)

    rng = np.random.default_rng(0)
    boot = []
    for _ in range(300):
        idx = rng.choice(len(X_te), len(X_te), replace=True)
        pred = q.predict(X_te[idx])
        boot.append([pred[c] for c in q.classes_])
    boot = np.array(boot)

    region = construct_confidence_region(
        boot, confidence_level=0.95, method="intervals",
    )
    low, high = region.get_region()
    point = boot.mean(axis=0)
    true = np.array([(y_te == c).mean() for c in q.classes_])

    fig, ax = plt.subplots(figsize=(6, 4))
    xpos = np.arange(len(q.classes_))
    ax.errorbar(
        xpos, point, yerr=[point - low, high - point],
        fmt="o", capsize=6, color="#2a9d8f", label="estimate ± 95% CI",
    )
    ax.plot(xpos, true, "D", color="#e63946", label="true")
    ax.set_xticks(xpos)
    ax.set_xticklabels([f"class {c}" for c in q.classes_])
    ax.set_ylabel("Prevalence")
    ax.set_ylim(0, 1)
    ax.set_title("Bootstrap 95% confidence interval (EMQ)")
    ax.legend()
    fig.tight_layout()

Three classes: a confidence region on the simplex
--------------------------------------------------

With three classes the estimate lives on a 2-simplex, and the natural
uncertainty summary is a confidence *ellipse*.
:class:`~mlquantify.visualization.ConfidenceRegionDisplay` draws the bootstrap
cloud and its ellipse directly.

.. plot::

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    from mlquantify.likelihood import EMQ
    from mlquantify.visualization import ConfidenceRegionDisplay

    X, y = make_classification(
        n_samples=4500, n_features=20, n_informative=6,
        n_classes=3, n_clusters_per_class=1, random_state=0,
    )
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

    q = EMQ(LogisticRegression(max_iter=1000)).fit(X_tr, y_tr)

    rng = np.random.default_rng(0)
    boot = []
    for _ in range(400):
        idx = rng.choice(len(X_te), len(X_te), replace=True)
        pred = q.predict(X_te[idx])
        boot.append([pred[c] for c in q.classes_])
    boot = np.array(boot)
    true = np.array([(y_te == c).mean() for c in q.classes_])

    ConfidenceRegionDisplay.from_estimates(
        boot, confidence_level=0.95,
        class_names=["A", "B", "C"], true_prevalence=true,
        color="#1d3557", alpha=0.25,
    )

The ellipse shows both the *size* of the uncertainty (area) and its
*orientation* (which classes trade off against each other). If the true point
falls inside the 95% region, the quantifier's uncertainty estimate is
well-calibrated.

.. seealso::

   - :class:`~mlquantify.meta.AggregativeBootstrap` — bootstrap confidence
     regions without writing the resampling loop.
   - :ref:`confidence_intervals` — the methods and their assumptions.