ConfidenceInterval

class mlquantify.confidence.ConfidenceInterval(prev_estims, confidence_level=0.95)

Bootstrap confidence intervals for each class prevalence.

Constructs independent percentile-based confidence intervals for each class dimension from bootstrap samples.

A prevalence vector \(\pi\) belongs to the confidence region when the following indicator equals 1:

\[\begin{split}CI_{\alpha}(\pi) = \begin{cases} 1 & \text{if } L_i \le \pi_i \le U_i, \forall i=1,\dots,n \\ 0 & \text{otherwise} \end{cases}\end{split}\]

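where \(L_i\) and \(U_i\) are the empirical \(\alpha/2\) and \(1-\alpha/2\) quantiles of the bootstrap estimates for class \(i\), \(n\) is the number of classes, and \(\alpha\) is one minus confidence_level.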
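The percentile construction described above can be sketched with plain NumPy. This is a minimal illustration, not the library's implementation: `percentile_bounds` is a hypothetical helper, and it assumes that \(\alpha\) is one minus the confidence level and that NumPy's default linear interpolation is an acceptable way to compute the empirical quantiles.

```python
import numpy as np


def percentile_bounds(prev_estims, confidence_level=0.95):
    """Per-class percentile bounds from bootstrap prevalence estimates.

    Hypothetical helper for illustration only; not part of mlquantify.
    """
    prev_estims = np.asarray(prev_estims, dtype=float)  # shape (m, n)
    alpha = 1.0 - confidence_level  # assumed relation between alpha and the level
    # Quantiles are taken over the bootstrap axis, independently for each class.
    lower = np.quantile(prev_estims, alpha / 2, axis=0)
    upper = np.quantile(prev_estims, 1 - alpha / 2, axis=0)
    return lower, upper


rng = np.random.default_rng(0)
samples = rng.dirichlet(np.ones(3), size=200)  # 200 bootstrap draws, 3 classes
low, high = percentile_bounds(samples, confidence_level=0.9)
```

Both returned arrays have one entry per class, matching the shape (n,) documented for the bounds.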

Parameters:
prev_estims : array-like of shape (m, n)

Bootstrap prevalence estimates, one row per bootstrap replicate and one column per class.

confidence_level : float, default=0.95

Desired confidence level, given as a fraction (0.95 corresponds to a 95% interval).

Attributes:
I_low : ndarray of shape (n,)

Lower confidence bounds.

I_high : ndarray of shape (n,)

Upper confidence bounds.

References

[1] Moreo, A., & Salvati, N. (2025). An Efficient Method for Deriving Confidence Intervals in Aggregative Quantification. Section 3.3, Equation (1).

Examples

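>>> import numpy as np
>>> from mlquantify.confidence import ConfidenceInterval
>>> X = np.random.dirichlet(np.ones(3), size=200)  # 200 bootstrap estimates, 3 classes
>>> ci = ConfidenceInterval(X, confidence_level=0.9)
>>> ci.get_region()  # illustrative output; values depend on the random draw
(array([0.05, 0.06, 0.05]), array([0.48, 0.50, 0.48]))
>>> ci.contains([0.3, 0.4, 0.3])
array([[ True]])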

contains(point)

Check whether a prevalence vector falls inside the interval.

Parameters:
point : array-like of shape (n_classes,)

The prevalence vector to test.

Returns:
inside : ndarray of shape (1, 1)

True if every component of point lies within the corresponding per-class confidence interval, False otherwise.
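The membership test amounts to checking each component against its own bounds and requiring all checks to pass. The sketch below uses a hypothetical helper, `inside_box`, with the illustrative bounds from the class example; unlike the documented method it returns a plain Python bool rather than an array.

```python
import numpy as np


def inside_box(point, lower, upper):
    """Return True when every component of point lies within [lower, upper].

    Hypothetical helper for illustration only; not part of mlquantify.
    """
    point = np.asarray(point, dtype=float)
    return bool(np.all((lower <= point) & (point <= upper)))


# Illustrative bounds copied from the class example.
lower = np.array([0.05, 0.06, 0.05])
upper = np.array([0.48, 0.50, 0.48])

inside = inside_box([0.3, 0.4, 0.3], lower, upper)   # True
outside = inside_box([0.6, 0.2, 0.2], lower, upper)  # False: 0.6 exceeds 0.48
```

A single out-of-range component is enough to place the whole vector outside the region.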

get_point_estimate()

Return the mean of the bootstrap prevalence estimates.

Returns:
mean : ndarray of shape (n_classes,)

Mean prevalence across bootstrap samples.
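As a small worked example of the averaging step, the snippet below takes the column-wise mean of a hand-made array of bootstrap estimates. The values are invented for illustration. Because every row sums to 1, the mean vector also sums to 1.

```python
import numpy as np

# Three made-up bootstrap prevalence estimates for three classes.
prev_estims = np.array([
    [0.2, 0.5, 0.3],
    [0.3, 0.4, 0.3],
    [0.1, 0.6, 0.3],
])

# Average over the bootstrap axis (rows) to get one value per class.
point_estimate = prev_estims.mean(axis=0)  # approximately [0.2, 0.5, 0.3]
```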

get_region()

Return the lower and upper confidence bounds.

Returns:
I_low : ndarray of shape (n_classes,)

Lower percentile bound for each class.

I_high : ndarray of shape (n_classes,)

Upper percentile bound for each class.