EMQ
- class mlquantify.likelihood.EMQ(learner=None, tol=0.0001, max_iter=100, calib_function=None, criteria=MAE)
Expectation-Maximization Quantifier (EMQ).
Estimates class prevalences under prior probability shift by alternating between expectation (E) and maximization (M) steps on posterior probabilities.
E-step:

.. math::

    p_i^{(s+1)}(x) = \frac{\frac{q_i^{(s)}}{q_i^{(0)}} \, p_i(x)}{\sum_j \frac{q_j^{(s)}}{q_j^{(0)}} \, p_j(x)}

M-step:

.. math::

    q_i^{(s+1)} = \frac{1}{N} \sum_{n=1}^{N} p_i^{(s+1)}(x_n)

where

- \(p_i(x)\) are the posterior probabilities predicted by the classifier,
- \(q_i^{(0)}\) are the training class priors,
- \(q_i^{(s)}\) are the class prevalence estimates at iteration \(s\),
- \(N\) is the number of test instances.
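The alternation between the two steps can be sketched in NumPy. Following Saerens et al. [1], the E-step re-weights each posterior by the ratio of the current prevalence estimate to the training prior. This is a minimal standalone sketch, not mlquantify's implementation; the function name is illustrative:

```python
import numpy as np

def em_prevalence(posteriors, train_priors, tol=1e-4, max_iter=100):
    """Sketch of the EM prevalence loop (illustrative, not mlquantify's code)."""
    qs = train_priors.astype(float).copy()
    for _ in range(max_iter):
        # E-step: re-weight posteriors by current-vs-training prior ratio
        weighted = posteriors * (qs / train_priors)
        ps = weighted / weighted.sum(axis=1, keepdims=True)
        # M-step: new prevalence estimate is the mean soft membership
        qs_new = ps.mean(axis=0)
        converged = np.mean(np.abs(qs_new - qs)) < tol  # MAE-style check
        qs = qs_new
        if converged:
            break
    return qs, ps
```

On test data whose class balance differs from the training prior, the loop moves the prevalence estimate from the training prior toward the test-set proportions.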
The following calibrations can be applied to the posterior probabilities before the EM iterations:
Temperature Scaling (TS):

.. math::

    \hat{p} = \text{softmax}\left(\frac{\log(p)}{T}\right)

Bias-Corrected Temperature Scaling (BCTS):

.. math::

    \hat{p} = \text{softmax}\left(\frac{\log(p)}{T} + b\right)

Vector Scaling (VS):

.. math::

    \hat{p}_i = \text{softmax}(W_i \cdot \log(p_i) + b_i)

No-Bias Vector Scaling (NBVS):

.. math::

    \hat{p}_i = \text{softmax}(W_i \cdot \log(p_i))
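These calibrations only transform existing posteriors; the scalar \(T\) (and, for the vector variants, \(W\) and \(b\)) is assumed to have been fit beforehand on held-out data. A minimal sketch of the TS transform, with \(T\) taken as given (illustrative, not mlquantify's calibrator):

```python
import numpy as np

def temperature_scale(probs, T):
    """Temperature Scaling: softmax(log(p) / T). T > 1 softens, T < 1 sharpens."""
    logits = np.log(np.clip(probs, 1e-12, None)) / T
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)
```

With \(T = 1\) the transform is the identity; larger temperatures pull the posteriors toward the uniform distribution, which can make the EM iterations less sensitive to overconfident classifiers.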
- Parameters:
  - learner : estimator, optional
    Probabilistic classifier supporting predict_proba.
  - tol : float, default=1e-4
    Convergence threshold.
  - max_iter : int, default=100
    Maximum EM iterations.
  - calib_function : str or callable, optional
    Calibration method:
    - 'ts': Temperature Scaling
    - 'bcts': Bias-Corrected Temperature Scaling
    - 'vs': Vector Scaling
    - 'nbvs': No-Bias Vector Scaling
    - callable: custom calibration function
  - criteria : callable, default=MAE
    Convergence metric.
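The criteria argument expects a callable that compares consecutive prevalence estimates. A stand-in with the shape such a callable would have (a sketch of mean absolute error; mlquantify's own MAE may differ in signature):

```python
import numpy as np

def mae(prev_a, prev_b):
    """Mean absolute error between two prevalence vectors
    (hypothetical stand-in for the default `criteria`)."""
    return float(np.mean(np.abs(np.asarray(prev_a) - np.asarray(prev_b))))
```

Any callable with this two-vector-in, scalar-out shape could serve as a custom convergence metric.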
References

[1] Saerens, M., Latinne, P., & Decaestecker, C. (2002). Adjusting the Outputs of a Classifier to New a Priori Probabilities: A Simple Procedure. Neural Computation, 14(1), 21-41.
[2] Esuli, A., Moreo, A., & Sebastiani, F. (2023). Learning to Quantify. Springer.
- classmethod EM(posteriors, priors, tolerance=1e-06, max_iter=100, criteria=MAE)
Static method implementing the EM algorithm for quantification.
- Parameters:
  - posteriors : ndarray of shape (n_samples, n_classes)
    Posterior probability predictions.
  - priors : ndarray of shape (n_classes,)
    Training class prior probabilities.
  - tolerance : float
    Convergence threshold based on the difference between iterations.
  - max_iter : int
    Maximum number of EM iterations.
  - criteria : callable
    Metric to assess convergence, e.g., MAE.
- Returns:
  - qs : ndarray of shape (n_classes,)
    Estimated test set class prevalences.
  - ps : ndarray of shape (n_samples, n_classes)
    Updated soft membership probabilities per instance.
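To make the returned qs and ps concrete, here is one hand-checked iteration starting from the training priors (a standalone sketch mirroring the documented inputs and outputs, not a call into mlquantify):

```python
import numpy as np

# Two test instances, two classes; posteriors from a hypothetical classifier.
posteriors = np.array([[0.8, 0.2], [0.4, 0.6]])
priors = np.array([0.5, 0.5])          # training priors q^(0)
qs = priors.copy()                     # initialize prevalence estimate at q^(0)

# E-step: with qs == priors the ratio is 1, so ps equals the raw posteriors.
weighted = posteriors * (qs / priors)
ps = weighted / weighted.sum(axis=1, keepdims=True)

# M-step: average soft memberships -> qs becomes [0.6, 0.4].
qs = ps.mean(axis=0)
```

Subsequent iterations repeat the two steps with the updated qs until the change between consecutive estimates falls below the tolerance.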
- get_metadata_routing()
  Get metadata routing of this object.
  Please check the User Guide on how the routing mechanism works.
  - Returns:
    - routing : MetadataRequest
      A MetadataRequest encapsulating routing information.
- get_params(deep=True)
  Get parameters for this estimator.
  - Parameters:
    - deep : bool, default=True
      If True, return the parameters for this estimator and for contained subobjects that are estimators.
  - Returns:
    - params : dict
      Parameter names mapped to their values.
- set_params(**params)
  Set the parameters of this estimator.
  The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
  - Parameters:
    - **params : dict
      Estimator parameters.
  - Returns:
    - self : estimator instance
      Estimator instance.