HDy
- class mlquantify.mixture.HDy(learner=None, strategy='ovr')
Hellinger Distance Minimization (HDy) quantification method.
Estimates class prevalences by finding mixture weights that minimize the Hellinger distance between the histogram of test scores and the mixture of positive and negative class score histograms, evaluated over multiple bin sizes.
- Parameters:
- learner : estimator, optional
Base probabilistic classifier.
References
[2] Esuli et al. (2023). Learning to Quantify. Springer.
- best_mixture(predictions, pos_scores, neg_scores)
Determine the best mixture parameters for the given data.
Compute the mixture weight \(\alpha\) that minimizes the Hellinger distance between the test score histogram and the mixture of positive and negative class score histograms.
The mixture weight \(\alpha\) is estimated as:
\[\alpha = \arg \min_{\alpha \in [0, 1]} \mathrm{Hellinger} \left( H_{test}, \alpha H_{pos} + (1 - \alpha) H_{neg} \right)\]
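where \(H_{test}\), \(H_{pos}\), and \(H_{neg}\) are the histograms of the test scores, the positive-class training scores, and the negative-class training scores, respectively.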
- Parameters:
- predictions : ndarray
Classifier scores for the test data.
- pos_scores : ndarray
Classifier scores for the positive class from training data.
- neg_scores : ndarray
Classifier scores for the negative class from training data.
- Returns:
- alpha : float
Estimated mixture weight.
- best_distance : float
Distance corresponding to the best mixture weight.
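The search described above can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the library's implementation: the helper names (`hellinger`, `normalized_hist`, `best_mixture_sketch`), the grid of candidate \(\alpha\) values, the bin sizes, the normalized form of the Hellinger distance, and the choice to aggregate across bin sizes with a median are all hypothetical.

```python
import numpy as np


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (normalized form)."""
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))


def normalized_hist(scores, bins):
    """Histogram of scores in [0, 1], scaled so the bin masses sum to 1."""
    counts, _ = np.histogram(scores, bins=bins, range=(0.0, 1.0))
    return counts / max(counts.sum(), 1)


def best_mixture_sketch(predictions, pos_scores, neg_scores,
                        bin_sizes=(10, 20, 30), n_alphas=101):
    """Grid-search alpha for each bin size (assumed aggregation: median alpha)."""
    alphas = np.linspace(0.0, 1.0, n_alphas)
    best_alphas, best_dists = [], []
    for bins in bin_sizes:
        h_test = normalized_hist(predictions, bins)
        h_pos = normalized_hist(pos_scores, bins)
        h_neg = normalized_hist(neg_scores, bins)
        # Distance between the test histogram and each candidate mixture.
        dists = [hellinger(h_test, a * h_pos + (1 - a) * h_neg) for a in alphas]
        i = int(np.argmin(dists))
        best_alphas.append(alphas[i])
        best_dists.append(dists[i])
    return float(np.median(best_alphas)), float(np.mean(best_dists))


# Synthetic scores: positives skew high, negatives skew low, true prevalence 0.3.
rng = np.random.default_rng(0)
pos_scores = rng.beta(5, 2, size=2000)
neg_scores = rng.beta(2, 5, size=2000)
predictions = np.concatenate([rng.beta(5, 2, size=300), rng.beta(2, 5, size=700)])

alpha, distance = best_mixture_sketch(predictions, pos_scores, neg_scores)
print(f"estimated alpha={alpha:.2f}, distance={distance:.3f}")
```

A grid search is used here only because it is easy to read; a bounded scalar optimizer would serve the same purpose.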
- get_best_distance(*args, **kwargs)
Get the best distance value from the mixture fitting process.
Notes
If the quantifier has not been fitted yet, this method fits it first in order to compute the best distance.
- classmethod get_distance(dist_train, dist_test, measure='hellinger')
Compute distance between two distributions.
- get_metadata_routing()
Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns:
- routing : MetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params : dict
Parameter names mapped to their values.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **params : dict
Estimator parameters.
- Returns:
- self : estimator instance
Estimator instance.
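The `<component>__<parameter>` convention can be illustrated with a scikit-learn `Pipeline`, which is the nested object the description mentions. The step names `scale` and `clf` are chosen for this example only. The same convention presumably applies to the `learner` wrapped by HDy, but that is an assumption not confirmed by this page.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# A nested estimator with two named components.
pipe = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])

# Update parameters of individual components with <component>__<parameter>.
pipe.set_params(clf__C=0.5, scale__with_mean=False)

# deep=True also returns the parameters of the contained estimators.
params = pipe.get_params(deep=True)
print(params["clf__C"], params["scale__with_mean"])
```

Because `set_params` returns the estimator itself, calls can be chained, for example `pipe.set_params(clf__C=1.0).get_params()`.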