SMM
- class mlquantify.mixture.SMM(learner=None, strategy='ovr')
Sample Mean Matching (SMM) quantification method.
Estimates class prevalence by matching the mean score of the test samples to a convex combination of positive and negative training scores. The mixture weight \(\alpha\) is computed as:
\[\alpha = \frac{\bar{s}_{test} - \bar{s}_{neg}}{\bar{s}_{pos} - \bar{s}_{neg}}\]
where \(\bar{s}\) denotes the sample mean of the corresponding scores (test, positive training, or negative training).
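As a rough illustration of the formula above, the following sketch computes \(\alpha\) from three lists of scores using only the standard library. The helper name `smm_alpha`, the example scores, and the clipping of the result to [0, 1] are assumptions made for this example; they are not part of the mlquantify API and may differ from what the library does internally.

```python
from statistics import fmean


def smm_alpha(test_scores, pos_scores, neg_scores):
    """Hypothetical helper: mixture weight from the sample-mean formula."""
    mean_test = fmean(test_scores)
    mean_pos = fmean(pos_scores)
    mean_neg = fmean(neg_scores)

    denominator = mean_pos - mean_neg
    if denominator == 0:
        # The formula is undefined when both training means coincide.
        raise ValueError("positive and negative mean scores coincide")

    alpha = (mean_test - mean_neg) / denominator
    # Assumption of this sketch: clip to [0, 1] so the result reads as a prevalence.
    return min(1.0, max(0.0, alpha))


# Illustrative scores only (not taken from any real dataset).
pos_scores = [0.9, 0.8, 0.7]         # mean 0.8
neg_scores = [0.1, 0.2, 0.3]         # mean 0.2
test_scores = [0.9, 0.2, 0.4, 0.5]   # mean 0.5

alpha = smm_alpha(test_scores, pos_scores, neg_scores)  # about 0.5
```

Here the test mean lies halfway between the two training means, so the sketch yields a mixture weight of roughly 0.5.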
- Parameters:
- learner : estimator, optional
Base probabilistic classifier.
References
[2] Esuli et al. (2023). Learning to Quantify. Springer.
- best_mixture(predictions, pos_scores, neg_scores)
Determine the best mixture parameters for the given data.
- get_best_distance(*args, **kwargs)
Get the best distance value from the mixture fitting process.
Notes
If the quantifier has not been fitted yet, it is fitted first so that the best distance can be computed.
- classmethod get_distance(dist_train, dist_test, measure='hellinger')
Compute distance between two distributions.
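For orientation, the default measure name `'hellinger'` refers to the Hellinger distance. The sketch below uses the standard textbook definition for two discrete distributions; the function name `hellinger_distance`, the example histograms, and the 1/sqrt(2) normalization are assumptions of this example, and the library's own implementation may normalize differently.

```python
import math


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    # Sum of squared differences between the square roots of matching bins.
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    # The 1/sqrt(2) factor scales the result to the range [0, 1].
    return math.sqrt(squared) / math.sqrt(2)


# Illustrative normalized histograms (each sums to 1).
dist_train = [0.2, 0.5, 0.3]
dist_test = [0.1, 0.4, 0.5]

distance = hellinger_distance(dist_train, dist_test)
```

With this normalization, identical distributions give 0 and distributions with disjoint support give 1.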
- get_metadata_routing()
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routing : MetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params : dict
Parameter names mapped to their values.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **params : dict
Estimator parameters.
- Returns:
- self : estimator instance
Estimator instance.
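The `<component>__<parameter>` naming convention mentioned for set_params can be illustrated with a small standalone sketch. The function `split_nested_params` and the parameter names `learner__C` and `learner__max_iter` are hypothetical and chosen only for illustration; the sketch shows how such keys can be grouped by component, not how the estimator actually applies them.

```python
def split_nested_params(params):
    """Group 'component__parameter' keys by component.

    Keys without a double underscore are collected under the empty
    string, standing for parameters of the top-level object itself.
    """
    grouped = {}
    for key, value in params.items():
        # Split at the first double underscore only, so deeper nesting
        # such as 'a__b__c' stays intact in the remainder ('b__c').
        component, separator, sub_key = key.partition("__")
        if separator:
            grouped.setdefault(component, {})[sub_key] = value
        else:
            grouped.setdefault("", {})[key] = value
    return grouped


# Hypothetical parameter names, used only to show the key format.
grouped = split_nested_params(
    {"strategy": "ovr", "learner__C": 0.5, "learner__max_iter": 200}
)
```

In this example the `strategy` value is grouped with the top-level object, while `C` and `max_iter` are grouped under the `learner` component.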