MatchingHistogramQuantifier
- class mlquantify.matching.MatchingHistogramQuantifier(bins_size, distance='hellinger', solver='auto', strategy='ovr', histogram_features=None, bin_strategy=None, laplace_smoothing=False)
Abstract base class for histogram-based distribution matching.
Subclasses learn class-conditional histogram representations from training data and estimate the test prevalence by finding the mixture of those histograms that best matches the test histogram.
This is a binary-only method. Multiclass problems are decomposed into binary problems automatically, using a one-vs-rest (OvR) strategy by default.
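The matching idea described above can be sketched for the binary case in plain Python. This is a minimal illustration, not the library's implementation: the helper names `hellinger` and `match_prevalence`, the toy histograms, and the simple grid search are all assumptions made for the example (the actual solver is chosen by the `solver` parameter).

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions that each sum to 1."""
    # Bhattacharyya coefficient: 1.0 for identical distributions.
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    # Clamp at 0 to guard against tiny negative values from rounding.
    return math.sqrt(max(0.0, 1.0 - bc))


def match_prevalence(hist_pos, hist_neg, hist_test, steps=100):
    """Grid-search the positive-class prevalence alpha in [0, 1] (hypothetical helper)."""
    best_alpha, best_dist = 0.0, float("inf")
    for i in range(steps + 1):
        alpha = i / steps
        # Mixture of the two class-conditional histograms at this prevalence.
        mixture = [alpha * p + (1 - alpha) * n for p, n in zip(hist_pos, hist_neg)]
        dist = hellinger(mixture, hist_test)
        if dist < best_dist:
            best_alpha, best_dist = alpha, dist
    return best_alpha


# Toy normalized histograms over 4 score bins (illustrative values only).
hist_pos = [0.05, 0.15, 0.30, 0.50]
hist_neg = [0.50, 0.30, 0.15, 0.05]
# Test histogram constructed as a 30% positive / 70% negative mixture.
hist_test = [0.3 * p + 0.7 * n for p, n in zip(hist_pos, hist_neg)]

estimate = match_prevalence(hist_pos, hist_neg, hist_test)
print(estimate)  # the grid point closest to the true prevalence of 0.3
```

A grid search is used here only because it is easy to read; any one-dimensional minimizer over the interval [0, 1] would serve the same purpose.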
- Parameters:
- bins_sizeint or array-like
Number of histogram bins, or array of bin counts to sweep over.
- distancestr, default='hellinger'
Distance function used to compare histograms.
- solverstr, default='auto'
Optimization solver; 'auto' selects one based on the distance.
- strategy{'ovr', 'ovo'}, default='ovr'
Multiclass decomposition strategy.
- histogram_featuresint or None, default=None
Number of score columns used to build histograms.
- bin_strategystr or None, default=None
Aggregation strategy across bin sizes ('median' or 'mean').
- laplace_smoothingbool, default=False
Whether to apply Laplace smoothing to the histograms.
- Attributes:
- classes_ndarray of shape (n_classes,)
Class labels seen during fit.
Examples
>>> from mlquantify.matching._histogram import MatchingHistogramQuantifier
>>> from sklearn.datasets import make_classification
>>> import numpy as np
>>> class MyHistQ(MatchingHistogramQuantifier):
...     def __init__(self):
...         super().__init__(bins_size=10)
...     def fit(self, X, y):
...         self.classes_ = np.unique(y)
...         return self._fit(X, y)
...     def predict(self, X):
...         return self._predict(X)
>>> X, y = make_classification(n_samples=200, random_state=42)
>>> MyHistQ().fit(X, y).predict(X)
{0: 0.5, 1: 0.5}
- get_distance(dist_train, dist_test, distance='hellinger')
Compute a distance between two normalized representations.
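For the default `'hellinger'` option, one common definition for two normalized histograms $p$ and $q$ over $B$ bins is given below. The symbols $p$, $q$, and $B$ are introduced here for illustration, and the library may use a different normalization constant, so treat this as a reference formula rather than a statement of the exact implementation.

```latex
H(p, q) = \sqrt{1 - \sum_{i=1}^{B} \sqrt{p_i \, q_i}}
        = \frac{1}{\sqrt{2}} \sqrt{\sum_{i=1}^{B} \left( \sqrt{p_i} - \sqrt{q_i} \right)^2}
```

The two forms agree whenever both histograms sum to 1. Under this normalization the distance is 0 for identical histograms and 1 for histograms with no overlapping mass.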
- get_metadata_routing()
Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns:
- routingMetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deepbool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- paramsdict
Parameter names mapped to their values.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **paramsdict
Estimator parameters.
- Returns:
- selfestimator instance
Estimator instance.
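The `get_params` / `set_params` conventions described above follow the scikit-learn estimator API. The sketch below illustrates the `<component>__<parameter>` form and the effect of `deep` using standard scikit-learn estimators rather than a quantifier; it assumes that subclasses of this class behave the same way, which the signatures above suggest but this example does not verify.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# A nested object: a pipeline with two named steps.
pipe = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])

# Update a parameter of the nested "clf" step via <component>__<parameter>.
pipe.set_params(clf__C=0.5)

# deep=True also returns the parameters of contained estimators.
deep_params = pipe.get_params(deep=True)
# deep=False returns only the pipeline's own parameters.
shallow_params = pipe.get_params(deep=False)

print(deep_params["clf__C"])
print("clf__C" in shallow_params)
```

Because `set_params` returns the estimator itself, calls can be chained, for example `pipe.set_params(clf__C=0.5).get_params()`.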