HDx

class mlquantify.mixture.HDx(bins_size=None, strategy='ovr')

Hellinger Distance-based Quantifier (HDx).

A non-aggregative mixture quantifier that estimates class prevalences by minimizing the average Hellinger distance between a mixture of the class-wise feature histograms of the training data and the feature histograms of the test data. It iterates over candidate mixture weights and histogram bin sizes, evaluates the distance for each feature, and aggregates the results.
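The search described above can be illustrated with a small NumPy sketch for the binary case. This is not the mlquantify implementation: the function names `hellinger`, `feature_hist`, and `hdx_sketch` are hypothetical, a single bin count is used instead of a search over bin sizes, the candidate mixture weights form a simple grid, and the Hellinger distance uses the 1/sqrt(2) normalization (an assumption about the convention).

```python
import numpy as np


def hellinger(p, q):
    """Hellinger distance between two normalized histograms (range 0 to 1)."""
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))


def feature_hist(column, edges):
    """Normalized histogram of one feature column over shared bin edges."""
    counts, _ = np.histogram(column, bins=edges)
    return counts / max(counts.sum(), 1)


def hdx_sketch(pos, neg, test, bins=10, n_alphas=101):
    """Grid-search the positive prevalence that minimizes the mean
    per-feature Hellinger distance (hypothetical helper, binary case)."""
    hists = []
    for j in range(test.shape[1]):
        # Shared edges so the three histograms of a feature are comparable.
        stacked = np.concatenate([pos[:, j], neg[:, j], test[:, j]])
        edges = np.linspace(stacked.min(), stacked.max(), bins + 1)
        hists.append((feature_hist(pos[:, j], edges),
                      feature_hist(neg[:, j], edges),
                      feature_hist(test[:, j], edges)))

    best_alpha, best_dist = 0.0, np.inf
    for alpha in np.linspace(0.0, 1.0, n_alphas):
        # Mix the class-wise training histograms and compare with the test one.
        dist = np.mean([hellinger(alpha * hp + (1.0 - alpha) * hn, ht)
                        for hp, hn, ht in hists])
        if dist < best_dist:
            best_alpha, best_dist = float(alpha), float(dist)
    return best_alpha, best_dist


# Synthetic data: two well-separated classes, test set with 30% positives.
rng = np.random.default_rng(0)
pos = rng.normal(2.0, 1.0, size=(2000, 2))
neg = rng.normal(-2.0, 1.0, size=(2000, 2))
test = np.vstack([rng.normal(2.0, 1.0, size=(300, 2)),
                  rng.normal(-2.0, 1.0, size=(700, 2))])

alpha, dist = hdx_sketch(pos, neg, test)
print(f"estimated positive prevalence: {alpha:.2f} (mean distance {dist:.3f})")
```

On this synthetic data the estimate should land close to the true positive prevalence of 0.3; how HDx itself combines results across bin sizes is not shown here.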

Parameters:
bins_size : array-like, optional

Histogram bin sizes to consider for discretizing features.

strategy : {'ovr', 'ovo'}, default='ovr'

Multiclass quantification strategy: 'ovr' (one-vs-rest) or 'ovo' (one-vs-one).

Attributes:
pos_features : ndarray

Training samples of the positive class.

neg_features : ndarray

Training samples of the negative class.

References

[2] Esuli et al. (2023). Learning to Quantify. Springer.

best_mixture(X, pos, neg)

Determine the best mixture parameters for the given data.

fit(X, y, *args, **kwargs)

Fit the quantifier using the provided data and learner.

get_best_distance(*args, **kwargs)

Get the best distance value from the mixture fitting process.

Notes

If the quantifier has not been fitted yet, it is fitted first so that the best distance can be computed.

classmethod get_distance(dist_train, dist_test, measure='hellinger')

Compute distance between two distributions.
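For the default `measure='hellinger'`, the quantity can be sketched as follows. This is a minimal illustration and not the library's code: the function name `hellinger_distance` is hypothetical, the inputs are assumed to be non-negative histograms that are normalized here, and the 1/sqrt(2) factor (which bounds the distance to the range 0 to 1) is one common convention that the library may or may not follow.

```python
import numpy as np


def hellinger_distance(dist_train, dist_test):
    """Hellinger distance between two discrete distributions.

    Both inputs are normalized to sum to 1 before comparison.
    """
    p = np.asarray(dist_train, dtype=float)
    q = np.asarray(dist_test, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))


identical = hellinger_distance([0.2, 0.3, 0.5], [0.2, 0.3, 0.5])
disjoint = hellinger_distance([1.0, 0.0], [0.0, 1.0])
print(identical, disjoint)
```

Under this convention identical distributions are at distance 0 and distributions with disjoint support are at distance 1, and the measure is symmetric in its two arguments.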

get_metadata_routing()

Get the metadata routing of this object.

See the User Guide for how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

predict(X, *args, **kwargs)

Predict class prevalences for the given data.

save_quantifier(path: str | None = None) -> None

Save the quantifier instance to a file.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter>, so each component of a nested object can be updated.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.