QuaDapt

class mlquantify.meta.QuaDapt(quantifier, measure='topsoe', merging_factors=array([0.1, 0.3, 0.5, 0.7, 0.9]))

QuaDapt Metaquantifier: Adaptive quantification using synthetic scores.

This metaquantifier improves prevalence estimation by merging training samples with different score distributions, controlled by a merging factor \(m\). It evaluates each candidate merging factor, selects the one that minimizes a distribution distance (Hellinger, Topsoe, ProbSymm, or SORD), and aggregates the quantification accordingly.
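The selection step described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not mlquantify's implementation: the helper names (`histogram`, `topsoe`, `select_merging_factor`), the 10-bin histograms, the grid search over prevalence values, and the score simulator (positive scores drawn as `u ** m`, negative scores as `1 - u ** m`, with `u` uniform on [0, 1]) are all hypothetical choices made for this example.

```python
import math
import random


def histogram(values, bins=10):
    """Normalised histogram of scores that lie in [0, 1]."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    return [c / len(values) for c in counts]


def topsoe(p, q, eps=1e-12):
    """Topsoe divergence between two discrete distributions (eps avoids log(0))."""
    total = 0.0
    for a, b in zip(p, q):
        a, b = a + eps, b + eps
        total += a * math.log(2 * a / (a + b)) + b * math.log(2 * b / (a + b))
    return total


def select_merging_factor(test_scores, merging_factors, n=5000, seed=0):
    """Return the (m, alpha) pair whose synthetic mixture is closest to the test scores."""
    rng = random.Random(seed)
    test_hist = histogram(test_scores)
    best = None
    for m in merging_factors:
        # Assumed simulator: one synthetic sample per class for this m.
        pos = histogram([rng.random() ** m for _ in range(n)])
        neg = histogram([1 - rng.random() ** m for _ in range(n)])
        for step in range(101):
            alpha = step / 100
            mix = [alpha * a + (1 - alpha) * b for a, b in zip(pos, neg)]
            d = topsoe(test_hist, mix)
            if best is None or d < best[0]:
                best = (d, m, alpha)
    return best[1], best[2]


# Usage: test scores simulated with m = 0.3 and a positive prevalence of 0.2.
rng = random.Random(1)
test_scores = [rng.random() ** 0.3 for _ in range(1000)]
test_scores += [1 - rng.random() ** 0.3 for _ in range(4000)]
best_m, best_alpha = select_merging_factor(test_scores, [0.1, 0.3, 0.5, 0.7, 0.9])
print(best_m, best_alpha)
```

In this sketch the prevalence that accompanies the best merging factor falls out of the same search. The real class delegates the final estimate to the wrapped `quantifier`.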

Parameters:
quantifier : BaseQuantifier

The base quantifier model to adapt.

measure : {'hellinger', 'topsoe', 'probsymm', 'sord'}, default='topsoe'

The distribution distance metric used to select the best merging factor.

merging_factors : array-like, default=array([0.1, 0.3, 0.5, 0.7, 0.9])

Candidate merging factor values to evaluate.
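For orientation, three of the `measure` options can be written as functions of two normalised histograms. The sketch below uses the common textbook definitions. The function names are hypothetical, and mlquantify's own implementations may differ by a constant scaling, which would not change which merging factor attains the minimum. SORD is omitted here because it is not a histogram-based measure.

```python
import math


def hellinger(p, q):
    """Hellinger distance, scaled to lie in [0, 1]."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))  # Bhattacharyya coefficient
    return math.sqrt(max(0.0, 1.0 - bc))


def topsoe(p, q):
    """Topsoe divergence, with the convention 0 * log(0) = 0."""
    total = 0.0
    for a, b in zip(p, q):
        if a > 0:
            total += a * math.log(2 * a / (a + b))
        if b > 0:
            total += b * math.log(2 * b / (a + b))
    return total


def prob_symm(p, q):
    """Probabilistic symmetric chi-squared distance; empty bins are skipped."""
    return 2 * sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


# Usage: two histograms that overlap only in the middle bin.
p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]
for distance in (hellinger, topsoe, prob_symm):
    print(distance.__name__, distance(p, q))
```

All three return zero for identical histograms and grow as the histograms diverge, which is the property the selection of the merging factor relies on.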

Examples

>>> from mlquantify.meta import QuaDapt
>>> from mlquantify.adjust_counting import ACC
>>> from sklearn.ensemble import RandomForestClassifier
>>> quadapt_acc = QuaDapt(
...     quantifier=ACC(RandomForestClassifier()),
...     merging_factors=[0.1, 0.5, 1.0],
...     measure='sord'
... )
>>> quadapt_acc.fit(X_train, y_train)
>>> prevalence = quadapt_acc.predict(X_test)

classmethod MoSS(n, alpha, m)

Model for Score Simulation

Parameters:
n : int

Number of observations.

alpha : float

Class proportion, which defines the prevalence of the positive class.

m : float

Merging factor, which controls the overlap between positive and negative score distributions.

Returns:
tuple

Tuple of score and label arrays.

\[\mathrm{moss}(n, \alpha, \mathfrak{m}) = \mathrm{syn}(\oplus, \lfloor \alpha n \rfloor, \mathfrak{m}) \cup \mathrm{syn}(\ominus , \lfloor (1 - \alpha) n \rfloor, \mathfrak{m})\]
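The definition above can be sketched in a few lines of standard-library Python. This is an illustration rather than the library's code: the function name `moss_sketch`, the `seed` argument, and the concrete form of \(\mathrm{syn}\) (positive scores as `u ** m`, negative scores as `1 - u ** m`, with `u` uniform on [0, 1]) are assumptions.

```python
import math
import random


def moss_sketch(n, alpha, m, seed=None):
    """Return (scores, labels) for floor(alpha*n) positives and floor((1-alpha)*n) negatives."""
    rng = random.Random(seed)
    n_pos = math.floor(alpha * n)
    n_neg = math.floor((1 - alpha) * n)
    # Assumed syn(+): scores concentrate near 1 as m shrinks.
    pos = [rng.random() ** m for _ in range(n_pos)]
    # Assumed syn(-): mirror image, concentrating near 0 as m shrinks.
    neg = [1.0 - rng.random() ** m for _ in range(n_neg)]
    return pos + neg, [1] * n_pos + [0] * n_neg


# Usage: compare the class means for a small and a large merging factor.
for m in (0.1, 0.9):
    scores, labels = moss_sketch(n=1000, alpha=0.25, m=m, seed=0)
    pos_mean = sum(s for s, y in zip(scores, labels) if y == 1) / labels.count(1)
    neg_mean = sum(s for s, y in zip(scores, labels) if y == 0) / labels.count(0)
    print(m, round(pos_mean, 2), round(neg_mean, 2))
```

Under this assumed form, a larger `m` pulls the two class means together, which is one way to read the statement that the merging factor controls the overlap between the positive and negative score distributions.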

Notes

MoSS supports binary problems only: it simulates scores for the positive class and the negative class.

References

[1] Maletzke, A., Reis, D. dos, Hassan, W., & Batista, G. (2021). Accurately Quantifying under Score Variability. 2021 IEEE International Conference on Data Mining (ICDM), 1228-1233. https://doi.org/10.1109/ICDM51629.2021.00149

Examples

>>> scores = QuaDapt.MoSS(n=1000, alpha=0.3, m=0.5)
>>> print(scores.shape)
(1000, 3)

get_metadata_routing()

Get metadata routing of this object.

See the User Guide for details on how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

save_quantifier(path: str | None = None) → None

Save the quantifier instance to a file.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.