ComposeQuantifier#

class mlquantify.compose.ComposeQuantifier(representation, loss, solver=None, seed=None)[source]#

Generic quantification method based on constrained regression using the qunfold framework.

Mathematical formulation

This method estimates class prevalences by solving the following problem:

\[q \approx M \pi\]

where:

  • \(q\) is the representation of the unlabeled test data,

  • \(M\) is the class-conditional representation matrix estimated from training data,

  • \(\pi\) is the vector of class prevalences to be estimated.

The estimation is performed by minimizing a divergence or loss function between the observed representation \(q\) and the expected representation \(M \pi\):

\[\hat{\pi} = \arg\min_{\pi} \; D(q, M\pi)\]

subject to:

\[\pi_k \ge 0, \quad \sum_k \pi_k = 1\]

The behavior of the method is fully determined by:

  • a representation \(f(x)\) that maps data into a feature space,

  • a distance/loss function \(D(\cdot, \cdot)\),

  • an optimization procedure over the probability simplex.
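As a concrete illustration, the constrained problem above can be solved directly with NumPy and SciPy. This is a toy sketch, not the qunfold backend: `M` and `q` are made-up values, and the least-squares divergence stands in for a general \(D\).

```python
import numpy as np
from scipy.optimize import minimize

M = np.array([[0.8, 0.3],   # class-conditional representation matrix (toy values)
              [0.2, 0.7]])
q = np.array([0.55, 0.45])  # observed representation of the unlabeled test data

def loss(pi):
    # Least-squares divergence D(q, M pi)
    r = q - M @ pi
    return r @ r

# Constraints: pi_k >= 0 and sum_k pi_k = 1 (the probability simplex)
result = minimize(
    loss,
    x0=np.full(2, 0.5),
    bounds=[(0.0, 1.0)] * 2,
    constraints=[{"type": "eq", "fun": lambda pi: pi.sum() - 1.0}],
)
pi_hat = result.x  # pi_hat is approximately [0.5, 0.5] for this toy M and q
```

Swapping the squared error for another divergence, or the representation behind `M` and `q`, changes the quantification algorithm without changing this optimization skeleton.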

This implementation wraps the qunfold backend while allowing the use of:

  • native qunfold representations and losses,

  • custom representations implemented in mlquantify,

  • arbitrary distance functions, such as Topsoe or Hellinger.
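For instance, a distance such as Topsoe can be supplied as a plain callable taking two distributions. The sketch below is illustrative only; mlquantify ships its own implementations (e.g. `topsoe_jax`, shown in the Examples).

```python
import numpy as np

def topsoe(p, q, eps=1e-12):
    # Topsoe distance: KL(p || m) + KL(q || m) with m the midpoint
    # distribution; eps avoids log(0) on empty histogram bins.
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    m = (p + q) / 2.0
    return float(np.sum(p * np.log(p / m) + q * np.log(q / m)))
```

Any callable with this `(p, q) -> float` signature can play the role of \(D\) in the formulation above.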

Parameters:
representation : object

Representation object defining how to compute \(q\) and \(M\). Can be either a qunfold representation or a custom implementation.

loss : object or callable

Loss or distance function \(D(p, q)\). Can be either a qunfold loss object or a callable accepting two distributions.

solver : str or callable, optional

Solver used for the constrained optimization problem. If not provided, the default solver from qunfold is used.

seed : int, optional

Random seed used by the qunfold backend for reproducible optimization.

Notes

This formulation unifies several quantification methods:

  • AC / CC: class-based representations

  • PAC / PCC: probability-based representations

  • HDy / DyS: histogram-based representations with divergence measures

  • HDx: feature-based histogram representations

The method follows the constrained regression framework described in [1]:

\[q \approx M \hat{\pi}\]

where different choices of representation and loss correspond to different quantification algorithms.

References

[1]

Firat, A. (2016). Unified Framework for Quantification.

Examples

Using a class-based representation to implement an ACC-like quantifier:

>>> from sklearn.linear_model import LogisticRegression
>>> from qunfold.sklearn import CVClassifier
>>> from qunfold.methods.linear.losses import LeastSquaresLoss
>>> from qunfold.methods.linear.representations import ClassRepresentation
>>> from mlquantify.compose import ComposeQuantifier
>>>
>>> learner = LogisticRegression(max_iter=1000)
>>> quantifier = ComposeQuantifier(
...     representation=ClassRepresentation(
...         CVClassifier(learner),
...         is_probabilistic=False,
...     ),
...     loss=LeastSquaresLoss(),
... )
>>> quantifier.fit(X_train, y_train)
>>> prevalences = quantifier.predict(X_test)

Using a histogram-based representation with a custom Topsoe distance from mlquantify to implement a DyS-like quantifier:

>>> from sklearn.linear_model import LogisticRegression
>>> from qunfold.sklearn import CVClassifier
>>> from qunfold.methods.linear.representations import (
...     ClassRepresentation,
...     HistogramRepresentation,
... )
>>> from mlquantify.compose import ComposeQuantifier
>>> from mlquantify.metrics import topsoe_jax
>>>
>>> learner = LogisticRegression(max_iter=1000)
>>> representation = HistogramRepresentation(
...     n_bins=8,
...     preprocessor=ClassRepresentation(
...         CVClassifier(learner),
...         is_probabilistic=True,
...     ),
...     unit_scale=False,
... )
>>> quantifier = ComposeQuantifier(
...     representation=representation,
...     loss=topsoe_jax,
... )
>>> quantifier.fit(X_train, y_train)
>>> prevalences = quantifier.predict(X_test)
get_metadata_routing()[source]#

Get metadata routing of this object.

Please check the User Guide on how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)[source]#

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

save_quantifier(path: str | None = None) → None[source]#

Save the quantifier instance to a file.

set_params(**params)[source]#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
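A minimal illustration of the `<component>__<parameter>` convention, using a scikit-learn Pipeline as mentioned above (the same convention applies to any nested estimator parameters):

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

pipe = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])

# "clf__C" addresses the C parameter of the nested LogisticRegression
pipe.set_params(clf__C=0.1)
assert pipe.get_params()["clf__C"] == 0.1
```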

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.