FM
- class mlquantify.counting.FM(estimator=None, loss='ls', solver='slsqp', cv=None, stratified=True, shuffle=False, random_state=None)
Friedman Method (FM).
A soft adjustment method that binarizes posterior probabilities by comparing each one against the corresponding class prior learned during training, rather than against a fixed 0.5 threshold. This makes the method more robust to skewed class distributions.
Uses the same linear-system framework as GPACC but applies a class-prior-aware binarization transform before aggregation.
- Parameters:
- estimator : estimator, optional
A classifier with fit and predict_proba methods.
- loss : str, default='ls'
Loss function for the constrained system.
- solver : str, default='slsqp'
Optimization solver.
- cv : int or None, default=None
Number of cross-validation folds.
- stratified : bool, default=True
Whether to use stratified cross-validation folds.
- shuffle : bool, default=False
Whether to shuffle samples before splitting into folds.
- random_state : int or None, default=None
Random seed.
- Attributes:
- estimator_ : estimator
The fitted underlying classifier.
- classes_ : ndarray of shape (n_classes,)
Class labels seen during fit.
References
[1] Friedman, J. (2015). Detecting and Dealing with Concept Drift. Technical Report.
[2] Tasche, D. (2024). Comments on Friedman’s Method for Class Distribution Estimation. LQ 2024 Workshop Proceedings.
Examples
>>> from mlquantify.counting import FM
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_samples=200, n_classes=3, n_informative=5,
...                            n_redundant=0, random_state=42)
>>> q = FM(estimator=LogisticRegression()).fit(X, y)
>>> q.predict(X)
{0: 0.33, 1: 0.34, 2: 0.33}
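The class-prior-aware binarization described in the class summary can be illustrated without the library. The following is a minimal pure-Python sketch for the binary case only, assuming the adjustment reduces to solving one linear equation for the positive prevalence. The helper names (fm_binarize, fm_binary_prevalence, to_rows) and the toy scores are hypothetical and are not part of mlquantify; the library's own solver may differ.

```python
def fm_binarize(posteriors, train_priors):
    """Map each posterior to 1.0 if it exceeds its class's training prior, else 0.0."""
    return [
        [1.0 if p > prior else 0.0 for p, prior in zip(row, train_priors)]
        for row in posteriors
    ]


def mean(values):
    return sum(values) / len(values)


def fm_binary_prevalence(train_posteriors, train_labels, test_posteriors):
    """Estimate [P(class 0), P(class 1)] on the test set (binary case only)."""
    n = len(train_labels)
    priors = [train_labels.count(0) / n, train_labels.count(1) / n]

    train_bin = fm_binarize(train_posteriors, priors)
    test_bin = fm_binarize(test_posteriors, priors)

    # Rate at which the class-1 indicator fires among true class-1 / class-0 samples.
    tpr = mean([row[1] for row, label in zip(train_bin, train_labels) if label == 1])
    fpr = mean([row[1] for row, label in zip(train_bin, train_labels) if label == 0])
    observed = mean([row[1] for row in test_bin])

    if tpr == fpr:
        # The indicator carries no information; fall back to the training priors.
        return priors

    # Solve observed = p1 * tpr + (1 - p1) * fpr for p1, then clip to [0, 1].
    p1 = min(1.0, max(0.0, (observed - fpr) / (tpr - fpr)))
    return [1.0 - p1, p1]


def to_rows(scores):
    """Turn class-1 scores into [P(class 0), P(class 1)] rows."""
    return [[1.0 - s, s] for s in scores]


# Toy class-1 scores (made up for illustration): 12 negatives, 4 positives.
train_scores = [0.1] * 9 + [0.3] * 3 + [0.4, 0.3, 0.35, 0.2]
train_labels = [0] * 12 + [1] * 4
test_scores = [0.4, 0.3, 0.1, 0.35, 0.2, 0.45, 0.3, 0.15]

prevalence = fm_binary_prevalence(to_rows(train_scores), train_labels, to_rows(test_scores))
print(prevalence)
```

In this toy data no test score exceeds 0.5, so a fixed 0.5 threshold would count no positives at all, whereas thresholding at the training prior of 0.25 still separates the two groups.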
- aggregate(test_representation, train_representation=None, train_labels=None, classes=None)
Aggregate a pre-computed test representation into prevalences.
Allows calling the compose quantifier’s solver directly when the representation has already been computed externally, bypassing the estimator step.
- Parameters:
- test_representation : array-like of shape (representation_dim,)
Pre-computed test representation vector.
- train_representation : array-like of shape (n_samples, representation_dim) or None, default=None
Training representation. If provided together with train_labels, the representation is re-fitted.
- train_labels : array-like of shape (n_samples,) or None, default=None
Training labels used to re-fit the representation when train_representation is also provided.
- classes : array-like of shape (n_classes,) or None, default=None
Class labels to use. Inferred from train_labels when None.
- Returns:
- prevalences : dict or ndarray of shape (n_classes,)
Estimated class prevalences.
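To make the idea of aggregating a pre-computed representation concrete, here is a hedged pure-Python sketch for two classes. It assumes the training representation is summarized by one mean vector per class and that the prevalence is the least-squares mixture weight between those means, clipped to [0, 1]. The functions class_means and aggregate_binary are hypothetical illustrations, not the library's implementation, and the numbers are made up.

```python
def class_means(train_representation, train_labels, classes):
    """Average the training representation vectors of each class."""
    means = {}
    for c in classes:
        rows = [r for r, label in zip(train_representation, train_labels) if label == c]
        means[c] = [sum(col) / len(rows) for col in zip(*rows)]
    return means


def aggregate_binary(test_representation, train_representation, train_labels, classes=None):
    """Least-squares mixture weight for two classes, returned as {class: prevalence}."""
    if classes is None:
        classes = sorted(set(train_labels))  # infer the labels from train_labels
    if len(classes) != 2:
        raise ValueError("this sketch only handles two classes")
    neg, pos = classes
    means = class_means(train_representation, train_labels, classes)
    direction = [b - a for a, b in zip(means[neg], means[pos])]
    offset = [t - a for t, a in zip(test_representation, means[neg])]
    norm_sq = sum(d * d for d in direction)
    if norm_sq == 0.0:
        raise ValueError("class means coincide; prevalence is not identifiable")
    # Minimiser of ||(1 - p) * mean_neg + p * mean_pos - test||^2, clipped to [0, 1].
    p = sum(o * d for o, d in zip(offset, direction)) / norm_sq
    p = min(1.0, max(0.0, p))
    return {neg: 1.0 - p, pos: p}


# Made-up representations: four training vectors and one test vector.
train_representation = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.5, 0.5]]
train_labels = [0, 0, 1, 1]
result = aggregate_binary([0.375, 0.625], train_representation, train_labels)
print(result)
```

Because the objective is a one-dimensional convex quadratic in p, clipping the unconstrained minimiser gives the constrained optimum; with more than two classes a general constrained solver would be needed instead.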
- fit(X, y, estimator_fitted=False, sample_weight=None, cv_prediction='refit')
Fit the compose quantifier.
Trains the estimator (when aggregative=True) via cross-validation to obtain out-of-fold (OOF) predictions, then fits the configured representation on those predictions.
- Parameters:
- X : array-like of shape (n_samples, n_features)
Training feature matrix.
- y : array-like of shape (n_samples,)
Training class labels.
- estimator_fitted : bool, default=False
If True, skip fitting the estimator (assume it is already fitted and use X directly as predictions).
- sample_weight : array-like of shape (n_samples,) or None, default=None
Per-sample weights forwarded to the representation’s fit.
- cv_prediction : {'refit', 'ensemble'}, default='refit'
Cross-validation prediction strategy.
- Returns:
- self : BaseComposeQuantifier
The fitted quantifier.
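The out-of-fold (OOF) predictions that fit relies on can be sketched in plain Python. This is an illustration of the general cross-validation idea only: the folds here are contiguous, unshuffled and unstratified (unlike the stratified=True default above), and the helper names and the stand-in "classifier" are hypothetical rather than anything mlquantify provides.

```python
def kfold_indices(n_samples, n_folds):
    """Yield (train_idx, test_idx) for contiguous, unshuffled, unstratified folds."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        stop = start + base + (1 if fold < extra else 0)
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


def out_of_fold_predictions(X, y, fit_fn, predict_fn, n_folds=3):
    """Predict every sample with a model that never saw it during fitting."""
    oof = [None] * len(X)
    for train_idx, test_idx in kfold_indices(len(X), n_folds):
        model = fit_fn([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in test_idx:
            oof[i] = predict_fn(model, X[i])
    return oof


# Hypothetical stand-in "classifier": it ignores the features and predicts the
# positive rate of whatever labels it was fitted on.
def fit_prior(X_train, y_train):
    return sum(y_train) / len(y_train)


def predict_prior(model, x):
    return model


X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 1, 0, 1, 1, 1]
oof = out_of_fold_predictions(X, y, fit_prior, predict_prior, n_folds=3)
print(oof)
```

Each sample receives exactly one prediction, produced by a model fitted on the other folds, which is what keeps the statistics computed from those predictions from being optimistically biased.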
- get_metadata_routing()
Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns:
- routing : MetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params : dict
Parameter names mapped to their values.
- predict(X)
Predict class prevalences for the test set.
Applies the estimator (when aggregative=True), transforms the predictions into the representation space, and solves for the prevalence vector.
- Parameters:
- X : array-like of shape (n_samples, n_features)
Test feature matrix.
- Returns:
- prevalences : dict or ndarray of shape (n_classes,)
Estimated class prevalences summing to 1.
- set_fit_request(*, cv_prediction: bool | None | str = '$UNCHANGED$', estimator_fitted: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') → FM
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
- cv_prediction : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for cv_prediction parameter in fit.
- estimator_fitted : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for estimator_fitted parameter in fit.
- sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for sample_weight parameter in fit.
- Returns:
- self : object
The updated object.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **params : dict
Estimator parameters.
- Returns:
- self : estimator instance
Estimator instance.
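The <component>__<parameter> naming convention can be illustrated with a short, library-free sketch. It assumes keys are split at the first double underscore, so that the part before it names the component and the remainder is handed to that component. The function split_nested_params is a hypothetical helper for illustration, and the key names in the example (such as estimator__C) are assumptions about a typical setup rather than documented FM behavior.

```python
def split_nested_params(params):
    """Separate top-level parameters from '<component>__<parameter>' entries."""
    own, nested = {}, {}
    for key, value in params.items():
        # Split at the first double underscore only; the remainder may itself
        # contain further '__' levels for deeper nesting.
        name, delim, sub_key = key.partition("__")
        if delim:
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[name] = value
    return own, nested


own, nested = split_nested_params(
    {"loss": "ls", "estimator__C": 10.0, "estimator__max_iter": 200}
)
print(own)
print(nested)
```

Under these assumptions, a call such as set_params(loss='ls', estimator__C=10.0) would update loss on the quantifier itself and forward C to the wrapped estimator.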