PWK

class mlquantify.neighbors.PWK(alpha=1, n_neighbors=10, algorithm='auto', metric='euclidean', leaf_size=30, p=2, metric_params=None, n_jobs=None)

Probabilistic Weighted k-Nearest Neighbor (PWK) Quantifier.

This quantifier estimates class prevalences using the PWKCLF classifier, a probabilistically weighted k-nearest neighbor model.

The method internally uses a weighted k-NN classifier where neighbors’ contributions are adjusted by class-specific weights designed to correct for class imbalance, controlled by the hyperparameter alpha.
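The weighted vote described above can be sketched in plain Python. This is only an illustration, not the PWKCLF implementation: the function `weighted_knn_vote` and the `class_weights` mapping are hypothetical, and the way PWK derives its weights from alpha is not reproduced here.

```python
import math
from collections import defaultdict


def weighted_knn_vote(X_train, y_train, x, class_weights, k=3):
    """Return the class with the largest weighted vote among the k nearest neighbors.

    class_weights is a hypothetical mapping {label: weight}; how PWK computes
    these weights from alpha is an open detail in this sketch.
    """
    # Rank training points by Euclidean distance to the query point.
    order = sorted(range(len(X_train)), key=lambda i: math.dist(X_train[i], x))
    votes = defaultdict(float)
    for i in order[:k]:
        # Each neighbor contributes its class weight instead of a plain count of 1.
        votes[y_train[i]] += class_weights[y_train[i]]
    return max(votes, key=votes.get)


# Toy data: three majority-class points and one minority-class point.
X_train = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0)]
y_train = ["maj", "maj", "maj", "min"]
query = (0.6, 0.6)

unweighted = weighted_knn_vote(X_train, y_train, query, {"maj": 1.0, "min": 1.0})
weighted = weighted_knn_vote(X_train, y_train, query, {"maj": 1.0, "min": 3.0})
```

With equal weights the two majority neighbors outvote the single minority neighbor; giving the minority class a larger weight flips the decision, which is the kind of imbalance correction the class weights are meant to provide.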

Parameters:
alpha : float, default=1

Imbalance correction exponent for class weights. Higher values increase the influence of minority classes.

n_neighbors : int, default=10

Number of nearest neighbors considered.

algorithm : {'auto', 'ball_tree', 'kd_tree', 'brute'}, default='auto'

Algorithm used to compute nearest neighbors.

metric : str, default='euclidean'

Distance metric for nearest neighbor search.

leaf_size : int, default=30

Leaf size for the tree-based neighbor algorithms ('ball_tree' and 'kd_tree').

p : int, default=2

Power parameter for the Minkowski metric (used when metric='minkowski'). p=1 corresponds to the Manhattan distance and p=2 to the Euclidean distance.

metric_params : dict or None, default=None

Additional parameters for the metric function.

n_jobs : int or None, default=None

Number of parallel jobs for neighbors search.
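As a small illustration of the `p` parameter above, the Minkowski distance of order p can be written directly. The helper `minkowski_distance` is a hypothetical name used only for this sketch; it is not part of mlquantify.

```python
def minkowski_distance(a, b, p=2):
    """Minkowski distance of order p between two equal-length sequences."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


# p=1 is the Manhattan distance, p=2 the Euclidean distance.
d_manhattan = minkowski_distance((0, 0), (3, 4), p=1)
d_euclidean = minkowski_distance((0, 0), (3, 4), p=2)
```

For the points (0, 0) and (3, 4), the Manhattan distance is 7 and the Euclidean distance is 5.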

Attributes:
cc : object

Internally used Classify & Count quantifier wrapping PWKCLF.

learner : PWKCLF

Underlying probabilistic weighted k-NN classifier.

Examples

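>>> from mlquantify.neighbors import PWK
>>> # X_train, y_train, and X_test are assumed to be existing array-likes
>>> q = PWK(alpha=1.5, n_neighbors=5)
>>> q.fit(X_train, y_train)
>>> prevalences = q.predict(X_test)

classify(X)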

Classify samples using the underlying learner.

Parameters:
X : array-like of shape (n_samples, n_features)

Features to classify.

Returns:
labels : array of shape (n_samples,)

Predicted class labels.

fit(X, y)

Fit the PWK quantifier to the training data.

Parameters:
X : array-like of shape (n_samples, n_features)

Training features.

y : array-like of shape (n_samples,)

Training labels.

Returns:
self : object

The fitted instance.

get_metadata_routing()

Get metadata routing of this object.

See the User Guide for how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, return the parameters for this estimator and for any contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

predict(X)

Predict prevalences for the given data.

Parameters:
X : array-like of shape (n_samples, n_features)

Features for which to predict prevalences.

Returns:
prevalences : array of shape (n_classes,)

Predicted class prevalences.
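Since the `cc` attribute is described as a Classify & Count quantifier, the prevalence estimate can be pictured as the fraction of predicted labels that fall in each class. The sketch below assumes that standard Classify & Count rule; `classify_and_count` is a hypothetical helper, not the library's code, and the label list stands in for the output of `classify(X)`.

```python
from collections import Counter


def classify_and_count(labels, classes):
    """Estimate prevalences as the fraction of predicted labels in each class."""
    counts = Counter(labels)
    n = len(labels)
    # Classes that never appear in the predictions get a prevalence of 0.
    return [counts[c] / n for c in classes]


# Stand-in for labels returned by a classifier on a test sample.
predicted = ["a", "b", "a", "a"]
prevalences = classify_and_count(predicted, classes=["a", "b", "c"])
```

The result has one entry per class, in the order given by `classes`, and the entries sum to 1.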

save_quantifier(path: str | None = None) -> None

Save the quantifier instance to a file.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.
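The `<component>__<parameter>` naming used by set_params can be illustrated with a short sketch that separates top-level parameters from nested ones. The function `split_nested_params` and the key `learner__n_neighbors` are hypothetical and only show the naming convention; they do not reproduce the estimator's actual logic or confirm which nested keys PWK accepts.

```python
def split_nested_params(params):
    """Group keyword arguments by component using the '__' separator."""
    own, nested = {}, {}
    for key, value in params.items():
        # Split on the first '__' only; the remainder belongs to the component.
        name, sep, sub_key = key.partition("__")
        if sep:
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[key] = value
    return own, nested


# 'alpha' targets the estimator itself; 'learner__n_neighbors' (hypothetical)
# would target the n_neighbors parameter of a component named 'learner'.
own, nested = split_nested_params({"alpha": 2.0, "learner__n_neighbors": 7})
```

Here `own` holds the parameters for the estimator itself, and `nested` maps each component name to the parameters meant for it.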