PWK

class mlquantify.neighbors.PWK(alpha=1, n_neighbors=10, algorithm='auto', metric='euclidean', leaf_size=30, p=2, metric_params=None, n_jobs=None)

Probabilistic Weighted k-Nearest Neighbour (PWK) quantifier.

Estimates class prevalences using a k-nearest neighbour classifier with class-imbalance-aware weighting. Each neighbour’s contribution is scaled by a class-specific weight that corrects for the imbalance between class sizes, controlled by the alpha exponent.
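The weighting idea described above can be illustrated with a minimal, dependency-free sketch. The weight formula `(n_min / n_c) ** alpha` and the helper names `class_weights` and `weighted_knn_label` are assumptions chosen to match this description; they are not taken from the library, whose actual formula may differ.

```python
import math
from collections import Counter

def class_weights(y_train, alpha=1.0):
    """Hypothetical weighting: w_c = (n_min / n_c) ** alpha (an assumption)."""
    counts = Counter(y_train)
    n_min = min(counts.values())
    return {c: (n_min / n) ** alpha for c, n in counts.items()}

def weighted_knn_label(x, X_train, y_train, weights, k=3):
    """Return the class with the largest summed weight among the k nearest points."""
    # Rank training points by Euclidean distance to the query
    order = sorted(range(len(X_train)), key=lambda i: math.dist(x, X_train[i]))
    votes = Counter()
    for i in order[:k]:
        votes[y_train[i]] += weights[y_train[i]]
    return max(votes, key=votes.get)

# Toy 1-D data: six points of class 0, two of class 1
X_train = [(0.0,), (0.1,), (0.2,), (0.3,), (0.4,), (0.5,), (0.75,), (2.0,)]
y_train = [0, 0, 0, 0, 0, 0, 1, 1]
query = (0.6,)

# alpha=0 gives every class weight 1, i.e. a plain majority vote
plain = weighted_knn_label(query, X_train, y_train, class_weights(y_train, 0.0))
# alpha=1 down-weights the majority class by n_min / n_c = 2 / 6
weighted = weighted_knn_label(query, X_train, y_train, class_weights(y_train, 1.0))
print(plain, weighted)
```

In this toy case the three nearest neighbours are two class-0 points and one class-1 point, so the plain vote picks class 0, while the weighted vote picks class 1 because the two majority-class votes sum to less than the single minority-class vote.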

Parameters:
alpha : float, default=1

Imbalance correction exponent. Higher values amplify the influence of minority-class neighbours.

n_neighbors : int, default=10

Number of nearest neighbours considered for each test instance.

algorithm : {'auto', 'ball_tree', 'kd_tree', 'brute'}, default='auto'

Algorithm used to find nearest neighbours.

metric : str, default='euclidean'

Distance metric for the neighbour search.

leaf_size : int, default=30

Leaf size for tree-based algorithms.

p : int, default=2

Power parameter for the Minkowski metric.

metric_params : dict or None, default=None

Additional keyword arguments for the metric function.

n_jobs : int or None, default=None

Number of parallel jobs for the neighbour search.

Attributes:
estimator : PWKCLF

Fitted underlying weighted k-NN classifier.

classes_ : ndarray of shape (n_classes,)

Class labels seen during fit.

References
[1] Barranquero, J., Díez, J., & del Coz, J. J. (2015). Quantification-Oriented Learning Based on Reliable Classifiers. Pattern Recognition, 48(2), 591–604.

Examples

>>> from mlquantify.neighbors import PWK
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_samples=200, random_state=42)
>>> q = PWK(alpha=1.5, n_neighbors=5).fit(X, y)
>>> q.predict(X)
{0: 0.49, 1: 0.51}
classify(X)

Classify test instances using the underlying weighted k-NN estimator.

Returns hard class labels produced by PWKCLF without any prevalence-level aggregation.

Parameters:
X : array-like of shape (n_samples, n_features)

Test feature matrix.

Returns:
labels : ndarray of shape (n_samples,)

Predicted class label for each test instance.

Examples

>>> from mlquantify.neighbors import PWK
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_samples=200, random_state=42)
>>> q = PWK(alpha=1.5, n_neighbors=5).fit(X, y)
>>> q.classify(X[:5])
array([0, 1, 0, 1, 0])
fit(X, y)

Fit the PWK quantifier to the training data.

Builds a Classify and Count (CC) quantifier around the underlying weighted k-NN classifier and fits it on the provided data.

Parameters:
X : array-like of shape (n_samples, n_features)

Training feature matrix.

y : array-like of shape (n_samples,)

Training class labels.

Returns:
self : PWK

The fitted quantifier.

Examples

>>> from mlquantify.neighbors import PWK
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_samples=200, random_state=42)
>>> q = PWK(alpha=1.5, n_neighbors=5).fit(X, y)
get_metadata_routing()

Get metadata routing of this object.

See the scikit-learn User Guide for how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

predict(X)

Predict class prevalences for the given test data.

Classifies each test instance with the fitted weighted k-NN classifier and counts the fraction assigned to each class (Classify and Count).

Parameters:
X : array-like of shape (n_samples, n_features)

Test feature matrix.

Returns:
prevalences : dict or ndarray of shape (n_classes,)

Estimated class prevalences summing to 1.

Examples

>>> from mlquantify.neighbors import PWK
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_samples=200, random_state=42)
>>> q = PWK(alpha=1.5, n_neighbors=5).fit(X, y)
>>> q.predict(X)
{0: 0.49, 1: 0.51}
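
The Classify and Count aggregation described for predict can be sketched independently of the library. The helper name `classify_and_count` and the label list below are hypothetical; the sketch only shows how hard labels turn into prevalences, not how PWK implements it internally.

```python
from collections import Counter

def classify_and_count(labels, classes):
    """Return the fraction of predicted labels that fall in each class."""
    counts = Counter(labels)
    n = len(labels)
    # Classes that were never predicted get prevalence 0.0
    return {c: counts.get(c, 0) / n for c in classes}

predicted = [0, 1, 0, 1, 1, 0, 0, 0]  # hypothetical hard labels for 8 instances
prevalences = classify_and_count(predicted, classes=[0, 1])
print(prevalences)  # {0: 0.625, 1: 0.375}
```

Passing the full class list explicitly keeps a class in the result even when no test instance is assigned to it.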
save_quantifier(path: str | None = None) -> None

Save the quantifier instance to a file.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.
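
The `<component>__<parameter>` form is easiest to see on a composite estimator. The sketch below uses a scikit-learn Pipeline rather than PWK itself, and the step names `scale` and `clf` are arbitrary choices made for illustration.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Two named steps; the names become the prefixes for nested parameters
pipe = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])

# Update a parameter on each step through the double-underscore syntax
pipe.set_params(clf__C=0.5, scale__with_mean=False)

# get_params(deep=True) reports nested parameters under the same names
params = pipe.get_params(deep=True)
print(params["clf__C"], params["scale__with_mean"])
```

For a simple estimator with no nested components, set_params takes the plain constructor argument names, for example `alpha` or `n_neighbors`.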