PWK
- class mlquantify.neighbors.PWK(alpha=1, n_neighbors=10, algorithm='auto', metric='euclidean', leaf_size=30, p=2, metric_params=None, n_jobs=None)
Probabilistic Weighted k-Nearest Neighbour (PWK) quantifier.
Estimates class prevalences using a k-nearest neighbour classifier with class-imbalance-aware weighting. Each neighbour’s contribution is scaled by a class-specific weight that corrects for the imbalance between class sizes, controlled by the alpha exponent.
- Parameters:
- alpha : float, default=1
Imbalance correction exponent. Higher values amplify the influence of minority-class neighbours.
- n_neighbors : int, default=10
Number of nearest neighbours considered for each test instance.
- algorithm : {'auto', 'ball_tree', 'kd_tree', 'brute'}, default='auto'
Algorithm used to find nearest neighbours.
- metric : str, default='euclidean'
Distance metric for the neighbour search.
- leaf_size : int, default=30
Leaf size for tree-based algorithms.
- p : int, default=2
Power parameter for the Minkowski metric.
- metric_params : dict or None, default=None
Additional keyword arguments for the metric function.
- n_jobs : int or None, default=None
Number of parallel jobs for the neighbour search.
- Attributes:
- estimator : PWKCLF
Fitted underlying weighted k-NN classifier.
- classes_ : ndarray of shape (n_classes,)
Class labels seen during fit.
References
[1] Barranquero, J., Díez, J., & del Coz, J. J. (2015). Quantification-Oriented Learning Based on Reliable Classifiers. Pattern Recognition, 48(2), 591–604.
Examples
>>> from mlquantify.neighbors import PWK
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_samples=200, random_state=42)
>>> q = PWK(alpha=1.5, n_neighbors=5).fit(X, y)
>>> q.predict(X)
{0: 0.49, 1: 0.51}
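The class-weighted vote described in the summary can be sketched with plain NumPy. This is an illustration only, not the library's implementation: the helper name weighted_knn_labels and the class_weights mapping are hypothetical, and the weights are passed in directly because the formula that links them to alpha is not given on this page.

```python
import numpy as np

def weighted_knn_labels(X_train, y_train, X_test, class_weights, k=3):
    """Hypothetical helper: k-NN vote in which each neighbour counts
    with the weight assigned to its class."""
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    y_train = np.asarray(y_train)
    classes = np.unique(y_train)
    # Pairwise Euclidean distances, shape (n_test, n_train).
    dists = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    # Indices of the k closest training points for each test point.
    nearest = np.argsort(dists, axis=1)[:, :k]
    labels = []
    for idx in nearest:
        neighbour_classes = y_train[idx]
        # Score per class: number of neighbours times the class weight.
        scores = [class_weights[c] * np.sum(neighbour_classes == c) for c in classes]
        labels.append(classes[int(np.argmax(scores))])
    return np.array(labels)

# Toy data: class 0 is the majority, class 1 the minority.
X_train = [[0.0], [0.1], [0.2], [0.3], [1.0]]
y_train = [0, 0, 0, 0, 1]
X_test = [[0.8]]

# Equal weights: two of the three neighbours are class 0, so class 0 wins.
plain = weighted_knn_labels(X_train, y_train, X_test, {0: 1.0, 1: 1.0}, k=3)
# Down-weighting the majority class lets the single minority neighbour win.
weighted = weighted_knn_labels(X_train, y_train, X_test, {0: 0.25, 1: 1.0}, k=3)
```

With equal weights the test point is labelled 0; with the majority class down-weighted to 0.25 it is labelled 1, which is the kind of shift the imbalance correction is meant to produce.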
- classify(X)
Classify test instances using the underlying weighted k-NN estimator.
Returns hard class labels produced by PWKCLF without any prevalence-level aggregation.
- Parameters:
- X : array-like of shape (n_samples, n_features)
Test feature matrix.
- Returns:
- labels : ndarray of shape (n_samples,)
Predicted class label for each test instance.
Examples
>>> from mlquantify.neighbors import PWK
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_samples=200, random_state=42)
>>> q = PWK(alpha=1.5, n_neighbors=5).fit(X, y)
>>> q.classify(X[:5])
array([0, 1, 0, 1, 0])
- fit(X, y)
Fit the PWK quantifier to the training data.
Builds a CC quantifier around the underlying weighted k-NN classifier and fits it on the provided data.
- Parameters:
- X : array-like of shape (n_samples, n_features)
Training feature matrix.
- y : array-like of shape (n_samples,)
Training class labels.
- Returns:
- self : PWK
The fitted quantifier.
Examples
>>> from mlquantify.neighbors import PWK
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_samples=200, random_state=42)
>>> q = PWK(alpha=1.5, n_neighbors=5).fit(X, y)
- get_metadata_routing()
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routing : MetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params : dict
Parameter names mapped to their values.
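This method follows the scikit-learn estimator convention. The sketch below uses scikit-learn's KNeighborsClassifier as a stand-in, on the assumption that PWK behaves the same way. It has not been checked against mlquantify itself.

```python
from sklearn.neighbors import KNeighborsClassifier

# Stand-in estimator; PWK is assumed to follow the same convention.
knn = KNeighborsClassifier(n_neighbors=5)

# Constructor arguments come back as a plain dict, defaults included.
params = knn.get_params()
n_neighbors = params["n_neighbors"]  # value passed to the constructor
leaf_size = params["leaf_size"]      # untouched default
```

Because the dict mirrors the constructor signature, it can be used to rebuild an equivalent, unfitted estimator with `KNeighborsClassifier(**params)`.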
- predict(X)
Predict class prevalences for the given test data.
Classifies each test instance with the fitted weighted k-NN classifier and counts the fraction assigned to each class (Classify and Count).
- Parameters:
- X : array-like of shape (n_samples, n_features)
Test feature matrix.
- Returns:
- prevalences : dict or ndarray of shape (n_classes,)
Estimated class prevalences summing to 1.
Examples
>>> from mlquantify.neighbors import PWK
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_samples=200, random_state=42)
>>> q = PWK(alpha=1.5, n_neighbors=5).fit(X, y)
>>> q.predict(X)
{0: 0.49, 1: 0.51}
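The Classify and Count aggregation step can be sketched independently of the classifier. The helper name classify_and_count is hypothetical, and the hard labels are made up for illustration rather than produced by PWK.

```python
import numpy as np

def classify_and_count(labels, classes):
    """Hypothetical helper: fraction of predictions assigned to each class."""
    labels = np.asarray(labels)
    # Count how many predictions fall into each class, in the given order.
    counts = np.array([np.sum(labels == c) for c in classes], dtype=float)
    # Normalise so the estimated prevalences sum to 1.
    return counts / counts.sum()

hard_labels = [0, 1, 0, 1, 0]  # illustrative classifier output
prevalences = classify_and_count(hard_labels, classes=[0, 1])
```

Three of the five labels are class 0 and two are class 1, so the estimate is 0.6 and 0.4. The sketch assumes at least one prediction. An empty input would need separate handling.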
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **params : dict
Estimator parameters.
- Returns:
- self : estimator instance
Estimator instance.
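The <component>__<parameter> form can be illustrated with scikit-learn objects. KNeighborsClassifier and StandardScaler are stand-ins chosen for this sketch, and the step names "scale" and "knn" are arbitrary. PWK is assumed, not verified, to accept the same calls.

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Simple estimator: parameters are addressed by their own names.
single = KNeighborsClassifier().set_params(n_neighbors=3)

# Nested object: each step is addressed as <step name>__<parameter>.
pipe = Pipeline([("scale", StandardScaler()), ("knn", KNeighborsClassifier())])
pipe.set_params(knn__n_neighbors=7, scale__with_mean=False)
```

set_params returns the estimator itself, so the call can be chained as in the first statement.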