2.7. Nearest Neighbours#
Nearest-neighbour quantifiers estimate class prevalences by leveraging the local structure of the feature space. They are non-parametric, require no distributional assumptions, and are naturally robust to non-linear decision boundaries.
2.7.1. PWK — Probabilistic Weighted k-Nearest Neighbours#
PWK (Barranquero et al., 2013) wraps a k-NN classifier with a
class-imbalance-aware weighting scheme and then applies Classify and
Count (CC) on its predictions. Each neighbour’s vote is multiplied by a
class-specific weight that corrects for the size difference between classes,
so that the minority class is not drowned out by the majority.
The weight for class \(c\) is:
where \(M = \min_c m_c\) is the smallest class size and \(\alpha\) is the imbalance-correction exponent.
Special cases:
\(\alpha = 1\) → standard PWK (weights proportional to inverse class size).
\(\alpha \to \infty\) → all weights equal to 1 (standard k-NN).
Why it exists: A standard k-NN classifier biases predictions towards the majority class. PWK’s weighting neutralises this bias, producing more accurate CC estimates under class imbalance. Barranquero et al. (2013) showed PWK outperforms standard CC + k-NN by a wide margin on imbalanced datasets.
2.7.1.1. Parameters#
Parameter |
Default |
Explanation |
|---|---|---|
|
|
Imbalance-correction exponent. Higher values reduce the penalty on larger classes:
Tune by cross-validation if you are unsure; |
|
|
Number of neighbours \(k\). Larger \(k\) gives smoother (lower variance) prevalence estimates. Use larger values on bigger datasets. Common choices: 5, 10, 20. |
|
|
k-NN search algorithm. |
|
|
Distance metric for neighbour search. Use |
|
|
Leaf size for tree-based algorithms. Affects speed, not accuracy. |
|
|
Minkowski distance parameter. |
|
|
Extra keyword arguments for custom metric functions. |
|
|
Parallel jobs for neighbour search. |
2.7.1.2. Examples#
Basic usage:
from mlquantify.neighbors import PWK
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
X, y = make_classification(n_samples=1000, weights=[0.8, 0.2],
random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.3, random_state=42)
q = PWK(alpha=1, n_neighbors=10)
q.fit(X_train, y_train)
print(q.predict(X_test))
# {0: 0.80, 1: 0.20}
Tuning alpha and k:
from mlquantify.model_selection import GridSearchQ, APP
from mlquantify.metrics import MAE
from mlquantify.neighbors import PWK
protocol = APP(batch_size=100, n_prevalences=21, repeats=5)
gs = GridSearchQ(
quantifier=PWK(),
param_grid={
'alpha': [1, 2, 5],
'n_neighbors': [5, 10, 20],
},
protocol=protocol,
error=MAE,
)
gs.fit(X_train, y_train)
print(gs.best_params_)
Getting per-instance classifications:
q = PWK(alpha=1, n_neighbors=10)
q.fit(X_train, y_train)
labels = q.classify(X_test) # hard labels from the weighted k-NN
print(labels[:10])
2.7.1.3. When to Use PWK#
When you want a simple, non-parametric quantifier without tuning a full probabilistic classifier.
When feature space distances are meaningful (normalised continuous features).
For imbalanced binary problems where a standard classifier would be biased.
Limitation: PWK applies CC on top of the k-NN classifier; it inherits CC’s bias under distributional shift. It does not apply any adjustment like ACC or EMQ. For strong shift correction, use EMQ or DyS instead.
2.7.2. References#
References
Barranquero, J., Díez, J., & del Coz, J. J. (2013). On the study of nearest neighbor algorithms for prevalence estimation in binary problems. Pattern Recognition, 46(2), 472–482.
See also
Counting-Based Quantifiers for CC (which PWK builds on top of). Likelihood-Based Quantification for EMQ (stronger shift correction).