3.3. QuaDapt: Drift-Resilient Score Adaptation
The QuaDapt method is a drift-resilient meta-quantification strategy designed to
handle situations where the classifier’s score distributions drift between
training and test domains.
Instead of assuming that the original score distributions remain stable, QuaDapt
actively adapts them at prediction time [1].
Figure: QuaDapt adaptive score simulation and distribution matching (Ortega et al., 2025).
QuaDapt is inspired by the principle behind Distribution Matching (DM) or
Mixture Model (MM) quantifiers (see Mixture Models), which estimate test prevalences by finding the
mixture of class-conditional distributions that best fits the test data.
However, while classical MM methods rely on empirical training distributions,
QuaDapt replaces them with synthetic score distributions generated for multiple
hypothetical levels of class separability [1].
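The matching principle can be written compactly. The notation below is introduced here for illustration and is not taken from the original text: \(H^{+}\) and \(H^{-}\) denote histograms of positive and negative scores, \(H^{\text{test}}\) the histogram of test scores, and \(D\) a dissimilarity measure such as the Hellinger or Topsoe distance.

\[
\hat{\alpha} \;=\; \arg\min_{\alpha \in [0,1]} \; D\!\left(\alpha\, H^{+} + (1-\alpha)\, H^{-},\; H^{\text{test}}\right)
\]

In a classical MM quantifier, \(H^{+}\) and \(H^{-}\) are estimated from training scores. Under this reading, QuaDapt would instead build them from synthetic scores, one pair per candidate level of separability.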
The method evaluates several merging factors \(m\) that control synthetic
separability and selects the one producing the closest match to the test-score
distribution.
The chosen synthetic model is then passed to an aggregative quantifier (e.g., ACC,
T50, DyS, SORD) to compute the final prevalence estimates.
This makes QuaDapt a meta-quantifier capable of adapting to score drift without relying on static training distributions.
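The selection step can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the mlquantify implementation: the helper names (`topsoe`, `histogram`, `select_merging_factor`) are hypothetical, the synthetic scores are assumed to follow a power-of-uniform form (\(u^{m}\) for positives, \(1 - u^{m}\) for negatives), and the match is scored by a grid search over mixture weights. The best-fit weight is returned only as a by-product; in QuaDapt the final estimate comes from the wrapped aggregative quantifier.

```python
import numpy as np

def topsoe(p, q, eps=1e-12):
    """Topsoe divergence between two discrete distributions (histograms)."""
    p, q = p + eps, q + eps  # avoid log(0) on empty bins
    return float(np.sum(p * np.log(2 * p / (p + q)) + q * np.log(2 * q / (p + q))))

def histogram(scores, bins=20):
    """Normalized histogram of scores on [0, 1]."""
    counts, _ = np.histogram(scores, bins=bins, range=(0.0, 1.0))
    return counts / counts.sum()

def select_merging_factor(test_scores, merging_factors, n=5000, seed=0):
    """Return (best_m, best_alpha, best_distance) over the candidate factors."""
    rng = np.random.default_rng(seed)
    test_hist = histogram(test_scores)
    alphas = np.linspace(0.0, 1.0, 101)
    best = None
    for m in merging_factors:
        # Assumed generator: positives u**m, negatives 1 - u**m, u ~ U(0, 1)
        pos_hist = histogram(rng.uniform(size=n) ** m)
        neg_hist = histogram(1.0 - rng.uniform(size=n) ** m)
        for alpha in alphas:
            mixture = alpha * pos_hist + (1.0 - alpha) * neg_hist
            d = topsoe(mixture, test_hist)
            if best is None or d < best[2]:
                best = (m, float(alpha), d)
    return best

# Simulated "drifted" test scores with a known merging factor and prevalence
rng = np.random.default_rng(42)
true_m, true_alpha, n_test = 0.3, 0.3, 5000
n_pos = int(true_alpha * n_test)
test_scores = np.concatenate([
    rng.uniform(size=n_pos) ** true_m,
    1.0 - rng.uniform(size=n_test - n_pos) ** true_m,
])

best_m, best_alpha, best_dist = select_merging_factor(test_scores, [0.1, 0.3, 0.6, 0.9])
print(best_m, best_alpha, best_dist)
```

Because each candidate \(m\) is compared against the test scores themselves, no training score distribution enters the selection, which is what makes the approach insensitive to drift in the classifier's scores.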
MoSS: Synthetic Score Generator Used by QuaDapt
QuaDapt relies on MoSS (Model for Score Simulation) to generate synthetic positive and negative score distributions with controlled overlap [2].
MoSS takes three parameters:
\(n\): number of synthetic samples
\(\alpha\): class proportion (share of positive samples)
\(m\): merging factor controlling the overlap
Positive and negative scores are generated as:
Larger values of \(m\) create more overlapping distributions, modeling weaker classifier separability or drift.
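A generator with this behavior can be sketched as follows. The exact form is an assumption, since the generating equations are not shown here: positive scores are drawn as \(u^{m}\) and negative scores as \(1 - u^{m}\), with \(u \sim U(0, 1)\) and \(m \in (0, 1]\). The function name `moss_sketch` and its `seed` argument are hypothetical and are not part of mlquantify.

```python
import numpy as np

def moss_sketch(n, alpha, m, seed=None):
    """Generate n synthetic scores in [0, 1] with positive proportion alpha.

    Assumed form: positives are u**m, negatives are 1 - u**m, u ~ U(0, 1).
    Small m pushes the classes toward 1 and 0; m = 1 makes both uniform.
    """
    rng = np.random.default_rng(seed)
    n_pos = int(round(n * alpha))
    n_neg = n - n_pos
    pos_scores = rng.uniform(size=n_pos) ** m
    neg_scores = 1.0 - rng.uniform(size=n_neg) ** m
    scores = np.concatenate([pos_scores, neg_scores])
    labels = np.concatenate([np.ones(n_pos, dtype=int), np.zeros(n_neg, dtype=int)])
    return scores, labels

def class_gap(m):
    """Difference between the mean positive and mean negative score."""
    scores, labels = moss_sketch(n=10_000, alpha=0.5, m=m, seed=0)
    return scores[labels == 1].mean() - scores[labels == 0].mean()

for m in (0.1, 0.5, 1.0):
    print(f"m={m}: mean gap = {class_gap(m):.3f}")
```

Under this assumed form the expected gap between class means is \((1 - m)/(1 + m)\), so it shrinks to zero as \(m\) approaches 1, matching the statement that larger \(m\) produces more overlap.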
Example Usage
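from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from mlquantify.meta import QuaDapt
from mlquantify.adjust_counting import ACC

# Toy binary dataset so the example runs end to end
X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, _ = train_test_split(X, y, random_state=0)

# Evaluate three merging factors and compare distributions with the Topsoe measure
q = QuaDapt(
    quantifier=ACC(RandomForestClassifier()),
    merging_factors=[0.1, 0.5, 1.0],
    measure="topsoe",
)
q.fit(X_train, y_train)
prevalence = q.predict(X_test)  # estimated class prevalences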