Examples
A gallery of short, self-contained scripts that illustrate how to use
mlquantify in practice. Every example renders its full source code above the
figure it produces; use the copy button in the top-right corner of a code
block to copy the script and run it as-is.
The examples are grouped by theme. If you are new to quantification, start with Introduction to quantification and Why counting fails under prior shift.
Quantification basics
Fit your first quantifier, predict class prevalence on a test sample, and compare it against the ground truth.
The motivating example for the whole field: plain Classify & Count is biased when the test prevalence drifts away from the training prevalence.
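The bias of Classify & Count can be worked out by hand. The sketch below is not mlquantify code: it assumes a fixed binary classifier with illustrative true-positive and false-positive rates (`tpr` and `fpr`, values chosen only for the demonstration) and a hypothetical helper name, and computes the fraction of items the classifier is expected to label positive at different test prevalences.

```python
def classify_and_count_expected(true_prev, tpr, fpr):
    """Expected share of items a fixed classifier labels positive.

    Positives are caught at rate `tpr`; negatives leak in at rate `fpr`.
    """
    return true_prev * tpr + (1.0 - true_prev) * fpr


# Illustrative error rates (assumptions, not measured on any dataset).
tpr, fpr = 0.8, 0.1

for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    cc = classify_and_count_expected(p, tpr, fpr)
    print(f"true prevalence={p:.1f}  CC estimate={cc:.2f}  bias={cc - p:+.2f}")
```

With these rates the estimate is 0.1 + 0.7p, so it matches the true prevalence only at p = 1/3 and drifts further off the more the test prevalence moves away from that point.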
Comparing methods
Run several quantifiers through an Artificial Prevalence Protocol and read off their bias and variance from true-vs-predicted diagonal plots.
Visualise the mechanism behind HDy and DyS: match the test score histogram with a mixture of the class-conditional histograms.
Watch the Expectation-Maximisation loop iteratively correct the class prior until the adjusted prevalence estimate converges close to the true value.
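The Expectation-Maximisation loop mentioned above can be sketched in a few lines for the binary case. This is a hand-written illustration using only the standard library, not the mlquantify implementation; the function name, the one-dimensional Gaussian data and the closed-form posterior are all assumptions made so that the example is self-contained.

```python
import math
import random


def em_prior_correction(posteriors, train_prev, n_iter=200, tol=1e-8):
    """Estimate the test prevalence of the positive class by EM.

    `posteriors` are P(y=1 | x) from a classifier calibrated for `train_prev`.
    """
    q = train_prev  # start from the training prior
    for _ in range(n_iter):
        # E-step: re-weight each posterior by the ratio of current to training prior.
        adjusted = []
        for p in posteriors:
            pos = (q / train_prev) * p
            neg = ((1.0 - q) / (1.0 - train_prev)) * (1.0 - p)
            adjusted.append(pos / (pos + neg))
        # M-step: the new prior is the mean of the adjusted posteriors.
        new_q = sum(adjusted) / len(adjusted)
        converged = abs(new_q - q) < tol
        q = new_q
        if converged:
            break
    return q


# Toy data (assumption): negatives ~ N(0, 1), positives ~ N(2, 1).
rng = random.Random(0)
true_prev = 0.8
xs = [rng.gauss(2.0, 1.0) if rng.random() < true_prev else rng.gauss(0.0, 1.0)
      for _ in range(5000)]

# Exact posterior for a balanced training set (prior 0.5): sigmoid(2x - 2).
posteriors = [1.0 / (1.0 + math.exp(-(2.0 * x - 2.0))) for x in xs]

naive = sum(posteriors) / len(posteriors)  # mean posterior, no correction
estimate = em_prior_correction(posteriors, train_prev=0.5)
print(f"true={true_prev:.3f}  uncorrected={naive:.3f}  EM={estimate:.3f}")
```

The uncorrected mean posterior is pulled towards the training prior of 0.5, while the EM estimate should land much closer to the true prevalence, provided the posteriors are well calibrated.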
Evaluation and protocols
See the different test distributions that each sampling protocol produces, and when to reach for each one.
Plot quantification error as a function of how far the test prevalence drifts from the training prevalence, and compare methods on robustness.
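As a rough illustration of how a prevalence-controlled sampling protocol feeds an error-versus-shift curve, the sketch below draws test bags at a fixed grid of prevalences and measures the absolute error of a simple threshold-and-count estimate. It uses only the standard library; `draw_bag`, the Gaussian score pools and the threshold are hypothetical choices for the demonstration and do not reflect the protocol classes shipped with mlquantify.

```python
import random


def draw_bag(pos_pool, neg_pool, prevalence, bag_size, rng):
    """Sample one bag (with replacement) with a fixed share of positives."""
    n_pos = round(prevalence * bag_size)
    return rng.choices(pos_pool, k=n_pos) + rng.choices(neg_pool, k=bag_size - n_pos)


rng = random.Random(0)
# Toy score pools (assumption): positives ~ N(2, 1), negatives ~ N(0, 1).
pos_pool = [rng.gauss(2.0, 1.0) for _ in range(2000)]
neg_pool = [rng.gauss(0.0, 1.0) for _ in range(2000)]

threshold = 1.0  # midpoint between the class means, i.e. tuned for balanced data
grid = [0.1, 0.3, 0.5, 0.7, 0.9]  # chosen so prevalence * bag_size is an integer

mae = {}
for prev in grid:
    errors = []
    for _ in range(20):  # 20 bags per grid point
        bag = draw_bag(pos_pool, neg_pool, prev, 500, rng)
        estimate = sum(x > threshold for x in bag) / len(bag)
        errors.append(abs(estimate - prev))
    mae[prev] = sum(errors) / len(errors)
    print(f"prevalence={prev:.1f}  mean absolute error={mae[prev]:.3f}")
```

Because the threshold suits a balanced sample, the error should be smallest near a prevalence of 0.5 and grow towards both ends of the grid, which is the shape an error-versus-shift plot makes visible.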
Model selection and uncertainty
Select hyper-parameters with a quantification-aware grid search driven by an evaluation protocol.
Turn a point prevalence estimate into a bootstrap confidence interval and, for three classes, a confidence ellipse on the simplex.
Fix an over-confident classifier with temperature or vector scaling and watch the reliability diagram move onto the diagonal. Well-calibrated posteriors are what EMQ relies on.
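Temperature scaling, the simpler of the two calibration methods named above, divides every logit by a single scalar T chosen to minimise the negative log-likelihood on held-out labelled data. The sketch below covers only the binary case with a plain grid search and does not draw a reliability diagram; the helper names, the toy data and the over-confidence factor of 3 are assumptions for illustration, not part of mlquantify.

```python
import math
import random


def softplus(a):
    """Numerically stable log(1 + exp(a))."""
    return max(a, 0.0) + math.log1p(math.exp(-abs(a)))


def mean_nll(logits, labels, temperature):
    """Mean negative log-likelihood of binary labels under sigmoid(logit / T)."""
    total = 0.0
    for z, y in zip(logits, labels):
        sign = 1.0 if y == 1 else -1.0
        total += softplus(-sign * z / temperature)
    return total / len(labels)


# Toy validation set (assumption): balanced classes, x ~ N(2y, 1).
rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(2000)]
xs = [rng.gauss(2.0 * y, 1.0) for y in labels]

# The calibrated logit is 2x - 2; multiplying by 3 makes it over-confident.
logits = [3.0 * (2.0 * x - 2.0) for x in xs]

# Grid search over T in [0.5, 10.0] with step 0.05.
candidates = [0.5 + 0.05 * k for k in range(191)]
best_t = min(candidates, key=lambda t: mean_nll(logits, labels, t))

print(f"NLL at T=1: {mean_nll(logits, labels, 1.0):.3f}")
print(f"best T: {best_t:.2f}  NLL: {mean_nll(logits, labels, best_t):.3f}")
```

Since the logits were inflated by a factor of 3, the selected temperature should come out near 3, and dividing by it restores probabilities that a prior-correcting method such as EMQ can safely re-weight.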
Multiclass
Quantify a three-class problem and inspect the per-class accuracy and the joint uncertainty on the probability simplex.
Real-world datasets
Load well-known quantification benchmarks with the
datasets fetchers, then score quantifiers on them.
Fetch and cache benchmark datasets as a Bunch or an (X, y) tuple,
as NumPy arrays or pandas frames.
Use a fetcher’s built-in protocol to draw test bags with known prevalence and score a quantifier against them.
Synthetic datasets with make_quantification
Build controlled experiments with
make_quantification: generate labelled bags under
prior-probability shift, see how the data and its prevalence behave, then
benchmark quantifiers on it with the true prevalences in hand.
Plot a single bag in two dimensions, coloured by class, to see what the generator produces.
Watch the class clusters keep their shape while their balance shifts from bag to bag.
Compare the prevalence strategies — uniform, grid, natural and
Dirichlet — and the concentration knob.
Tune how hard the problem is with class_sep and flip_y, and see
the effect on quantification error.
Generate the other two kinds of dataset shift, see what each looks like, and learn why the best quantifier depends on the shift.
Fit on a training sample, predict many shifted bags, and score methods directly against the known prevalences.