Examples

A gallery of short, self-contained scripts that illustrate how to use mlquantify in practice. Every example shows its full source code above the figure it produces. Use the copy button in the top-right corner of a code block to copy the script and run it as-is.

The examples are grouped by theme. If you are new to quantification, start with Introduction to quantification and Why counting fails under prior shift.

Quantification basics

Introduction to quantification

Fit your first quantifier, predict class prevalence on a test sample, and compare it against the ground truth.

Why counting fails under prior shift

The motivating example for the whole field: plain Classify & Count is biased when the test prevalence drifts away from the training prevalence.
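
The bias can be seen without training anything. The following is a minimal sketch, assuming a binary classifier whose true-positive and false-positive rates stay fixed under prior shift. The function name `expected_cc` and the rates 0.8 and 0.1 are illustrative assumptions, not part of mlquantify.

```python
def expected_cc(prevalence, tpr, fpr):
    """Expected Classify & Count estimate: the fraction predicted positive."""
    return prevalence * tpr + (1.0 - prevalence) * fpr


# Hypothetical classifier rates, assumed constant under prior shift.
tpr, fpr = 0.8, 0.1
for true_prev in (0.1, 0.5, 0.9):
    estimate = expected_cc(true_prev, tpr, fpr)
    print(f"true={true_prev:.2f}  CC={estimate:.2f}  bias={estimate - true_prev:+.2f}")
```

With these rates the estimate is pulled toward the middle of the range: it overshoots at low prevalence and undershoots at high prevalence.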

Comparing methods

Comparing quantifiers with diagonal plots

Run several quantifiers through an Artificial Prevalence Protocol and read off their bias and variance from true-vs-predicted diagonal plots.
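
On a diagonal plot, a systematic offset from the diagonal corresponds to bias and the scatter of the points to variance. As a rough sketch, those two summaries can be computed from the same true and predicted prevalences. The helper name and the numbers below are hypothetical, and the definitions (mean signed error, variance of the errors) are one reasonable choice rather than the library's.

```python
from statistics import mean, pvariance


def bias_and_variance(true_prevs, predicted_prevs):
    """Mean signed error (bias) and variance of the errors for one quantifier."""
    errors = [pred - true for true, pred in zip(true_prevs, predicted_prevs)]
    return mean(errors), pvariance(errors)


# Hypothetical results for one method over five test bags.
true_prevs = [0.1, 0.3, 0.5, 0.7, 0.9]
predicted = [0.2, 0.4, 0.6, 0.8, 1.0]
bias, variance = bias_and_variance(true_prevs, predicted)
print(f"bias={bias:+.3f}  variance={variance:.5f}")
```

In this made-up case every prediction sits the same distance above the diagonal, so the bias is positive and the variance is essentially zero.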

Distribution matching, step by step

Visualise the mechanism behind HDy and DyS: match the test score histogram with a mixture of the class-conditional histograms.
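
The matching step can be sketched in a few lines. This is an illustration in the spirit of HDy, assuming a Hellinger distance and a plain grid search over the mixture weight. It is not mlquantify's implementation, and the three-bin histograms are made up.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two normalised histograms."""
    overlap = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.sqrt(max(0.0, 1.0 - overlap))


def match_prevalence(pos_hist, neg_hist, test_hist, steps=100):
    """Grid-search the mixture weight whose histogram is closest to the test one."""
    best_alpha, best_dist = 0.0, float("inf")
    for k in range(steps + 1):
        alpha = k / steps
        mixture = [alpha * a + (1.0 - alpha) * b for a, b in zip(pos_hist, neg_hist)]
        dist = hellinger(mixture, test_hist)
        if dist < best_dist:
            best_alpha, best_dist = alpha, dist
    return best_alpha


# Hypothetical 3-bin score histograms for each class (each sums to 1).
pos_hist = [0.1, 0.2, 0.7]
neg_hist = [0.6, 0.3, 0.1]
# A test histogram built as a 30/70 mixture, so the answer is known in advance.
test_hist = [0.3 * a + 0.7 * b for a, b in zip(pos_hist, neg_hist)]
print(match_prevalence(pos_hist, neg_hist, test_hist))
```

The returned weight is the estimated positive prevalence. On real data the test histogram is noisy, so the minimum distance is not zero and the choice of bins matters.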

EMQ and the EM prior correction

Watch the Expectation-Maximisation loop iteratively correct the prior until the adjusted prevalence estimate converges toward the true one.
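
The loop itself is short. Below is a minimal sketch of the standard EM prior correction, assuming calibrated posteriors from a classifier trained at a known prior. The function name, the stopping rule and the toy posteriors are assumptions made for illustration and do not reflect mlquantify's API.

```python
def em_prior_correction(posteriors, train_prior, max_iter=1000, tol=1e-8):
    """Re-estimate class priors on a test sample from classifier posteriors."""
    prior = list(train_prior)
    for _ in range(max_iter):
        totals = [0.0] * len(prior)
        for row in posteriors:
            # E-step: reweight each posterior by the ratio new prior / training prior.
            weighted = [p * new / old for p, new, old in zip(row, prior, train_prior)]
            norm = sum(weighted)
            for c, w in enumerate(weighted):
                totals[c] += w / norm
        # M-step: the new prior is the mean of the corrected posteriors.
        updated = [t / len(posteriors) for t in totals]
        converged = max(abs(u - p) for u, p in zip(updated, prior)) < tol
        prior = updated
        if converged:
            break
    return prior


# Hypothetical calibrated posteriors [P(pos|x), P(neg|x)] from a classifier trained at 50/50.
# The test sample has two kinds of instance and a true positive prevalence of 0.8.
posteriors = [[9 / 11, 2 / 11]] * 76 + [[1 / 9, 8 / 9]] * 24
print(em_prior_correction(posteriors, train_prior=[0.5, 0.5]))
```

In this toy setup the estimate starts at the training prior of 0.5 and moves toward the test prevalence of 0.8. The result depends on the posteriors being well calibrated.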

Evaluation and protocols

Evaluation protocols (APP, NPP, UPP)

See the different test distributions that each sampling protocol produces, and when to reach for each one.
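
To make the idea of a sampling protocol concrete, here is a rough sketch of the artificial-prevalence idea for a binary problem: draw one test bag per target prevalence by sampling each class separately. The generator name `app_bags`, sampling with replacement and the fixed bag size are assumptions for illustration. The natural and uniform protocols are not sketched here.

```python
import random


def app_bags(labels, prevalences, bag_size, seed=0):
    """Yield (target prevalence, indices) for one bag per requested prevalence."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    for prevalence in prevalences:
        n_pos = round(prevalence * bag_size)
        # Sample each class with replacement so any prevalence is reachable.
        indices = rng.choices(pos, k=n_pos) + rng.choices(neg, k=bag_size - n_pos)
        rng.shuffle(indices)
        yield prevalence, indices


# Hypothetical pool of 100 labelled test items, 30% positive.
labels = [1] * 30 + [0] * 70
for target, indices in app_bags(labels, [0.0, 0.25, 0.5, 0.75, 1.0], bag_size=20):
    observed = sum(labels[i] for i in indices) / len(indices)
    print(f"target={target:.2f}  observed={observed:.2f}")
```

Because each bag is built to a chosen prevalence, a quantifier can be scored across the whole range of shifts rather than only near the pool's natural 30%.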

Robustness to prior-probability shift

Plot quantification error as a function of how far the test prevalence drifts from the training prevalence, and compare methods on robustness.

Model selection and uncertainty

Tuning a quantifier with GridSearchQ

Select hyper-parameters with a quantification-aware grid search driven by an evaluation protocol.

Confidence intervals and regions

Turn a point prevalence estimate into a bootstrap confidence interval and, for three classes, a confidence ellipse on the simplex.
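
The interval part can be illustrated with a percentile bootstrap. For simplicity this sketch resamples hard predictions and recomputes a plain count-based prevalence. The function name, the number of resamples and the data are hypothetical, and the three-class confidence ellipse is not covered.

```python
import random


def bootstrap_prevalence_ci(predictions, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the fraction of positive predictions."""
    rng = random.Random(seed)
    n = len(predictions)
    # Resample the test items with replacement and recompute the prevalence each time.
    estimates = sorted(
        sum(rng.choices(predictions, k=n)) / n for _ in range(n_boot)
    )
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Hypothetical hard predictions (1 = positive) on a test sample of 200 items.
predictions = [1] * 60 + [0] * 140
point = sum(predictions) / len(predictions)
low, high = bootstrap_prevalence_ci(predictions)
print(f"prevalence={point:.3f}  95% CI=({low:.3f}, {high:.3f})")
```

The same resampling loop works with any quantifier in place of the simple count, at the cost of re-running the quantifier on every resample.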

Calibrating classifier posteriors

Fix an over-confident classifier with temperature or vector scaling and watch the reliability diagram move onto the diagonal. This is the posterior calibration that EMQ relies on.
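
Temperature scaling divides the classifier's logits by a single scalar chosen on held-out data. The sketch below covers the binary case only and picks the temperature with a coarse grid search on the negative log-likelihood, as a stand-in for the gradient-based fit that is more common in practice. All names and data are hypothetical.

```python
import math


def nll(logits, labels, temperature):
    """Mean negative log-likelihood of binary labels after dividing logits by T."""
    total = 0.0
    for z, y in zip(logits, labels):
        p = 1.0 / (1.0 + math.exp(-z / temperature))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= math.log(p) if y == 1 else math.log(1.0 - p)
    return total / len(logits)


def fit_temperature(logits, labels):
    """Pick the temperature on a coarse grid (0.1 to 10.0) that minimises the NLL."""
    grid = [k / 10 for k in range(1, 101)]
    return min(grid, key=lambda t: nll(logits, labels, t))


# Hypothetical held-out logits: the model is very sure (|logit| = 4) but only 75% right.
logits = [4.0, 4.0, 4.0, 4.0, -4.0, -4.0, -4.0, -4.0]
labels = [1, 1, 1, 0, 0, 0, 0, 1]
T = fit_temperature(logits, labels)
print(f"T={T:.1f}  confidence before={1 / (1 + math.exp(-4.0)):.2f}  "
      f"after={1 / (1 + math.exp(-4.0 / T)):.2f}")
```

A fitted temperature above 1 softens the posteriors, which is the direction needed for an over-confident model.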

Multiclass

Multiclass quantification

Quantify a three-class problem and inspect the per-class accuracy and the joint uncertainty on the probability simplex.

Real-world datasets

Load well-known quantification benchmarks with the datasets fetchers, then score quantifiers on them.

Loading real-world datasets

Fetch and cache benchmark datasets, returned as a Bunch or an (X, y) tuple and holding either NumPy arrays or pandas frames.

Evaluating a quantifier on real data

Use a fetcher’s built-in protocol to draw test bags with known prevalence and score a quantifier against them.

Synthetic datasets with `make_quantification`

Build controlled experiments with make_quantification: generate labelled bags under prior-probability shift, see how the data and their prevalences behave, then benchmark quantifiers on the bags with the true prevalences in hand.

Visualizing synthetic data

Plot a single bag in two dimensions, coloured by class, to see what the generator produces.

Visualizing synthetic quantification data

Prior shift, bag by bag

Watch the class clusters keep their shape while their balance shifts from bag to bag.

Controlling prevalence variability

Compare the prevalence strategies — uniform, grid, natural and Dirichlet — and the concentration knob.

Controlling prevalence variability across bags
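
As a rough illustration of what a concentration parameter does, the sketch below draws prevalence vectors from a symmetric Dirichlet distribution using only the standard library. It assumes the generator's concentration knob behaves like the Dirichlet concentration. The helper name is hypothetical and this is not how make_quantification is called.

```python
import random


def dirichlet_prevalence(n_classes, concentration, rng):
    """Draw one prevalence vector from a symmetric Dirichlet distribution."""
    # Normalised independent Gamma draws follow a Dirichlet distribution.
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(n_classes)]
    total = sum(draws)
    return [d / total for d in draws]


rng = random.Random(0)
for concentration in (0.5, 1.0, 10.0):
    bag_prevalences = [dirichlet_prevalence(3, concentration, rng) for _ in range(3)]
    print(concentration, [[round(p, 2) for p in prev] for prev in bag_prevalences])
```

A concentration of 1 samples uniformly over the simplex. Larger values pull the bags toward balanced classes, and smaller values push them toward bags dominated by a single class.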

Class separability and label noise

Tune how hard the problem is with class_sep and flip_y, and see the effect on quantification error.

Covariate and concept shift

Generate the other two kinds of dataset shift, see how each looks, and why the best quantifier depends on the shift.

Benchmarking quantifiers

Fit on a training sample, predict many shifted bags, and score methods directly against the known prevalences.

Benchmarking quantifiers on synthetic bags
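
Scoring against known prevalences reduces to comparing two lists of vectors. A minimal sketch using mean absolute error follows. The metric choice, the function name and the numbers are illustrative assumptions, and the predictions are typed in by hand rather than produced by a fitted quantifier.

```python
def mean_absolute_error(true_prevs, predicted_prevs):
    """Average absolute difference between true and predicted prevalence vectors."""
    per_bag = [
        sum(abs(t - p) for t, p in zip(true, pred)) / len(true)
        for true, pred in zip(true_prevs, predicted_prevs)
    ]
    return sum(per_bag) / len(per_bag)


# Hypothetical true prevalences of three bags and two methods' predictions.
true_prevs = [[0.2, 0.8], [0.5, 0.5], [0.9, 0.1]]
predictions = {
    "method_a": [[0.3, 0.7], [0.5, 0.5], [0.7, 0.3]],
    "method_b": [[0.2, 0.8], [0.6, 0.4], [0.8, 0.2]],
}
for name, preds in sorted(predictions.items()):
    print(name, round(mean_absolute_error(true_prevs, preds), 3))
```

Averaging over many bags with different prevalences gives a fairer ranking than a single test sample, because methods can differ in how their error grows with the amount of shift.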