13.1.2. Loss Functions

Loss functions are the objectives minimised by the Solvers. Each one measures the discrepancy between the test representation and a candidate prevalence-weighted mixture of the training representations. A quantifier family is largely defined by the (representation, loss, solver) triple it composes, and the same loss is reused across several methods.

13.1.2.1. Role and mechanism

Given the per-class descriptors and a candidate prevalence vector \(p\), a loss \(\mathcal{L}(p)\) scores how well the resulting mixture reproduces the test descriptor; the solver returns the \(p\) that minimises it. The factory `get_loss` builds any of the losses below by name.
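
In symbols, the paragraph above can be sketched as follows. The notation is an assumption made here for illustration, not the library's: \(\phi_k\) is the descriptor of class \(k\), \(\phi_{\text{test}}\) the test descriptor, \(D\) the chosen discrepancy, and \(\Delta^{K-1}\) the probability simplex over \(K\) classes.

\[
\hat{p} = \arg\min_{p \in \Delta^{K-1}} \mathcal{L}(p),
\qquad
\mathcal{L}(p) = D\Big(\phi_{\text{test}},\ \sum_{k=1}^{K} p_k\,\phi_k\Big),
\qquad
\Delta^{K-1} = \Big\{ p \in \mathbb{R}^{K} : p_k \ge 0,\ \sum_{k=1}^{K} p_k = 1 \Big\}.
\]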

Loss	Objective (in words)	Used by
`DistanceLoss`	Selectable distribution distance (Hellinger, Topsoe, prob-symm, …).	DyS, HDy, HDx
`LeastSquaresLoss`	Squared error \(\lVert y - X p \rVert^2\) of the linear system.	GACC, GPACC, FM, MMD_RKHS
`HellingerSurrogateLoss`	\(-\sum_i \sqrt{p_i q_i}\), a gradient-friendly squared-Hellinger surrogate.	GHDy, GHDx, KDEyHD
`EnergyLoss`	Energy-distance quadratic \(p^\top(2q - M p)\).	EDy, EDx
`NegativeLogLikelihoodLoss`	Negative log-likelihood of the mixture density.	EMQ, KDEyML, GKDEyML, MLPE
`MixtureNegativeLogLikelihoodLoss`, `RegularizedMixtureNLLLoss`	Mixture NLL built from per-class likelihoods, optionally with a simplex-smoothness penalty.	likelihood-compose quantifiers
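
To make the objectives in the table concrete, here is a minimal NumPy sketch of four of them. These are standalone illustrations, not the library's implementations: the function names, argument order, and the `eps` guard are assumptions, and reading \(q\) as a vector of class-to-test distances and \(M\) as a matrix of class-to-class distances in the energy quadratic is an interpretation of the formula in the table.

```python
import numpy as np

def least_squares_loss(p, X, y):
    """Squared error ||y - X p||^2; X has one column per class."""
    residual = y - X @ p
    return float(residual @ residual)

def hellinger_surrogate_loss(mixture, test):
    """-sum_i sqrt(mixture_i * test_i) for two normalised histograms."""
    return float(-np.sum(np.sqrt(mixture * test)))

def energy_loss(p, q, M):
    """Quadratic p^T (2 q - M p), as written in the table."""
    return float(p @ (2.0 * q - M @ p))

def negative_log_likelihood(p, likelihoods, eps=1e-12):
    """-sum_n log(sum_k p_k * likelihoods[n, k]); eps guards against log(0)."""
    return float(-np.sum(np.log(likelihoods @ p + eps)))

# Toy binary problem: the columns of X are the class-0 and class-1 histograms.
X = np.array([[0.7, 0.2],
              [0.3, 0.8]])
p_true = np.array([0.4, 0.6])
p_off = np.array([0.5, 0.5])
y = X @ p_true  # noiseless test histogram built from the true prevalence

ls_true = least_squares_loss(p_true, X, y)          # zero at the true prevalence
ls_off = least_squares_loss(p_off, X, y)            # strictly larger elsewhere
hs_true = hellinger_surrogate_loss(X @ p_true, y)   # lowest value, about -1
hs_off = hellinger_surrogate_loss(X @ p_off, y)     # closer to zero
```

In this noiseless toy problem both the least-squares and the Hellinger-surrogate objectives are lowest at the true prevalence, which is the property a solver relies on.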

`BaseLoss` defines the callable interface; custom losses subclass it.
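
The subclassing pattern can be sketched without the library. The exact interface of `BaseLoss` is not described in this section, so the example below uses a hypothetical stand-in, `SketchBaseLoss`, whose only requirement is a `__call__(mixture, test)` method; that signature and the `AbsoluteErrorLoss` example are assumptions for illustration.

```python
from abc import ABC, abstractmethod

import numpy as np

class SketchBaseLoss(ABC):
    """Hypothetical stand-in for a callable loss interface (not mlquantify's BaseLoss)."""

    @abstractmethod
    def __call__(self, mixture, test):
        """Return a scalar discrepancy between a mixture and a test descriptor."""

class AbsoluteErrorLoss(SketchBaseLoss):
    """Illustrative custom loss: sum of absolute differences."""

    def __call__(self, mixture, test):
        mixture = np.asarray(mixture, dtype=float)
        test = np.asarray(test, dtype=float)
        return float(np.sum(np.abs(mixture - test)))

loss = AbsoluteErrorLoss()
value = loss([0.4, 0.6], [0.5, 0.5])  # approximately 0.2
```

A real custom loss would subclass the library's `BaseLoss` instead and follow whatever method signature it actually declares.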

13.1.2.2. Choosing a loss

Distance and Hellinger-surrogate losses drive the histogram-matching and KDE-matching methods; Topsoe is usually the best general-purpose distance for DyS.
Least squares implements the unified constrained-regression objective.
Energy is the closed-form quadratic behind the energy-distance methods.
Negative log-likelihood is the maximum-likelihood objective behind EM and KDE-ML quantifiers.
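
To illustrate how a selectable distance plugs into a matching method, the sketch below implements three common histogram distances and a binary prevalence search. It is written independently of the library: the function names, the `EPS` smoothing, the normalisation conventions, and the use of a plain grid search (rather than whatever solver a given quantifier uses) are all assumptions.

```python
import numpy as np

EPS = 1e-12  # guards against log(0) and division by zero

def hellinger(p, q):
    """Hellinger distance, normalised to lie in [0, 1]."""
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))

def topsoe(p, q):
    """Topsoe divergence (a symmetrised, bounded relative of KL)."""
    p, q = p + EPS, q + EPS
    return float(np.sum(p * np.log(2 * p / (p + q)) + q * np.log(2 * q / (p + q))))

def prob_symm(p, q):
    """Probabilistic symmetric chi-squared distance."""
    return float(2.0 * np.sum((p - q) ** 2 / (p + q + EPS)))

DISTANCES = {"hellinger": hellinger, "topsoe": topsoe, "prob_symm": prob_symm}

def estimate_positive_prevalence(pos_hist, neg_hist, test_hist, distance, steps=101):
    """Grid-search the positive-class prevalence that best matches the test histogram."""
    grid = np.linspace(0.0, 1.0, steps)
    scores = [distance(a * pos_hist + (1.0 - a) * neg_hist, test_hist) for a in grid]
    return float(grid[int(np.argmin(scores))])

# Toy histograms over three bins; the test histogram is a noiseless 30/70 mixture.
pos = np.array([0.1, 0.2, 0.7])
neg = np.array([0.6, 0.3, 0.1])
test = 0.3 * pos + 0.7 * neg

estimates = {
    name: estimate_positive_prevalence(pos, neg, test, dist)
    for name, dist in DISTANCES.items()
}
```

On noiseless data every distance recovers the same prevalence; the choice starts to matter once the test histogram is estimated from a finite, noisy sample.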

13.1.2.3. Used by

The loss is the second element of the (representation, loss, solver) triple; see Distribution Matching and Likelihood-Based Quantification.

13.1.2.4. Example

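from mlquantify.losses import get_loss

# Build the loss by name through the factory.
loss = get_loss("hellinger")
# Score one two-class distribution against another; the result is a scalar loss value.
value = loss([0.4, 0.6], [0.5, 0.5])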

13.1.2.5. References

See also

Representations and Solvers for the other two elements of the matching triple.