13.1.2. Loss Functions

Loss functions are the objectives that the Solvers minimise. Each one measures the discrepancy between a test representation and a candidate prevalence-weighted mixture of the training representations. A quantifier family is largely defined by the (representation, loss, solver) triple it composes, and the same loss is reused across several methods.

13.1.2.1. Role and mechanism

Given the per-class descriptors and a candidate prevalence \(p\), a loss \(\mathcal{L}(p)\) scores how well the mixture reproduces the test descriptor; the solver returns the \(p\) that minimises it. The factory get_loss builds any of the losses below by name.
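The mechanism can be illustrated without the library. The following is a minimal sketch, assuming a binary problem, histogram descriptors, and a plain grid search in place of a real solver. The names mixture, hellinger_sq and grid_solver are hypothetical helpers written for this illustration and are not part of the mlquantify API.

```python
import math

def mixture(p, class_descriptors):
    """Prevalence-weighted mixture of per-class descriptors (equal-length lists)."""
    n_bins = len(class_descriptors[0])
    return [
        sum(p_k * d[b] for p_k, d in zip(p, class_descriptors))
        for b in range(n_bins)
    ]

def hellinger_sq(u, v):
    """Squared Hellinger distance between two discrete distributions."""
    return 1.0 - sum(math.sqrt(a * b) for a, b in zip(u, v))

def grid_solver(loss, test_descriptor, class_descriptors, steps=100):
    """Binary case only: scan p = (1 - a, a) on a grid and keep the minimiser."""
    best_p, best_val = None, float("inf")
    for i in range(steps + 1):
        a = i / steps
        p = [1.0 - a, a]
        val = loss(mixture(p, class_descriptors), test_descriptor)
        if val < best_val:
            best_p, best_val = p, val
    return best_p, best_val

# Illustrative per-class histograms (made-up numbers).
neg_hist = [0.6, 0.3, 0.1]
pos_hist = [0.1, 0.3, 0.6]

# Synthetic test descriptor built with a true positive prevalence of 0.3.
test_hist = mixture([0.7, 0.3], [neg_hist, pos_hist])

p_hat, loss_value = grid_solver(hellinger_sq, test_hist, [neg_hist, pos_hist])
print(p_hat, loss_value)
```

Because the test descriptor here is an exact mixture of the class descriptors, the minimiser coincides with the prevalence used to build it. On real data the minimum is generally nonzero and the recovered prevalence is only an estimate.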

| Loss | Objective (in words) | Used by |
|---|---|---|
| DistanceLoss | Selectable distribution distance (Hellinger, Topsoe, prob-symm, …). | DyS, HDy, HDx |
| LeastSquaresLoss | Squared error \(\lVert y - X p \rVert^2\) of the linear system. | GACC, GPACC, FM, MMD_RKHS |
| HellingerSurrogateLoss | \(-\sum_i \sqrt{p_i q_i}\), a gradient-friendly squared-Hellinger surrogate. | GHDy, GHDx, KDEyHD |
| EnergyLoss | Energy-distance quadratic \(p^\top(2q - M p)\). | EDy, EDx |
| NegativeLogLikelihoodLoss | Negative log-likelihood of the mixture density. | EMQ, KDEyML, GKDEyML, MLPE |
| MixtureNegativeLogLikelihoodLoss, RegularizedMixtureNLLLoss | Mixture NLL built from per-class likelihoods, optionally with a simplex-smoothness penalty. | likelihood-compose quantifiers |

BaseLoss defines the callable interface; custom losses subclass it.
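The objectives listed above are short enough to write out directly. The functions below are a standalone sketch of the formulas only, using the symbols as they appear in the table. They are hypothetical helpers, not the library's classes, and they make no claim about the argument order, normalisation, or reduction (sum versus mean) that mlquantify actually uses.

```python
import math

def least_squares(y, X, p):
    """Squared error ||y - X p||^2, with X given as a list of rows."""
    residual = [
        y_i - sum(x_ij * p_j for x_ij, p_j in zip(row, p))
        for y_i, row in zip(y, X)
    ]
    return sum(r * r for r in residual)

def hellinger_surrogate(p, q):
    """-sum_i sqrt(p_i q_i); equals squared Hellinger distance minus one."""
    return -sum(math.sqrt(a * b) for a, b in zip(p, q))

def energy_quadratic(p, q, M):
    """p^T (2 q - M p), with M given as a list of rows."""
    Mp = [sum(m_ij * p_j for m_ij, p_j in zip(row, p)) for row in M]
    return sum(p_i * (2.0 * q_i - mp_i) for p_i, q_i, mp_i in zip(p, q, Mp))

def mixture_nll(p, likelihoods):
    """-sum_n log(sum_k p_k f_k(x_n)), where likelihoods[n][k] = f_k(x_n)."""
    return -sum(
        math.log(sum(p_k * f for p_k, f in zip(p, row)))
        for row in likelihoods
    )

# Tiny made-up inputs, for illustration only.
p = [0.5, 0.5]
ls_value = least_squares([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], p)
hs_value = hellinger_surrogate(p, [0.5, 0.5])
en_value = energy_quadratic(p, [1.0, 2.0], [[0.0, 1.0], [1.0, 0.0]])
nll_value = mixture_nll(p, [[0.2, 0.4]])
```

Each function takes a candidate prevalence vector and precomputed descriptors and returns a scalar, which is what lets a generic solver treat the losses interchangeably.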

13.1.2.2. Choosing a loss

  • Distance and Hellinger-surrogate losses drive the histogram-matching and KDE-matching methods; for DyS, Topsoe is usually the best general-purpose distance.

  • Least squares implements the constrained-regression objective of the unified framework (Firat, 2016).

  • Energy is the closed-form quadratic behind the energy-distance methods.

  • Negative log-likelihood is the maximum-likelihood objective behind EM and KDE-ML quantifiers.
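For reference, the three distances named for DistanceLoss can be written as follows. This is a sketch using common textbook definitions. The function names, the EPS smoothing constant, and the normalisation of each distance are assumptions made for illustration, and the library's own definitions may differ by a constant factor.

```python
import math

EPS = 1e-12  # assumed smoothing constant to avoid log(0) and division by zero

def hellinger(p, q):
    """Hellinger distance, normalised to lie in [0, 1]."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))  # Bhattacharyya coefficient
    return math.sqrt(max(0.0, 1.0 - bc))

def topsoe(p, q):
    """Topsoe divergence (twice the Jensen-Shannon divergence)."""
    total = 0.0
    for a, b in zip(p, q):
        a, b = a + EPS, b + EPS
        total += a * math.log(2.0 * a / (a + b)) + b * math.log(2.0 * b / (a + b))
    return total

def prob_symm(p, q):
    """Probabilistic symmetric chi-squared distance."""
    return 2.0 * sum((a - b) ** 2 / (a + b + EPS) for a, b in zip(p, q))

# Compare the three distances on the same pair of distributions.
u, v = [0.4, 0.6], [0.5, 0.5]
distances = {f.__name__: f(u, v) for f in (hellinger, topsoe, prob_symm)}
print(distances)
```

All three are symmetric and vanish when the two distributions coincide, so they rank candidate mixtures similarly near the optimum. They differ mainly in how strongly they penalise mismatches in low-mass bins.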

13.1.2.3. Used by

The loss is the second element of the (representation, loss, solver) triple; see Distribution Matching and Likelihood-Based Quantification.

13.1.2.4. Example

from mlquantify.losses import get_loss

# Build a loss by name through the factory.
loss = get_loss("hellinger")

# Evaluate the loss on two discrete distributions.
value = loss([0.4, 0.6], [0.5, 0.5])

13.1.2.5. References

  • González-Castro, V., Alaiz-Rodríguez, R., & Alegre, E. (2013). Class Distribution Estimation Based on the Hellinger Distance. Information Sciences, 218, 146–164.

  • Firat, A. (2016). Unified Framework for Quantification. arXiv:1606.00868.

  • Saerens, M., Latinne, P., & Decaestecker, C. (2002). Adjusting the Outputs of a Classifier to New a Priori Probabilities. Neural Computation, 14(1), 21–41.

See also

Representations and Solvers for the other two elements of the matching triple.