.. _losses:

.. currentmodule:: mlquantify.losses

==============
Loss Functions
==============

Loss functions are the objectives minimised by the :ref:`solvers`. Each one
measures the discrepancy between a test representation and a candidate
prevalence-weighted mixture of the training representations. A quantifier family
is largely defined by the ``(representation, loss, solver)`` triple it composes,
and the same loss is reused across several methods.

Role and mechanism
==================

Given the per-class descriptors and a candidate prevalence :math:`p`, a loss
:math:`\mathcal{L}(p)` scores how well the mixture reproduces the test
descriptor; the solver returns the :math:`p` that minimises it. The factory
:func:`get_loss` builds any of the losses below by name.
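
As a rough illustration of this mechanism, the sketch below uses plain NumPy
rather than the ``mlquantify`` API. The histograms, the ``hellinger_loss``
helper and the grid search standing in for a solver are all assumptions made
for the example, and the test descriptor is built as an exact 30/70 mixture so
that the minimiser is known in advance.

.. code-block:: python

   import numpy as np

   # Hypothetical per-class descriptors: one 4-bin histogram per class (columns).
   class_hists = np.array([
       [0.6, 0.1],
       [0.2, 0.2],
       [0.1, 0.3],
       [0.1, 0.4],
   ])

   # Hypothetical test descriptor, constructed as a known 30/70 mixture.
   true_p = np.array([0.3, 0.7])
   test_hist = class_hists @ true_p


   def hellinger_loss(p):
       """Squared Hellinger distance between the mixture and the test histogram."""
       mixture = class_hists @ p
       return float(0.5 * np.sum((np.sqrt(mixture) - np.sqrt(test_hist)) ** 2))


   # Stand-in for a solver: exhaustive search over binary prevalences.
   grid = np.linspace(0.0, 1.0, 101)
   losses = [hellinger_loss(np.array([a, 1.0 - a])) for a in grid]
   best = grid[int(np.argmin(losses))]
   p_hat = np.array([best, 1.0 - best])

A real solver would replace the grid with a constrained optimiser over the
probability simplex; the grid only makes the "score, then minimise" loop explicit.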

.. list-table::
   :widths: 34 38 28
   :header-rows: 1

   * - Loss
     - Objective (in words)
     - Used by
   * - :class:`DistanceLoss`
     - Selectable distribution distance (Hellinger, Topsoe, prob-symm, ...).
     - DyS, HDy, HDx
   * - :class:`LeastSquaresLoss`
     - Squared error :math:`\lVert y - X p \rVert^2` of the linear system.
     - GACC, GPACC, FM, MMD_RKHS
   * - :class:`HellingerSurrogateLoss`
     - :math:`-\sum_i \sqrt{p_i q_i}` (the negated Bhattacharyya coefficient),
       a gradient-friendly squared-Hellinger surrogate.
     - GHDy, GHDx, KDEyHD
   * - :class:`EnergyLoss`
     - Energy-distance quadratic :math:`p^\top(2q - M p)`.
     - EDy, EDx
   * - :class:`NegativeLogLikelihoodLoss`
     - Negative log-likelihood of the mixture density.
     - EMQ, KDEyML, GKDEyML, MLPE
   * - :class:`MixtureNegativeLogLikelihoodLoss`, :class:`RegularizedMixtureNLLLoss`
     - Mixture NLL built from per-class likelihoods, optionally with a
       simplex-smoothness penalty.
     - likelihood-compose quantifiers
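
The formulas in the table can be written out directly. The functions below are
a minimal NumPy sketch of those expressions, not the classes listed above: the
function names, argument order and the ``eps`` guard are assumptions, and
averaging (rather than summing) the log-likelihood is a choice made here for
illustration.

.. code-block:: python

   import numpy as np


   def least_squares(y, X, p):
       """Squared error ||y - X p||^2 of the linear system."""
       residual = y - X @ p
       return float(residual @ residual)


   def hellinger_surrogate(p_bins, q_bins):
       """Negated Bhattacharyya coefficient, -sum_i sqrt(p_i q_i)."""
       return float(-np.sum(np.sqrt(p_bins * q_bins)))


   def energy_quadratic(p, q, M):
       """Energy-distance quadratic p^T (2 q - M p)."""
       return float(p @ (2.0 * q - M @ p))


   def mixture_nll(p, likelihoods, eps=1e-12):
       """Mean negative log-likelihood of the mixture density.

       likelihoods[n, k] is the density of test item n under class k.
       """
       mixture = likelihoods @ p
       return float(-np.mean(np.log(mixture + eps)))


   p = np.array([0.3, 0.7])
   X = np.array([[0.8, 0.1], [0.2, 0.9]])
   y = X @ p
   ls_value = least_squares(y, X, p)  # zero residual by construction
   hs_value = hellinger_surrogate(np.array([0.5, 0.5]), np.array([0.5, 0.5]))

The surrogate reaches its minimum of ``-1`` when the two distributions coincide,
which is where the squared Hellinger distance is zero.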

:class:`BaseLoss` defines the callable interface; custom losses subclass it.

Choosing a loss
===============

- **Distance / Hellinger-surrogate** losses drive the histogram and KDE matching
  methods; Topsoe is often a strong default distance for
  :class:`~mlquantify.matching.DyS`, though the best choice depends on the data.
- **Least squares** implements the constrained-regression objective of the
  unified framework (Firat, 2016).
- **Energy** is the closed-form quadratic behind the energy-distance methods.
- **Negative log-likelihood** is the maximum-likelihood objective behind EM
  (Saerens et al., 2002) and the KDE-ML quantifiers.
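
The link between the negative log-likelihood and EM can be sketched in NumPy.
The code below follows the prior-update iteration of Saerens et al. (2002) and
is only an illustration: ``em_prevalence``, ``mixture_nll`` and the toy
posteriors are hypothetical and are not the library's EMQ implementation.

.. code-block:: python

   import numpy as np


   def mixture_nll(prev, posteriors, train_prev):
       """Mixture NLL, up to an additive constant that does not depend on prev."""
       ratios = posteriors / train_prev  # proportional to per-class likelihoods
       return float(-np.mean(np.log(ratios @ prev)))


   def em_prevalence(posteriors, train_prev, max_iter=1000, tol=1e-10):
       """Estimate test prevalences by EM, starting from the training priors."""
       prev = train_prev.copy()
       for _ in range(max_iter):
           # E-step: re-weight the posteriors by the current/training prior ratio.
           weighted = posteriors * (prev / train_prev)
           weighted /= weighted.sum(axis=1, keepdims=True)
           # M-step: the new prevalence is the mean adjusted posterior.
           new_prev = weighted.mean(axis=0)
           converged = np.abs(new_prev - prev).sum() < tol
           prev = new_prev
           if converged:
               break
       return prev


   # Toy classifier posteriors for four test items and balanced training priors.
   posteriors = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.7, 0.3]])
   train_prev = np.array([0.5, 0.5])

   p_hat = em_prevalence(posteriors, train_prev)
   nll_start = mixture_nll(train_prev, posteriors, train_prev)
   nll_end = mixture_nll(p_hat, posteriors, train_prev)

Each EM iteration cannot increase this objective, so ``nll_end`` is no larger
than ``nll_start``.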

Used by
=======

The loss is the second element of the ``(representation, loss, solver)`` triple;
see :ref:`distribution_matching` and :ref:`likelihood`.

Example
=======

.. code-block:: python

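   from mlquantify.losses import get_loss

   # Build a loss by name through the factory.
   loss = get_loss("hellinger")
   # Evaluate it on two discrete distributions (each sums to 1).
   value = loss([0.4, 0.6], [0.5, 0.5])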

References
==========

.. dropdown:: References

   - González-Castro, V., Alaiz-Rodríguez, R., & Alegre, E. (2013). Class
     Distribution Estimation Based on the Hellinger Distance. *Information
     Sciences*, 218, 146–164.
   - Firat, A. (2016). Unified Framework for Quantification. *arXiv:1606.00868*.
   - Saerens, M., Latinne, P., & Decaestecker, C. (2002). Adjusting the Outputs
     of a Classifier to New a Priori Probabilities. *Neural Computation*, 14(1),
     21–41.

.. seealso::

   :ref:`representations` and :ref:`solvers` for the other two elements of the
   matching triple.