NPP

class mlquantify.evaluation.protocol.NPP(models: List[str | Quantifier] | str | Quantifier, learner: BaseEstimator | None = None, n_jobs: int = 1, random_state: int = 32, verbose: bool = False, return_type: str = 'predictions', measures: List[str] | None = None)

Natural Prevalence Protocol.

This protocol splits the test set into several samples of varying size and repeats the sampling for n iterations. For each quantifier in the list, it runs training and testing and returns either a table of results with error measures or just the predictions.
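The sampling idea can be illustrated with a minimal, library-independent sketch. The function name `natural_prevalence_samples` is hypothetical, and drawing uniformly without replacement is an assumption made for illustration; it is not taken from the mlquantify implementation.

```python
import random
from collections import Counter


def natural_prevalence_samples(labels, batch_sizes, n_iterations=1, seed=32):
    """Yield (batch_size, indices, prevalence) for uniformly drawn samples.

    Illustrative only: samples are drawn without replacement and without
    steering the class distribution, so each one roughly keeps the
    prevalence already present in the test labels.
    """
    rng = random.Random(seed)
    classes = sorted(set(labels))
    for _ in range(n_iterations):
        for size in batch_sizes:
            idx = rng.sample(range(len(labels)), size)
            counts = Counter(labels[i] for i in idx)
            # Real prevalence of each class inside this sample.
            prevalence = {c: counts[c] / size for c in classes}
            yield size, idx, prevalence


# Toy test labels: 70% class 0, 30% class 1.
y_test = [0] * 70 + [1] * 30
samples = list(
    natural_prevalence_samples(y_test, batch_sizes=[10, 50], n_iterations=2)
)
for size, _, prevalence in samples:
    print(size, prevalence)
```

Each iteration produces one sample per batch size, so two batch sizes over two iterations give four samples in total.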

Parameters:
models : Union[List[Union[str, Quantifier]], str, Quantifier]

List of quantification models, a single model name, or ‘all’ for all models.

batch_size : Union[List[int], int]

Size of the batches to be processed, or a list of sizes.

learner : BaseEstimator, optional

Machine learning model to be used with the quantifiers. Required for model methods.

n_iterations : int, optional

Number of iterations for the protocol. Default is 1.

n_jobs : int, optional

Number of jobs to run in parallel. Default is 1.

random_state : int, optional

Seed for random number generation. Default is 32.

verbose : bool, optional

Whether to print progress messages. Default is False.

return_type : str, optional

Type of return value (‘predictions’ or ‘table’). Default is ‘predictions’.

measures : List[str], optional

List of error measures to calculate. Must be in MEASURES or None. Default is None.
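The constraint that measures must be in MEASURES or None could be checked along the following lines. This is only a sketch: `EXAMPLE_MEASURES`, its keys `"ae"` and `"se"`, and `resolve_measures` are hypothetical stand-ins, since the actual contents of MEASURES are not listed on this page.

```python
def absolute_error(real, pred):
    """Mean absolute difference between two prevalence vectors."""
    return sum(abs(r - p) for r, p in zip(real, pred)) / len(real)


def squared_error(real, pred):
    """Mean squared difference between two prevalence vectors."""
    return sum((r - p) ** 2 for r, p in zip(real, pred)) / len(real)


# Hypothetical registry; the library's real MEASURES mapping may differ.
EXAMPLE_MEASURES = {"ae": absolute_error, "se": squared_error}


def resolve_measures(measures):
    """Map measure names to functions, rejecting unknown names."""
    if measures is None:
        return {}
    unknown = [m for m in measures if m not in EXAMPLE_MEASURES]
    if unknown:
        raise ValueError(f"Unknown measures: {unknown}")
    return {m: EXAMPLE_MEASURES[m] for m in measures}


real = [0.7, 0.3]
pred = [0.6, 0.4]
scores = {
    name: fn(real, pred) for name, fn in resolve_measures(["ae", "se"]).items()
}
print(scores)
```

Validating the names up front means a typo in `measures` fails before any model is trained, rather than after the whole protocol has run.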

Attributes:
models : List[Quantifier]

List of quantification models.

batch_size : Union[List[int], int]

Size of the batches to be processed.

learner : BaseEstimator

Machine learning model to be used with the quantifiers.

n_iterations : int

Number of iterations for the protocol.

n_jobs : int

Number of jobs to run in parallel.

random_state : int

Seed for random number generation.

verbose : bool

Whether to print progress messages.

return_type : str

Type of return value (‘predictions’ or ‘table’).

measures : List[str]

List of error measures to calculate.

fit(X_train, y_train)

Fits the models with the training data.

Parameters:
X_train : np.ndarray

Features of the training set.

y_train : np.ndarray

Labels of the training set.

Returns:
Protocol

Fitted protocol.
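Because fit returns the fitted protocol, calls can be chained. The sketch below shows that pattern with stand-in classes: `ToyProtocol` and `PriorQuantifier` are hypothetical and do not mirror the library's internals.

```python
from collections import Counter


class PriorQuantifier:
    """Toy quantifier that always predicts the training prevalence."""

    def fit(self, X, y):
        counts = Counter(y)
        self.classes_ = sorted(counts)
        self.prevalence_ = [counts[c] / len(y) for c in self.classes_]
        return self

    def predict(self, X):
        return list(self.prevalence_)


class ToyProtocol:
    """Hypothetical protocol holding a list of quantifiers."""

    def __init__(self, models):
        self.models = list(models)

    def fit(self, X_train, y_train):
        # Fit every model on the same training data.
        for model in self.models:
            model.fit(X_train, y_train)
        # Returning self allows protocol.fit(...).<next call> chaining.
        return self


X_train = [[0.1], [0.4], [0.8], [0.9]]
y_train = [0, 0, 0, 1]
protocol = ToyProtocol([PriorQuantifier()]).fit(X_train, y_train)
print(protocol.models[0].predict(X_train))
```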

predict(X_test: ndarray, y_test: ndarray) → Any

Predicts the prevalence for the test set.

Parameters:
X_test : np.ndarray

Features of the test set.

y_test : np.ndarray

Labels of the test set.

Returns:
Any

Predictions for the test set: either a table or a tuple with the quantifier names, real prevalence, and predicted prevalence.
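To make the two return shapes concrete, the following sketch turns a (names, real prevalence, predicted prevalence) tuple into table-like rows. The helper `predictions_to_rows`, the column names, and the model names are all hypothetical; the exact layout of the library's table is not described here.

```python
def predictions_to_rows(names, real_prevs, pred_prevs):
    """Build one row per evaluated sample, with a mean absolute error column."""
    rows = []
    for name, real, pred in zip(names, real_prevs, pred_prevs):
        abs_error = sum(abs(r - p) for r, p in zip(real, pred)) / len(real)
        rows.append(
            {
                "quantifier": name,
                "real_prev": real,
                "pred_prev": pred,
                "abs_error": abs_error,
            }
        )
    return rows


# Placeholder values standing in for a 'predictions'-style tuple.
names = ["model_a", "model_b"]
real_prevs = [[0.5, 0.5], [0.5, 0.5]]
pred_prevs = [[0.5, 0.5], [0.25, 0.75]]

rows = predictions_to_rows(names, real_prevs, pred_prevs)
for row in rows:
    print(row)
```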

predict_protocol(X_test, y_test) → tuple

Abstract method that every protocol must implement.

Parameters:
X_test : np.ndarray

Features of the test set.

y_test : np.ndarray

Labels of the test set.

Returns:
np.ndarray

Predictions for the test set, in the same format as the column names attribute.

sout(msg)

Prints a message if verbose is True.
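A verbose-gated print helper of this kind might look like the sketch below. `ToyProtocol` is a hypothetical stand-in, and the library may format its messages differently.

```python
import io
from contextlib import redirect_stdout


class ToyProtocol:
    """Hypothetical class showing a verbose-gated print helper."""

    def __init__(self, verbose=False):
        self.verbose = verbose

    def sout(self, msg):
        # Stay silent unless the caller asked for progress messages.
        if self.verbose:
            print(msg)


# Capture stdout to show that only the verbose instance prints.
buffer = io.StringIO()
with redirect_stdout(buffer):
    ToyProtocol(verbose=False).sout("hidden")
    ToyProtocol(verbose=True).sout("shown")
captured = buffer.getvalue()
print(captured, end="")
```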