GridSearchQ

class mlquantify.model_selection.GridSearchQ(quantifier, param_grid, protocol='app', samples_sizes=100, n_repetitions=10, scoring=<function MAE>, refit=True, val_split=0.4, n_jobs=1, random_seed=42, verbose=False)

Grid Search for Quantifiers with evaluation protocols.

This class automates the hyperparameter search over a grid of parameter combinations for a given quantifier. It evaluates each combination using a specified evaluation protocol (e.g., APP, NPP, UPP), over multiple splits of the validation data, and selects the best-performing parameters based on a chosen scoring metric such as Mean Absolute Error (MAE).
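As a minimal sketch of what "a grid of parameter combinations" means, the snippet below expands a param_grid dictionary into one dictionary per combination. The helper name expand_grid is hypothetical and this is not necessarily how GridSearchQ enumerates the grid internally; it only illustrates the Cartesian product of the listed values.

```python
from itertools import product


def expand_grid(param_grid):
    """Yield one dict per combination of the listed parameter values."""
    names = list(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))


# Hypothetical parameter names, matching the example grid used on this page.
param_grid = {"alpha": [0.1, 1.0], "beta": [10, 20]}
combinations = list(expand_grid(param_grid))  # 2 x 2 = 4 combinations
```

The number of candidates is the product of the list lengths, so every extra value multiplies the cost of the search.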

Parameters:
quantifier : BaseQuantifier

Quantifier class (not instance). It must implement fit and predict.

param_grid : dict

Dictionary where keys are parameter names and values are lists of parameter values to try.

protocol : {'app', 'npp', 'upp'}, default='app'

Evaluation protocol to use for splitting the validation data.

samples_sizes : int or list of int, default=100

Batch size(s) for evaluation splits.

n_repetitions : int, default=10

Number of random repetitions per evaluation.

scoring : callable, default=MAE

Scoring function to evaluate prevalence prediction quality. Must accept (true_prevalences, predicted_prevalences) arrays.

refit : bool, default=True

If True, refits the quantifier on the whole data using best parameters.

val_split : float, default=0.4

Fraction of data reserved for validation during parameter search.

n_jobs : int or None, default=1

Number of parallel jobs for evaluation.

random_seed : int, default=42

Random seed for reproducibility.

verbose : bool, default=False

Enable verbose output during evaluation.
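The scoring parameter accepts any callable that takes (true_prevalences, predicted_prevalences). The sketch below shows one way such a callable could look; mean_absolute_error is a hypothetical stand-in written in plain Python, not the library's MAE, and it assumes both arguments are flat sequences of per-class prevalences.

```python
def mean_absolute_error(true_prevalences, predicted_prevalences):
    """Average absolute difference between true and predicted class prevalences."""
    true_values = list(true_prevalences)
    predicted_values = list(predicted_prevalences)
    if len(true_values) != len(predicted_values):
        raise ValueError("prevalence vectors must have the same length")
    total = sum(abs(t - p) for t, p in zip(true_values, predicted_values))
    return total / len(true_values)


# Two classes: the prediction is off by 0.1 on each class.
error = mean_absolute_error([0.7, 0.3], [0.6, 0.4])
```

Since the best score is described as the lowest loss, a custom callable should return smaller values for better predictions.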

Attributes:
best_score : float

Best score (lowest loss) found during grid search.

best_params : dict

Best parameter combination found during fitting.

best_model_ : BaseQuantifier

Refitted quantifier instance with best parameters after search.

Methods

fit(X, y)

Runs grid search over param_grid, evaluates with the selected protocol, and stores best found parameters and model.

predict(X)

Predicts prevalences using the best fitted model after search.

best_params()

Returns the best parameter dictionary after fitting.

best_model()

Returns the best refitted quantifier after fitting.

sout(msg)

Utility method to print messages if verbose is enabled.

Examples

>>> from mlquantify.model_selection import GridSearchQ
>>> from mlquantify.quantifiers import SomeQuantifier  # placeholder quantifier class
>>> # MAE, X_train, y_train and X_test are assumed to be defined already.
>>> param_grid = {'alpha': [0.1, 1.0], 'beta': [10, 20]}
>>> grid_search = GridSearchQ(quantifier=SomeQuantifier,
...                          param_grid=param_grid,
...                          protocol='app',
...                          samples_sizes=100,
...                          n_repetitions=5,
...                          scoring=MAE,
...                          refit=True,
...                          val_split=0.3,
...                          n_jobs=2,
...                          random_seed=123,
...                          verbose=True)
>>> grid_search.fit(X_train, y_train)
>>> y_pred = grid_search.predict(X_test)
>>> best_params = grid_search.best_params()
>>> best_model = grid_search.best_model()

best_model()

Return the best model after fitting.

Returns:
Quantifier

The best fitted model.

Raises:
ValueError

If called before fitting.

best_params()

Return the best parameters found during fitting.

Returns:
dict

The best parameters.

Raises:
ValueError

If called before fitting.

fit(X, y)

Fit quantifiers over grid parameter combinations with evaluation protocol.

Splits data into training and validation by val_split, and evaluates each parameter combination multiple times with protocol-generated batches.

Parameters:
X : array-like

Feature matrix for training.

y : array-like

Target labels for training.

Returns:
self : object

Returns self for chaining.
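The procedure described for fit (hold out a val_split fraction, score each candidate over several batches, keep the lowest mean error) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's implementation: the helpers holdout_split and evaluate, the toy threshold rule, and the toy data are all hypothetical, and batches are drawn uniformly at random rather than at the controlled prevalences an artificial-prevalence protocol would use.

```python
import random
from statistics import mean


def holdout_split(n_items, val_split, rng):
    """Shuffle indices and hold out a val_split fraction for validation."""
    indices = list(range(n_items))
    rng.shuffle(indices)
    n_val = int(n_items * val_split)
    return indices[n_val:], indices[:n_val]


def evaluate(threshold, X, y, val_idx, batch_size, n_repetitions, rng):
    """Mean absolute prevalence error of a threshold rule over random batches."""
    errors = []
    for _ in range(n_repetitions):
        batch = rng.sample(val_idx, batch_size)
        true_prev = sum(y[i] for i in batch) / batch_size
        pred_prev = sum(X[i] >= threshold for i in batch) / batch_size
        errors.append(abs(true_prev - pred_prev))
    return mean(errors)


# Toy data: one feature in [0, 1); the label is 1 when the feature is >= 0.5.
X = [i / 200 for i in range(200)]
y = [int(x >= 0.5) for x in X]

rng = random.Random(42)
train_idx, val_idx = holdout_split(len(X), val_split=0.4, rng=rng)

# The toy rule has nothing to learn, so train_idx goes unused here; a real
# quantifier would be fitted on it before each evaluation.
scores = {
    threshold: evaluate(threshold, X, y, val_idx,
                        batch_size=20, n_repetitions=10, rng=rng)
    for threshold in [0.5, 0.2, 0.8]
}
best_threshold = min(scores, key=scores.get)  # lowest mean error wins
best_score = scores[best_threshold]
```

Averaging over several batches, rather than scoring once on the whole validation set, makes the comparison less sensitive to any single draw.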

get_metadata_routing()

Get metadata routing of this object.

Please check the User Guide on how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

predict(X)

Predict using the best found model.

Parameters:
X : array-like

Data for prediction.

Returns:
predictions : array-like

Prevalence predictions.

Raises:
RuntimeError

If called before fitting.
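A prevalence is the share of items belonging to each class. When labels for the prediction data are available, the true prevalences can be computed for comparison with the predicted ones, as in the sketch below. The helper class_prevalences is hypothetical, and the exact container predict returns (array, dict, or otherwise) is not specified here, so any comparison would need to be adapted to it.

```python
from collections import Counter


def class_prevalences(labels):
    """Map each class to its share of the labels, in sorted class order."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("labels must not be empty")
    return {cls: counts[cls] / total for cls in sorted(counts)}


# Two of five labels are class 0, three are class 1.
prevalences = class_prevalences([0, 1, 1, 0, 1])
```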

save_quantifier(path: str | None = None) -> None

Save the quantifier instance to a file.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.
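To illustrate the <component>__<parameter> naming convention, the sketch below separates an estimator's own parameters from those addressed to a nested component. The helper split_nested_params and the key quantifier__alpha are hypothetical; whether GridSearchQ exposes a nested parameter under that exact name is an assumption, not something stated on this page.

```python
def split_nested_params(params):
    """Group keyword arguments by the component named before the first '__'."""
    own, nested = {}, {}
    for key, value in params.items():
        component, delimiter, sub_key = key.partition("__")
        if delimiter:
            # 'quantifier__alpha' targets parameter 'alpha' of component 'quantifier'.
            nested.setdefault(component, {})[sub_key] = value
        else:
            own[key] = value
    return own, nested


own, nested = split_nested_params({"n_jobs": 2, "quantifier__alpha": 0.1})
```

Only the first double underscore is split, so deeper nesting such as a__b__c is passed down to component a as b__c.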

sout(msg)

Prints messages if verbose is True.