GridSearchQ
- class mlquantify.model_selection.GridSearchQ(quantifier, param_grid, protocol='app', samples_sizes=100, n_repetitions=10, scoring=<function MAE>, refit=True, val_split=0.4, n_jobs=1, random_seed=42, verbose=False)
Grid Search for Quantifiers with evaluation protocols.
This class automates the hyperparameter search over a grid of parameter combinations for a given quantifier. It evaluates each combination using a specified evaluation protocol (e.g., APP, NPP, UPP), over multiple splits of the validation data, and selects the best-performing parameters based on a chosen scoring metric such as Mean Absolute Error (MAE).
- Parameters:
- quantifier : BaseQuantifier
Quantifier class (not instance). It must implement fit and predict.
- param_grid : dict
Dictionary where keys are parameter names and values are lists of parameter values to try.
- protocol : {'app', 'npp', 'upp'}, default='app'
Evaluation protocol to use for splitting the validation data.
- samples_sizes : int or list of int, default=100
Batch size(s) for evaluation splits.
- n_repetitions : int, default=10
Number of random repetitions per evaluation.
- scoring : callable, default=MAE
Scoring function to evaluate prevalence prediction quality. Must accept (true_prevalences, predicted_prevalences) arrays.
- refit : bool, default=True
If True, refits the quantifier on the whole dataset using the best parameters.
- val_split : float, default=0.4
Fraction of data reserved for validation during parameter search.
- n_jobs : int or None, default=1
Number of parallel jobs for evaluation.
- random_seed : int, default=42
Random seed for reproducibility.
- verbose : bool, default=False
Enable verbose output during evaluation.
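To picture what "a grid of parameter combinations" means for param_grid, here is a minimal sketch that expands a dictionary of value lists into one dictionary per combination. It assumes a plain Cartesian product; the helper name expand_grid is hypothetical and is not part of mlquantify, whose internal enumeration order is not documented here.

```python
from itertools import product


def expand_grid(param_grid):
    """Return one dict per combination of the listed parameter values.

    Hypothetical helper for illustration only; not an mlquantify function.
    """
    names = sorted(param_grid)  # fixed order keeps the enumeration deterministic
    value_lists = [param_grid[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


param_grid = {"alpha": [0.1, 1.0], "beta": [10, 20]}
combinations = expand_grid(param_grid)
for params in combinations:
    print(params)
```

Two values for each of two parameters give four candidate settings, so the cost of a search grows multiplicatively with every parameter added to the grid.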
- Attributes:
- best_score : float
Best score (lowest loss) found during grid search.
- best_params : dict
Best parameters found during fitting.
- best_model_ : BaseQuantifier
Refitted quantifier instance with best parameters after search.
Methods
fit(X, y)
Runs grid search over param_grid, evaluates with the selected protocol, and stores best found parameters and model.
predict(X)
Predicts prevalences using the best fitted model after search.
best_params()
Returns the best parameter dictionary after fitting.
best_model()
Returns the best refitted quantifier after fitting.
sout(msg)
Utility method to print messages if verbose is enabled.
Examples
>>> from mlquantify.model_selection import GridSearchQ
>>> from mlquantify.quantifiers import SomeQuantifier
>>> # MAE, X_train, y_train and X_test are assumed to be already defined.
>>> param_grid = {'alpha': [0.1, 1.0], 'beta': [10, 20]}
>>> grid_search = GridSearchQ(quantifier=SomeQuantifier,
...                           param_grid=param_grid,
...                           protocol='app',
...                           samples_sizes=100,
...                           n_repetitions=5,
...                           scoring=MAE,
...                           refit=True,
...                           val_split=0.3,
...                           n_jobs=2,
...                           random_seed=123,
...                           verbose=True)
>>> grid_search.fit(X_train, y_train)
>>> y_pred = grid_search.predict(X_test)
>>> best_params = grid_search.best_params()
>>> best_model = grid_search.best_model()
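The scoring parameter accepts any callable that takes (true_prevalences, predicted_prevalences). As a rough sketch of that contract, the function below computes a mean absolute error over two prevalence vectors in plain Python. The name mean_absolute_error is hypothetical and is not the library's MAE. The sketch also assumes one-dimensional vectors, since the array shape the library passes to the scorer is not stated here.

```python
def mean_absolute_error(true_prevalences, predicted_prevalences):
    """Average absolute difference between two prevalence vectors.

    Hypothetical stand-in for a scoring callable; assumes 1-D sequences
    of equal length (one entry per class).
    """
    if len(true_prevalences) != len(predicted_prevalences):
        raise ValueError("prevalence vectors must have the same length")
    diffs = [abs(t - p) for t, p in zip(true_prevalences, predicted_prevalences)]
    return sum(diffs) / len(diffs)


# Two classes: the true split is 50/50, the estimate is 25/75.
score = mean_absolute_error([0.5, 0.5], [0.25, 0.75])
print(score)  # 0.25
```

Because best_score is described as the lowest loss, a custom scorer should return smaller values for better predictions.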
- best_model()
Return the best model after fitting.
- Returns:
- Quantifier
The best fitted model.
- Raises:
- ValueError
If called before fitting.
- best_params()
Return the best parameters found during fitting.
- Returns:
- dict
The best parameters.
- Raises:
- ValueError
If called before fitting.
- fit(X, y)
Fit quantifiers over grid parameter combinations with evaluation protocol.
Splits data into training and validation by val_split, and evaluates each parameter combination multiple times with protocol-generated batches.
- Parameters:
- X : array-like
Feature matrix for training.
- y : array-like
Target labels for training.
- Returns:
- self : object
Returns self for chaining.
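The procedure that fit describes (hold out a validation fraction, score every parameter combination over repeated batches, keep the lowest loss) can be sketched with a toy example. Everything below is an assumption made for illustration: ThresholdCounter and toy_grid_search are hypothetical names, the problem is binary, and batches are drawn uniformly at random from the validation set instead of implementing the APP, NPP, or UPP protocols. It is not the library's implementation.

```python
import random
from itertools import product


class ThresholdCounter:
    """Hypothetical toy quantifier: counts points above a threshold."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold

    def fit(self, X, y):
        return self  # nothing to learn in this toy

    def predict(self, X):
        positives = sum(1 for x in X if x > self.threshold)
        return positives / len(X)  # estimated prevalence of the positive class


def toy_grid_search(quantifier_cls, param_grid, X, y, val_split=0.4,
                    sample_size=20, n_repetitions=10, seed=42):
    rng = random.Random(seed)
    indices = list(range(len(X)))
    rng.shuffle(indices)
    n_val = int(len(indices) * val_split)
    val_idx, train_idx = indices[:n_val], indices[n_val:]

    names = sorted(param_grid)
    best_score, best_params = float("inf"), None
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        model = quantifier_cls(**params).fit([X[i] for i in train_idx],
                                             [y[i] for i in train_idx])
        errors = []
        for _ in range(n_repetitions):
            # Simplification: uniform batches, not a prevalence-controlled protocol.
            batch = rng.sample(val_idx, sample_size)
            true_prev = sum(y[i] for i in batch) / sample_size
            pred_prev = model.predict([X[i] for i in batch])
            errors.append(abs(true_prev - pred_prev))
        score = sum(errors) / len(errors)  # mean absolute error over batches
        if score < best_score:  # lower loss is better
            best_score, best_params = score, params
    return best_params, best_score


# Synthetic data: the label is 1 exactly when the feature exceeds 0.6.
data_rng = random.Random(0)
X = [data_rng.random() for _ in range(200)]
y = [1 if x > 0.6 else 0 for x in X]

best_params, best_score = toy_grid_search(
    ThresholdCounter, {"threshold": [0.2, 0.6, 0.9]}, X, y)
print(best_params, best_score)
```

Since the labels were generated with a 0.6 cutoff, the threshold of 0.6 reproduces the true prevalence in every batch, so it is selected with a loss of zero.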
- get_metadata_routing()
Get metadata routing of this object.
See the User Guide for how the routing mechanism works.
- Returns:
- routing : MetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params : dict
Parameter names mapped to their values.
- predict(X)
Predict using the best found model.
- Parameters:
- X : array-like
Data for prediction.
- Returns:
- predictions : array-like
Prevalence predictions.
- Raises:
- RuntimeError
If called before fitting.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it's possible to update each component of a nested object.
- Parameters:
- **params : dict
Estimator parameters.
- Returns:
- self : estimator instance
Estimator instance.
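The <component>__<parameter> naming convention can be illustrated with a small sketch that separates top-level parameters from nested ones by splitting each key at the first double underscore. The helper split_nested_params is hypothetical, and the key quantifier__alpha is used purely as an example. Whether GridSearchQ actually routes such keys to its quantifier is not stated here, particularly since quantifier is passed as a class and not as an instance.

```python
def split_nested_params(params):
    """Group '<component>__<parameter>' keys by component name.

    Hypothetical helper for illustration; not an mlquantify or
    scikit-learn function.
    """
    own, nested = {}, {}
    for key, value in params.items():
        component, delimiter, sub_key = key.partition("__")
        if delimiter:  # key contained '__', so it targets a sub-component
            nested.setdefault(component, {})[sub_key] = value
        else:  # plain key, belongs to the estimator itself
            own[key] = value
    return own, nested


own, nested = split_nested_params({"n_jobs": 2, "quantifier__alpha": 0.1})
print(own)     # {'n_jobs': 2}
print(nested)  # {'quantifier': {'alpha': 0.1}}
```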