Hyperparameter optimization using HyperOpt

AI4Water offers two ways to optimize hyperparameters. The ai4water.hyperopt.HyperOpt class is the lower-level API, while Model.optimize_hyperparameters() is the higher-level API. To use the HyperOpt class, the user has to define the objective function and the hyperparameter space explicitly. Moreover, the user has to instantiate the HyperOpt class and call its fit method.

This example shows how to use the HyperOpt class to optimize hyperparameters.
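
The lower-level workflow (an explicit objective function, an explicit search space, and a search loop that calls the objective) can be illustrated without any library. The following is only a minimal sketch using plain random search on a toy function; `toy_objective`, `toy_space` and `random_search` are hypothetical names and are not part of AI4Water.

```python
import random


def toy_objective(x: float, y: float) -> float:
    """Toy stand-in for 'build, train and evaluate'; its minimum 0.0 is at (1, -2)."""
    return (x - 1.0) ** 2 + (y + 2.0) ** 2


# Hypothetical search space: parameter name -> (low, high)
toy_space = {"x": (-5.0, 5.0), "y": (-5.0, 5.0)}


def random_search(objective, space, num_iterations=200, seed=0):
    """Draw random candidates from the space and keep the one with the lowest objective."""
    rng = random.Random(seed)
    best_params, best_value = None, float("inf")
    for _ in range(num_iterations):
        candidate = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        value = objective(**candidate)
        if value < best_value:
            best_params, best_value = candidate, value
    return best_params, best_value


best_params, best_value = random_search(toy_objective, toy_space)
print(best_params, best_value)
```

The objective receives the suggested values as keyword arguments and returns a single number to be minimized, which is the same contract that the objective function in this example follows.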

import os
import math

import numpy as np

from skopt.plots import plot_objective
from SeqMetrics import RegressionMetrics

from ai4water.functional import Model
from ai4water.datasets import busan_beach
from ai4water.utils.utils import get_version_info
from ai4water.utils.utils import jsonize, dateandtime_now
from ai4water.hyperopt import HyperOpt, Categorical, Real, Integer


for k,v in get_version_info().items():
    print(f"{k} version: {v}")
python version: 3.7.9 (default, Oct 19 2020, 15:13:17)
[GCC 7.5.0]
os version: posix
ai4water version: 1.06
xgboost version: 1.6.2
easy_mpl version: 0.21.2
SeqMetrics version: 1.3.4
tensorflow version: 2.7.0
keras.api._v2.keras version: 2.7.0
numpy version: 1.21.6
pandas version: 1.3.5
matplotlib version: 3.5.3
h5py version: 3.7.0
joblib version: 1.2.0
data = busan_beach()

SEP = os.sep
PREFIX = f"hpo_{dateandtime_now()}"
ITER = 0
# Optimizing the hyperparameters usually involves four steps

1) define objective function

def objective_fn(
        prefix=None,
        **suggestions)->float:
    """This function must build, train and evaluate the ML model.
    The output of this function will be minimized by optimization algorithm.
    """
    suggestions = jsonize(suggestions)
    global ITER

    # build model
    _model = Model(model={"XGBRegressor": suggestions},
                  prefix=prefix or PREFIX,
                  train_fraction=1.0,
                  split_random=True,
                  verbosity=0,
                  )

    # train model
    _model.fit(data=data)

    # evaluate model
    t, p = _model.predict(data='validation', return_true=True, process_results=False)
    val_score = RegressionMetrics(t, p).r2_score()

    # since the optimization algorithm solves a minimization problem,
    # we have to subtract r2_score from 1.0
    # if our validation metric is something like mse or rmse,
    # then we don't need to subtract it from 1.0
    val_score = 1.0 - val_score

    # a non-finite score must not look like a perfect result to the optimizer,
    # so replace it only after the subtraction
    if not math.isfinite(val_score):
        val_score = 1.0

    ITER += 1

    print(f"{ITER} {val_score}")

    return val_score
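
The conversion from a score that should be maximized to a value that can be minimized is easy to check in isolation. The sketch below assumes that the metric is the coefficient of determination; `r2_score` and `to_minimization_target` are hypothetical helpers written for this illustration and do not come from SeqMetrics or AI4Water.

```python
import math

import numpy as np


def r2_score(true, predicted) -> float:
    """Coefficient of determination; returns nan when the true values are constant."""
    true = np.asarray(true, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((true - predicted) ** 2)
    ss_tot = np.sum((true - true.mean()) ** 2)
    if ss_tot == 0.0:
        return float("nan")
    return float(1.0 - ss_res / ss_tot)


def to_minimization_target(r2: float, penalty: float = 1.0) -> float:
    """Map R2 (higher is better) to 1 - R2 (lower is better).

    A non-finite R2 is replaced by `penalty` so that a failed run
    is never mistaken for a perfect one.
    """
    if not math.isfinite(r2):
        return penalty
    return 1.0 - r2


perfect = to_minimization_target(r2_score([1, 2, 3], [1, 2, 3]))
mean_only = to_minimization_target(r2_score([1, 2, 3], [2, 2, 2]))
failed = to_minimization_target(float("nan"))
print(perfect, mean_only, failed)
```

A perfect prediction maps to 0.0, predicting the mean maps to 1.0, and values above 1.0 indicate a model that is worse than predicting the mean.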

2) define parameter space

The parameter space determines the pool of candidates from which hyperparameters will be chosen during optimization.

num_samples=10
space = [
Integer(low=5, high=50, name='n_estimators', num_samples=num_samples),
# Maximum tree depth for base learners
Integer(low=3, high=10, name='max_depth', num_samples=num_samples),
Real(low=0.01, high=0.5, name='learning_rate', prior='log', num_samples=num_samples),
Categorical(categories=['gbtree', 'gblinear', 'dart'], name='booster'),
]
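
To see why a logarithmic prior is a common choice for a learning rate, the sketch below draws samples that are uniform in log space between the same bounds. It assumes that prior='log' denotes a log-uniform prior, which this example does not spell out; `sample_log_uniform` is a hypothetical helper, not an AI4Water function.

```python
import numpy as np


def sample_log_uniform(low: float, high: float, size: int, seed: int = 0) -> np.ndarray:
    """Sample values whose logarithm is uniformly distributed in [log(low), log(high)]."""
    rng = np.random.default_rng(seed)
    return np.exp(rng.uniform(np.log(low), np.log(high), size=size))


samples = sample_log_uniform(0.01, 0.5, size=10_000)

# The median sits near the geometric mean sqrt(0.01 * 0.5), about 0.07,
# so small learning rates are sampled as often as large ones.
print(samples.min(), samples.max(), np.median(samples))
```

With a plain uniform prior on the same interval, half of the samples would fall above 0.255, leaving the small learning rates rarely explored.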

3) initial state

This step is optional, but a good initial guess usually helps the optimization algorithm.

x0 = [5, 4, 0.1, "gbtree"]
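
The initial point lists one value per parameter, in the same order as the parameter space, and every value should lie inside that parameter's range. The following sketch checks this by hand with plain Python containers that mirror the space above; `numeric_bounds`, `categorical_choices`, `param_order` and `invalid_entries` are hypothetical names, and whether HyperOpt performs a similar check itself is not stated here.

```python
# Plain-Python mirror of the parameter space defined in step 2
numeric_bounds = {
    "n_estimators": (5, 50),
    "max_depth": (3, 10),
    "learning_rate": (0.01, 0.5),
}
categorical_choices = {"booster": ("gbtree", "gblinear", "dart")}
param_order = ["n_estimators", "max_depth", "learning_rate", "booster"]


def invalid_entries(x0):
    """Return a list of human-readable problems found in the initial point x0."""
    if len(x0) != len(param_order):
        return [f"expected {len(param_order)} values, got {len(x0)}"]
    problems = []
    for name, value in zip(param_order, x0):
        if name in numeric_bounds:
            low, high = numeric_bounds[name]
            if not low <= value <= high:
                problems.append(f"{name}={value} is outside [{low}, {high}]")
        elif value not in categorical_choices[name]:
            problems.append(f"{name}={value!r} is not one of {categorical_choices[name]}")
    return problems


print(invalid_entries([5, 4, 0.1, "gbtree"]))
print(invalid_entries([100, 4, 0.1, "rf"]))
```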

4) run optimization algorithm

# Now instantiate the HyperOpt class and call .fit on it.
# algorithm can be one of ``random``, ``grid``, ``bayes``, ``tpe`` or ``bayes_rf``.

optimizer = HyperOpt(
    algorithm="bayes",
    objective_fn=objective_fn,
    param_space=space,
    x0=x0,
    num_iterations=25,
    process_results=False,
    opt_path=f"results{SEP}{PREFIX}",
    verbosity=0,
)

results = optimizer.fit()
/home/docs/checkouts/readthedocs.org/user_builds/hyperopt-examples/envs/latest/lib/python3.7/site-packages/ai4water/_main.py:1978: UserWarning:
            argument validation is deprecated and will be removed in future. Please
            use 'predict_on_validation_data' method instead.
  use 'predict_on_{data}_data' method instead.""")
1 3.3545373739474798
[01:53:30] WARNING: ../src/learner.cc:627:
Parameters: { "max_depth" } might not be used.

  This could be a false alarm, with some parameters getting used by language bindings but
  then being mistakenly passed down to XGBoost core, or some parameter actually being used
  but getting flagged wrongly here. Please open an issue if you find any such cases.


2 2.0520282748400933
3 85.34741701261791
4 57.93200291381492
5 1.1031883493609647
6 80.5761592501049
7 2.0432265000454426
8 0.8490724296302107
9 26.434907636320624
10 1.29135540429111
11 1.6868754195571543
12 0.8407231971333186
13 1.1143701900672454
14 1.2753129265425647
15 0.8407073086853296
16 0.8273212958234943
17 3.980753838627524
18 0.9232753877513347
19 1.2168845498351057
20 0.8195065206713756
21 1.1730027482061778
22 0.817172841842378
23 0.8421553091640155
24 1.0660594830784598
25 1.0519174212020213
print(f"optimized parameters are \n{optimizer.best_paras()}")

print(np.min(optimizer.func_vals()))
optimized parameters are
{'n_estimators': 7, 'max_depth': 3, 'learning_rate': 0.02182676683717076, 'booster': 'dart'}
0.817172841842378

postprocessing of results

save hyperparameters at each iteration

optimizer.save_iterations_as_xy()

convergence plot (not saved to disk here, since save=False)

optimizer._plot_convergence(save=False)
Convergence plot
<AxesSubplot:title={'center':'Convergence plot'}, xlabel='Number of calls $n$', ylabel='$\\min f(x)$ after $n$ calls'>
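
The y-axis label of the convergence plot reads "min f(x) after n calls", that is, the best objective value found so far. As a rough sketch of how such a curve can be derived, the snippet below computes the running minimum of the first eight objective values printed during the run above; it only illustrates the idea and is not the plotting code used by HyperOpt.

```python
import numpy as np

# Objective values of iterations 1 to 8, copied from the output above
func_vals = np.array([
    3.3545373739474798,
    2.0520282748400933,
    85.34741701261791,
    57.93200291381492,
    1.1031883493609647,
    80.5761592501049,
    2.0432265000454426,
    0.8490724296302107,
])

# Best value seen after each call; this sequence can never increase
running_min = np.minimum.accumulate(func_vals)
print(running_min)
```

Poor iterations such as the third and fourth leave the curve flat, because only an improvement over the best value so far moves it down.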
optimizer._plot_parallel_coords(figsize=(14, 8), save=False)
Parallel Coordinates Plot
optimizer._plot_distributions(save=False)
<Figure size 1000x1000 with 4 Axes>
optimizer.plot_importance(save=False)
<AxesSubplot:ylabel='Relative Importance'>
_ = plot_objective(results)
optimizer._plot_evaluations(save=False)
optimizer._plot_edf(save=False)
Empirical Distribution Function Plot
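
An empirical distribution function gives, for each objective value, the fraction of iterations that reached that value or a lower one. The following is a minimal sketch of the computation with NumPy, using a few objective values from the run above; `empirical_distribution` is a hypothetical helper and may differ from what HyperOpt does internally.

```python
import numpy as np


def empirical_distribution(values):
    """Return sorted values and the fraction of observations <= each of them."""
    x = np.sort(np.asarray(values, dtype=float))
    y = np.arange(1, x.size + 1) / x.size
    return x, y


# Objective values of iterations 5 to 8, copied from the output above
edf_x, edf_y = empirical_distribution([
    1.1031883493609647,
    80.5761592501049,
    2.0432265000454426,
    0.8490724296302107,
])
print(edf_x, edf_y)
```

A curve that rises steeply at small objective values means that most iterations produced good models.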
# If ``process_results`` is set to True when creating HyperOpt above, all of these
# results are saved automatically in the optimization directory.


print(f"All the results are saved in {optimizer.opt_path} directory")
All the results are saved in results/hpo_20230105_015329 directory

Total running time of the script: ( 0 minutes 30.211 seconds)
