Hands-On Artificial Intelligence for Beginners

Hyperparameter optimization

Aside from protecting against overfitting, we can optimize models by searching for the best combination of model hyperparameters. Hyperparameters are configuration variables that tell the model what methods to use, as opposed to model parameters, which are learned during training; we'll learn more about these in the upcoming chapters. They are set programmatically on a model, and are present in all modeling packages in Python. In the random forest model that we built previously, for instance, n_estimators is a hyperparameter that tells the model how many trees to build. The process of searching for the combination of hyperparameters that leads to the best model performance is called hyperparameter tuning.
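To make the distinction concrete, here is a minimal sketch (on synthetic data, which is an assumption for illustration): n_estimators is a hyperparameter we choose before training, while the trees' split rules are parameters learned during fit:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# n_estimators is a hyperparameter: we set it before training.
clf = RandomForestClassifier(n_estimators=100, random_state=0)

# Synthetic data stands in for a real training set here.
X, y = make_classification(n_samples=200, random_state=0)
clf.fit(X, y)

# The fitted trees themselves hold the learned parameters;
# their count matches the hyperparameter we configured.
print(len(clf.estimators_))
```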

In Python, we can tune hyperparameters with an exhaustive search over their potential values, called a grid search. Let's use our random forest model to see how we can do this in Python by importing GridSearchCV:

from sklearn.model_selection import GridSearchCV

parameters = {
    'n_estimators': [100, 500, 1000],
    'max_features': [2, 3, 4],
    'max_depth': [90, 100, 110, 120],
    'min_samples_split': [6, 10, 14],
    'min_samples_leaf': [2, 4, 6],
}

In this case, we are going to pass the grid search a few different hyperparameters to check; you can read about what they do in the documentation for the classifier (http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html).

To create the search, we simply have to initialize it: 

search = GridSearchCV(estimator=rf_classifier, param_grid=parameters, cv=3)

We can then apply it to the data: 

search.fit(x_train, y_train)
search.best_params_
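The full workflow can be seen end to end in a self-contained sketch; the synthetic data and the deliberately small grid are assumptions made so the example runs quickly, and the variable names mirror the ones used above:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data stands in for the chapter's dataset.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A small grid: 2 x 2 = 4 candidates, each fit on 3 folds (12 fits total).
small_grid = {'n_estimators': [50, 100], 'max_depth': [5, 10]}

search = GridSearchCV(estimator=RandomForestClassifier(random_state=0),
                      param_grid=small_grid, cv=3)
search.fit(x_train, y_train)
print(search.best_params_)
```

Note that grid search cost grows multiplicatively with each hyperparameter added, which is why the full grid above (3 x 3 x 4 x 3 x 3 = 324 candidates) takes far longer than this toy version.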

If we then want to check the performance of the best combination of parameters, we can easily do that in sklearn by evaluating it on the test data: 

best = search.best_estimator_
accuracy = evaluate(best, x_test, y_test)
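The evaluate helper here refers to a function defined earlier in the chapter; a minimal stand-in, assuming it simply returns test-set accuracy, might look like this (the synthetic data and model are placeholders for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def evaluate(model, x_test, y_test):
    """Stand-in for the chapter's evaluate helper: test-set accuracy."""
    return accuracy_score(y_test, model.predict(x_test))

# Quick check on synthetic data.
X, y = make_classification(n_samples=200, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(x_train, y_train)
print(evaluate(model, x_test, y_test))
```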

Hyperparameter tuning searches can be applied to the neural network models that we'll be utilizing in the coming chapters.
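As a preview, and assuming scikit-learn's MLPClassifier as a simple stand-in for the neural networks covered later, the same GridSearchCV pattern carries over unchanged; only the hyperparameter names differ (here, layer sizes and the regularization strength alpha, searched over a small illustrative grid on synthetic data):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Synthetic data for illustration only.
X, y = make_classification(n_samples=200, random_state=0)

# Network hyperparameters are tuned exactly like the forest's were.
grid = {'hidden_layer_sizes': [(10,), (20,)], 'alpha': [1e-4, 1e-3]}
search = GridSearchCV(MLPClassifier(max_iter=500, random_state=0),
                      param_grid=grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```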