Correct statement on number of trees for GBDT #793

fritshermans · 2024-12-24T13:32:43Z

Exercise M6.03 states:

"Both gradient boosting and random forest models improve when increasing the number of trees in the ensemble. However, the scores reach a plateau where adding new trees just makes fitting and scoring slower."

While this statement holds true for random forests, it does not apply to gradient boosting decision trees (GBDT). In GBDT, each new tree is fitted to correct the errors of the previous ones, so adding too many estimators can lead to overfitting: the validation score can get worse instead of reaching a plateau. This can be demonstrated by creating a validation curve for GBDT, similar to what is done for random forests (see attached image). The result actually supports the need for hyperparameter tuning or early stopping, as shown at the end of exercise M6.03.
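For reference, a minimal sketch of how such a validation curve could be computed with scikit-learn's `validation_curve`. The synthetic dataset, the `learning_rate`, and the `param_range` values are assumptions chosen for illustration; they are not the data or settings used in the exercise or in the attached image. Whether and where the test score starts to degrade depends on the data and on these settings.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import validation_curve

# Synthetic, noisy regression data (an assumption, for illustration only).
X, y = make_regression(
    n_samples=300, n_features=10, n_informative=5, noise=30.0, random_state=0
)

# Candidate numbers of trees to evaluate (hypothetical values).
param_range = [1, 5, 20, 50, 200]

# For each value of n_estimators, fit on the training folds and score
# on both the training folds and the held-out fold (3-fold CV).
train_scores, test_scores = validation_curve(
    GradientBoostingRegressor(learning_rate=0.3, max_depth=3, random_state=0),
    X,
    y,
    param_name="n_estimators",
    param_range=param_range,
    cv=3,
    scoring="r2",
)

# Average over the CV folds: one value per entry in param_range.
mean_train = train_scores.mean(axis=1)
mean_test = test_scores.mean(axis=1)

for n, tr, te in zip(param_range, mean_train, mean_test):
    print(f"n_estimators={n:4d}  train R2={tr:.3f}  test R2={te:.3f}")
```

If the test score peaks and then drops while the train score keeps rising, that gap is the overfitting described above. In that case, early stopping (for example via the `n_iter_no_change` parameter of `GradientBoostingRegressor`) is one way to limit the number of trees.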
