[python-package] scikit-learn fit() methods: add eval_X, eval_y, deprecate eval_set #6857
Conversation
Force-pushed from 197a3bd to 5dd3171
jameslamb left a comment
Thanks! It's looking like you're struggling to get this passing CI, so I'm going to put into "draft" for now. @ me any time here if you need help with development, and we can open this back up for review once CI is passing.
I saw you had multiple commits responding to linting errors... here's how to run those locally for faster feedback:
```shell
# (or conda, whatever you want)
pip install pre-commit
pre-commit run --all-files
```

And here's how to build locally and run the tests:

```shell
# step 1: compile lib_lightgbm
# (only need to do this once, because you're not making any C/C++ changes)
cmake -B build -S .
cmake --build build --target _lightgbm -j4

# step 2: install the Python package, re-using the compiled library
# (do this every time you change Python code in the library)
sh build-python.sh install --precompile

# step 3: run the scikit-learn tests
pytest tests/python_package_test/test_sklearn.py
```
@jameslamb Thanks for your suggestions.
> Making both options available for a time and raising a deprecation warning when …

I'm sorry but I cannot invest much time in this right now (for example, looking into whether this would introduce inconsistencies with …).
No, please. As I said in scikit-learn/scikit-learn#28901 (comment), removing …
@jameslamb I'm sorry, I really need a maintainer's help. The tests in …
@jameslamb Thank you so much. Pinning …
The remaining CI failures seem unrelated.
@lorentzenchr if you are interested in continuing this I'd be happy to help with reviews. I'm supportive of adding this, for better compatibility with newer versions of …
@jameslamb Yes, I'd like to finish this. Your review would be great. Is there anything you need from me before you can start reviewing?
Great! I'd been waiting to review until you were done adding whatever tests you wanted. If you'd like a review before then, update this to latest …
```diff
  # train
  gbm = lgb.LGBMRegressor(num_leaves=31, learning_rate=0.05, n_estimators=20)
- gbm.fit(X_train, y_train, eval_set=[(X_test, y_test)], eval_metric="l1", callbacks=[lgb.early_stopping(5)])
+ gbm.fit(X_train, y_train, eval_X=(X_test,), eval_y=(y_test,), eval_metric="l1", callbacks=[lgb.early_stopping(5)])
```
(changed by me in d7e0fff)
If we're going to consider eval_set deprecated and eval_{X,y} the new recommended pattern, I think we should nudge users towards that by updating all of the documentation. I've done that here (examples/ is the only place with such code).
I think eval_X=X_test, eval_y=y_test without wrapping into a tuple is fine, too.
For sure, either will work. I have a very weak preference for the tuple form in these docs, to make it a little clearer that providing multiple validation sets is supported.
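To make the two calling conventions concrete, here is a small standalone sketch (my own illustration, not LightGBM's actual implementation; `pair_eval_sets` is a hypothetical helper name) of how the tuple forms of `eval_X`/`eval_y` pair up into the list-of-`(X, y)`-tuples shape that `eval_set` has historically used:

```python
def pair_eval_sets(eval_X, eval_y):
    """Pair eval_X/eval_y (bare array-likes or tuples of them) into [(X, y), ...]."""
    # Mirror the PR's check: both must be tuples, or neither.
    if isinstance(eval_X, tuple) != isinstance(eval_y, tuple):
        raise ValueError("If eval_X is a tuple, eval_y must be a tuple of same length, and vice versa.")
    if not isinstance(eval_X, tuple):
        # single validation set: eval_X=X_test, eval_y=y_test
        return [(eval_X, eval_y)]
    if len(eval_X) != len(eval_y):
        raise ValueError("eval_X and eval_y must have the same length.")
    # multiple validation sets: eval_X=(X1, X2), eval_y=(y1, y2)
    return list(zip(eval_X, eval_y))
```

The tuple form makes the "multiple validation sets" case visible at the call site, which is the reason for the weak preference above.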
```python
if eval_set is None and eval_X is not None:
    if isinstance(eval_X, tuple) != isinstance(eval_y, tuple):
        raise ValueError("If eval_X is a tuple, eval_y must be a tuple of same length, and vice versa.")
    if isinstance(eval_X, tuple) and isinstance(eval_y, tuple):
```
(changed by me in d7e0fff)
This and isinstance(eval_y, tuple) seems redundant given all the previous conditions, but mypy needs it to understand that eval_y is not None at this point. Otherwise, it reports these new errors:
```text
sklearn.py:515: error: Argument 1 to "len" has incompatible type "list[float] | list[int] | ndarray[tuple[Any, ...], dtype[Any]] | Any | Any | Any | Any | tuple[list[float] | list[int] | ndarray[tuple[Any, ...], dtype[Any]] | Any | Any | Any | Any] | None"; expected "Sized"  [arg-type]
sklearn.py:518: error: Argument 2 to "zip" has incompatible type "list[float] | list[int] | ndarray[tuple[Any, ...], dtype[Any]] | Any | Any | Any | Any | tuple[list[float] | list[int] | ndarray[tuple[Any, ...], dtype[Any]] | Any | Any | Any | Any] | None"; expected "Iterable[Any]"  [arg-type]
sklearn.py:518: error: Argument 2 to "zip" has incompatible type "list[float] | list[int] | ndarray[tuple[Any, ...], dtype[Any]] | Any | Any | Any | Any | tuple[list[float] | list[int] | ndarray[tuple[Any, ...], dtype[Any]] | Any | Any | Any | Any] | None"; expected "Iterable[list[float] | list[int] | ndarray[tuple[Any, ...], dtype[Any]] | Any]"  [arg-type]
```
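To show the narrowing behavior in isolation (a minimal standalone example, not the PR's code), here is how an `isinstance()` check lets mypy narrow an `Optional` union so that calls like `len()` type-check inside the branch:

```python
from typing import Optional, Tuple, Union


def count_eval_sets(eval_y: Optional[Union[list, Tuple[list, ...]]]) -> int:
    """Return how many validation targets were supplied."""
    if isinstance(eval_y, tuple):
        # Inside this branch mypy narrows eval_y from the full Optional
        # union down to Tuple[list, ...], so len() is accepted.
        return len(eval_y)
    # Here the narrowed type is Optional[list].
    return 0 if eval_y is None else 1
```

At runtime the check is redundant given the earlier conditions, but without it mypy cannot rule out `None` at the `len()`/`zip()` call sites.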
```python
if isinstance(eval_set, tuple):
    return [eval_set]
else:
    return eval_set
```
(changed by me in d7e0fff)
Although providing something like `eval_set=(X_valid, y_valid)` conflicts with the type hints and docs (`python-package/lightgbm/sklearn.py`, line 914 at 544d439):

```python
eval_set: Optional[List[_LGBM_ScikitValidSet]] = None,
```

... it has been supported in lightgbm for a long time (`python-package/lightgbm/sklearn.py`, lines 990 to 992 at 544d439):

```python
if eval_set is not None:
    if isinstance(eval_set, tuple):
        eval_set = [eval_set]
```
Adding this line preserves that behavior. Changing the existing behavior when eval_set is provided is outside the scope of this PR (other than raising deprecation warnings).
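A minimal standalone sketch of that normalization (the function name is hypothetical; LightGBM performs this inline during `fit()` rather than in a named helper):

```python
def normalize_eval_set(eval_set):
    """Wrap a bare (X, y) tuple into a one-element list of pairs.

    Downstream code can then always iterate over a list of (X, y) tuples,
    regardless of whether the caller passed eval_set=(X, y) or
    eval_set=[(X, y), ...].
    """
    if eval_set is None:
        return None
    if isinstance(eval_set, tuple):
        return [eval_set]
    return eval_set
```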
```python
if eval_set is not None:
    if eval_group is None:
        raise ValueError("Eval_group cannot be None when eval_set is not None")
    if len(eval_group) != len(eval_set):
        raise ValueError("Length of eval_group should be equal to eval_set")
    if (
        isinstance(eval_group, dict)
        and any(i not in eval_group or eval_group[i] is None for i in range(len(eval_group)))
        or isinstance(eval_group, list)
        and any(group is None for group in eval_group)
    ):
        raise ValueError(
            "Should set group for all eval datasets for ranking task; "
            "if you use dict, the index should start from 0"
        )
```
(changed by me in d7e0fff)
`eval_set is not None` is no longer a reliable test of "validation data provided", now that it can be provided via `eval_X` and `eval_y` instead. To address that, I did the following:

- changed the check here in `LGBMRanker` to account for all of `eval_{set,X,y}`
- moved the size check into `LGBMModel.fit()`, so the `len()` calls happen AFTER `_validate_eval_set_Xy()`
- deleted the code checking if `eval_group` is a dictionary... supplying `eval_group` as a dictionary is not supported
- improved these error messages a bit, since I was touching them anyway
- added new test cases covering these ranker-specific codepaths
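As a rough sketch of the resulting validation flow (an illustration under my own naming, not the PR's exact code), the ranker-side check might look like:

```python
def check_eval_group(eval_sets, eval_group):
    """Validate eval_group against the resolved list of validation sets.

    `eval_sets` is assumed to be the already-normalized list of (X, y)
    pairs, whether it came from eval_set or from eval_X/eval_y.
    """
    if eval_sets is None:
        return  # no validation data, nothing to check
    if eval_group is None:
        raise ValueError("eval_group cannot be None when validation data is provided.")
    if len(eval_group) != len(eval_sets):
        raise ValueError("eval_group must have one entry per validation set.")
    # dict-valued eval_group is no longer accepted, so a list scan suffices
    if any(group is None for group in eval_group):
        raise ValueError("Group must be set for every validation set in a ranking task.")
```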
```python
np.testing.assert_allclose(gbm1.predict(X), gbm2.predict(X))
assert gbm1.evals_result_["valid_0"]["l2"][0] == pytest.approx(gbm2.evals_result_["valid_0"]["l2"][0])
```
(changed by me in d7e0fff)
Added this check on the validation results. Just checking the predicted values is not sufficient to test that the passed validation sets were actually used.
```python
X_test1, X_test2 = X_test[: n // 2], X_test[n // 2 :]
y_test1, y_test2 = y_test[: n // 2], y_test[n // 2 :]
gbm1 = lgb.LGBMRegressor(**params)
with pytest.warns(LGBMDeprecationWarning, match="The argument 'eval_set' is deprecated.*"):
```
(changed by me in d7e0fff)
Added these `with pytest.warns()` blocks to suppress these warnings in test logs.
It's still valuable, I think, to have the standalone test_eval_set_deprecation test.
```python
assert set(gbm2.evals_result_.keys()) == {"valid_0", "valid_1"}, (
    f"expected 2 validation sets in evals_result_, got {gbm2.evals_result_.keys()}"
)
assert gbm1.evals_result_["valid_0"]["l2"][0] == pytest.approx(gbm2.evals_result_["valid_0"]["l2"][0])
assert gbm1.evals_result_["valid_1"]["l2"][0] == pytest.approx(gbm2.evals_result_["valid_1"]["l2"][0])
assert gbm2.evals_result_["valid_0"]["l2"] != gbm2.evals_result_["valid_1"]["l2"], (
    "Evaluation results for the 2 validation sets are not different. This might mean they weren't both used."
)
```
(changed by me in d7e0fff)
Expanded these tests so they could catch more possible bugs, like:
- one of the validation sets was ignored
- the same validation set was referenced multiple times (instead of them being considered separately)
```python
gbm.fit(X_train, y_train, eval_X=(X_test,) * 3, eval_y=(y_test,) * 2)
```

```python
def test_ranker_eval_set_raises():
```
(changed by me in d7e0fff)
This test case checks all of the `eval_group` validation that I shuffled around (see https://github.com/microsoft/LightGBM/pull/6857/files#r2650232387).
Ok done adding inline comment threads. One other note... as you probably expect, the …
I've updated this to latest …
StrikerRUS left a comment
@lorentzenchr Thank you so much for your hard work here! Generally LGTM! But please consider checking some of my minor comments below.
python-package/lightgbm/sklearn.py (outdated)
```text
eval_set : list or None, optional (default=None) (deprecated)
    A list of (X, y) tuple pairs to use as validation sets.
    This is deprecated, use `eval_X` and `eval_y` instead.
```
I believe that deprecated directive will suit better.
https://www.sphinx-doc.org/en/master/usage/restructuredtext/directives.html#directive-deprecated
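For reference, a numpydoc-style parameter entry using that directive might look like the following (the version number is a placeholder, not the actual release this will land in):

```rst
eval_set : list or None, optional (default=None)
    A list of (X, y) tuple pairs to use as validation sets.

    .. deprecated:: X.Y
        Use ``eval_X`` and ``eval_y`` instead.
```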
I did this in feafe22
And looks like it's now `.. version-deprecated::`, not `.. deprecated::`:

> Changed in version 9.0: The deprecated directive was renamed to version-deprecated. The previous name is retained as an alias.
Oh wait re-reading that.... Sphinx 9.0 is very new (November 2025).
I just pushed 2924d68 switching back to `.. deprecated::`.
I'll test using Sphinx 9.x in a separate PR, on a LightGBM branch where we can test how it's rendered on readthedocs.
Co-authored-by: Nikita Titov <[email protected]>
Co-authored-by: James Lamb <[email protected]>
|
Ok @StrikerRUS I think I've addressed all comments, could you take another look and merge this if you approve?
As discussed in scikit-learn/scikit-learn#28901 (comment), this PR adds `eval_X` and `eval_y` in order to make LGBM estimators compatible with scikit-learn's (as of version 1.6) `Pipeline(..., transform_input=["eval_X"])`. See also scikit-learn/scikit-learn#27124.