
Conversation

@lorentzenchr
Contributor

@lorentzenchr lorentzenchr commented Mar 9, 2025

As discussed in scikit-learn/scikit-learn#28901 (comment), this PR adds eval_X and eval_y in order to make LGBM estimators compatible with scikit-learn's (as of version 1.6) Pipeline(..., transform_input=["eval_X"]).

See also scikit-learn/scikit-learn#27124.

Collaborator

@jameslamb jameslamb left a comment


Thanks! It looks like you're struggling to get CI passing, so I'm going to put this into "draft" for now. @ me any time here if you need help with development, and we can open this back up for review once CI is passing.

I saw you had multiple commits responding to linting errors... here's how to run those checks locally for faster feedback:

# (or conda, whatever you want)
pip install pre-commit
pre-commit run --all-files

And here's how to build locally and run the tests:

# step 1: compile lib_lightgbm
# (only need to do this once, because you're not making any C/C++ changes)
cmake -B build -S .
cmake --build build --target _lightgbm -j4

# step 2: install the Python package, re-using the compiled library
# (do this every time you change Python code in the library)
sh build-python.sh install --precompile

# step 3: run the scikit-learn tests
pytest tests/python_package_test/test_sklearn.py

@jameslamb jameslamb marked this pull request as draft March 9, 2025 23:15
@lorentzenchr
Contributor Author

@jameslamb Thanks for your suggestions.
Could you already comment on the deprecation strategy, i.e. raising a warning?
Then, should I adapt all the (scikit-learn API) Python tests and replace eval_set with the new eval_X, eval_y (thereby avoiding the warnings in the tests)?

@jameslamb
Collaborator

Could you already comment on the deprecation strategy, raising a warning?

Making both options available for a time and raising a deprecation warning when eval_set is non-empty seems fine to me, if we decide to move forward with this. I'd also support a runtime error when both eval_set and eval_X are non-empty, to avoid taking on the complexity of merging those two inputs.
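For illustration, that strategy could look roughly like this (a minimal sketch with an invented helper name, not this PR's actual code):

import warnings

def _resolve_eval_sets(eval_set=None, eval_X=None, eval_y=None):
    # Sketch only: reject ambiguous input instead of merging the two styles.
    if eval_set is not None and (eval_X is not None or eval_y is not None):
        raise ValueError("Pass either 'eval_set' or 'eval_X'/'eval_y', not both.")
    if eval_set is not None:
        warnings.warn(
            "'eval_set' is deprecated, use 'eval_X' and 'eval_y' instead.",
            FutureWarning,
        )
        return eval_set if isinstance(eval_set, list) else [eval_set]
    if eval_X is not None:
        # normalize single arrays to one-element tuples for a uniform interface
        if not isinstance(eval_X, tuple):
            eval_X, eval_y = (eval_X,), (eval_y,)
        return list(zip(eval_X, eval_y))
    return None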

I'm sorry but I cannot invest much time in this right now (for example, looking into whether this would introduce inconsistencies with HistGradientBoostingClassifier, XGBoost, or CatBoost). If you want to see this change, getting CI working and then opening it up for others to review is probably the best path.

should I adapt all the (scikit-learn api) python tests and replace eval_set by the new eval_X, eval_y (thereby avoiding the warnings in the tests)?

No, please don't. As I said in scikit-learn/scikit-learn#28901 (comment), removing eval_set from LightGBM's scikit-learn estimators would be highly disruptive and would require a long deprecation cycle (more than a year, in my opinion). Throughout that time, we need to continue to test it at least as thoroughly as we have been.

@lorentzenchr
Contributor Author

@jameslamb I'm sorry, I really need a maintainer's help. The tests in tests/python_package_test/test_dask.py fail even on the master branch, locally on my computer. I tried different versions of dask, numpy, scipy, and scikit-learn, without success.
TL;DR: the CI failure seems to exist on master and is not caused by this PR.

Details
pytest -x tests/python_package_test/test_dask.py::test_ranker
...
>           dask_ranker = dask_ranker.fit(dX, dy, sample_weight=dw, group=dg)

tests/python_package_test/test_dask.py:714: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../python3_lgbm/lib/python3.11/site-packages/lightgbm/dask.py:1566: in fit
    self._lgb_dask_fit(
../python3_lgbm/lib/python3.11/site-packages/lightgbm/dask.py:1068: in _lgb_dask_fit
    model = _train(
../python3_lgbm/lib/python3.11/site-packages/lightgbm/dask.py:811: in _train
    results = client.gather(futures_classifiers)
../python3_lgbm/lib/python3.11/site-packages/distributed/client.py:2565: in gather
    return self.sync(
../python3_lgbm/lib/python3.11/site-packages/lightgbm/dask.py:205: in _train_part
    data = _concat([x["data"] for x in list_of_parts])

...

 def _concat(seq: List[_DaskPart]) -> _DaskPart:
        if isinstance(seq[0], np.ndarray):
            return np.concatenate(seq, axis=0)
        elif isinstance(seq[0], (pd_DataFrame, pd_Series)):
            return concat(seq, axis=0)
        elif isinstance(seq[0], ss.spmatrix):
            return ss.vstack(seq, format="csr")
        else:
>           raise TypeError(
                f"Data must be one of: numpy arrays, pandas dataframes, sparse matrices (from scipy). Got {type(seq[0]).__name__}."
            )
E           TypeError: Data must be one of: numpy arrays, pandas dataframes, sparse matrices (from scipy). Got tuple.

../python3_lgbm/lib/python3.11/site-packages/lightgbm/dask.py:159: TypeError

@jameslamb
Collaborator

What versions of dask / distributed do you have installed?

Searching for that error in this repo's issue tracker turns up a match: #6739

I suspect you need to pin to dask<2024.12 in your environment, as we do in CI here (#6742).

@lorentzenchr
Contributor Author

@jameslamb Thank you so much. Pinning dask<2024.12 worked fine.

@lorentzenchr lorentzenchr marked this pull request as ready for review March 10, 2025 21:34
@lorentzenchr
Contributor Author

The remaining CI failures seem unrelated.
TODO for myself: Improve test coverage a bit.

@jameslamb jameslamb changed the title ENH add eval_X, eval_y, deprecate eval_set [python-package] scikit-learn fit() methods: add eval_X, eval_y, deprecate eval_set Mar 10, 2025
@jameslamb
Collaborator

TODO for myself: Improve test coverage a bit.

@lorentzenchr if you are interested in continuing this I'd be happy to help with reviews. I'm supportive of adding this, for better compatibility with newer versions of scikit-learn.

@lorentzenchr
Contributor Author

@jameslamb Yes, I'd like to finish this. Your review would be great. Is there anything you need from me before you can start reviewing?

@jameslamb
Collaborator

Great! I'd been waiting to review until you were done adding whatever tests you wanted.

If you'd like a review before then, update this to latest master and get CI passing (especially the check that LightGBM works with its oldest supported scikit-learn version), then @ me and I'll be happy to provide some feedback.

# train
gbm = lgb.LGBMRegressor(num_leaves=31, learning_rate=0.05, n_estimators=20)
gbm.fit(X_train, y_train, eval_set=[(X_test, y_test)], eval_metric="l1", callbacks=[lgb.early_stopping(5)])
gbm.fit(X_train, y_train, eval_X=(X_test,), eval_y=(y_test,), eval_metric="l1", callbacks=[lgb.early_stopping(5)])
Collaborator

@jameslamb jameslamb Dec 29, 2025


(changed by me in d7e0fff)

If we're going to consider eval_set deprecated and eval_{X,y} the new recommended pattern, I think we should nudge users towards that by updating all of the documentation. I've done that here (examples/ is the only place with such code).

Contributor Author


I think eval_X=X_test, eval_y=y_test without wrapping into a tuple is fine, too.

Collaborator


For sure, either will work. I have a very weak preference for the tuple form in these docs, to make it a little clearer that providing multiple validation sets is supported.
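For example (variable names illustrative), two validation sets would be passed as same-length tuples:

gbm = lgb.LGBMRegressor(num_leaves=31, learning_rate=0.05, n_estimators=20)
gbm.fit(
    X_train,
    y_train,
    eval_X=(X_valid1, X_valid2),
    eval_y=(y_valid1, y_valid2),
    eval_metric="l1",
)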

if eval_set is None and eval_X is not None:
    if isinstance(eval_X, tuple) != isinstance(eval_y, tuple):
        raise ValueError("If eval_X is a tuple, eval_y must be a tuple of the same length, and vice versa.")
    if isinstance(eval_X, tuple) and isinstance(eval_y, tuple):
Collaborator


(changed by me in d7e0fff)

The additional isinstance(eval_y, tuple) check seems redundant given all the previous conditions, but mypy needs it to understand that eval_y is not None at this point. Otherwise, it reports these new errors:

sklearn.py:515: error: Argument 1 to "len" has incompatible type "list[float] | list[int] | ndarray[tuple[Any, ...], dtype[Any]] | Any | Any | Any | Any | tuple[list[float] | list[int] | ndarray[tuple[Any, ...], dtype[Any]] | Any | Any | Any | Any] | None"; expected "Sized"  [arg-type]
sklearn.py:518: error: Argument 2 to "zip" has incompatible type "list[float] | list[int] | ndarray[tuple[Any, ...], dtype[Any]] | Any | Any | Any | Any | tuple[list[float] | list[int] | ndarray[tuple[Any, ...], dtype[Any]] | Any | Any | Any | Any] | None"; expected "Iterable[Any]"  [arg-type]
sklearn.py:518: error: Argument 2 to "zip" has incompatible type "list[float] | list[int] | ndarray[tuple[Any, ...], dtype[Any]] | Any | Any | Any | Any | tuple[list[float] | list[int] | ndarray[tuple[Any, ...], dtype[Any]] | Any | Any | Any | Any] | None"; expected "Iterable[list[float] | list[int] | ndarray[tuple[Any, ...], dtype[Any]] | Any]"  [arg-type]
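Here's a minimal self-contained illustration of that narrowing behavior (not the PR's code):

from typing import Optional, Tuple, Union

import numpy as np

def _check(eval_X: Optional[Union[np.ndarray, Tuple]], eval_y: Optional[Union[np.ndarray, Tuple]]) -> None:
    if isinstance(eval_X, tuple) != isinstance(eval_y, tuple):
        raise ValueError("eval_X and eval_y must both be tuples, or neither.")
    if isinstance(eval_X, tuple):
        # mypy still treats eval_y as possibly None here: the comparison above
        # guarantees it is a tuple whenever eval_X is, but mypy cannot narrow
        # one variable based on a check of another.
        ...
    if isinstance(eval_X, tuple) and isinstance(eval_y, tuple):
        # the explicit isinstance() call narrows eval_y itself, so len() and
        # zip() type-check cleanly in this branch
        assert len(eval_X) == len(eval_y)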

Comment on lines +509 to +512
if isinstance(eval_set, tuple):
    return [eval_set]
else:
    return eval_set
Collaborator


(changed by me in d7e0fff)

Although providing something like eval_set=(X_valid, y_valid) conflicts with the type hints and docs:

eval_set: Optional[List[_LGBM_ScikitValidSet]] = None,

... it has been supported in lightgbm for a long time:

if eval_set is not None:
    if isinstance(eval_set, tuple):
        eval_set = [eval_set]

Adding this line preserves that behavior. Changing the existing behavior when eval_set is provided is outside the scope of this PR (other than raising deprecation warnings).
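So both of these spellings keep working (illustrative calls):

# a bare (X, y) tuple is wrapped into a one-element list,
# making these two calls equivalent
gbm.fit(X_train, y_train, eval_set=(X_test, y_test))
gbm.fit(X_train, y_train, eval_set=[(X_test, y_test)])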

Comment on lines -1764 to -1796
if eval_set is not None:
    if eval_group is None:
        raise ValueError("Eval_group cannot be None when eval_set is not None")
    if len(eval_group) != len(eval_set):
        raise ValueError("Length of eval_group should be equal to eval_set")
    if (
        isinstance(eval_group, dict)
        and any(i not in eval_group or eval_group[i] is None for i in range(len(eval_group)))
        or isinstance(eval_group, list)
        and any(group is None for group in eval_group)
    ):
        raise ValueError(
            "Should set group for all eval datasets for ranking task; "
            "if you use dict, the index should start from 0"
        )
Collaborator


(changed by me in d7e0fff)

eval_set is not None is no longer a reliable test for "validation data was provided", now that validation data can come in via eval_X and eval_y instead. I did the following here (a sketch of the consolidated check follows the list):

  • changed the check here in LGBMRanker to account for all of eval_{set,X,y}
  • moved the size check into LGBMModel.fit(), so the len() calls happen AFTER _validate_eval_set_Xy()
  • deleted the code checking if eval_group is a dictionary... supplying eval_group as a dictionary is not supported
  • improved these error messages a bit, since I was touching them anyway
  • added new test cases covering these ranker-specific codepaths
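Roughly, the consolidated check now looks like this (a sketch with illustrative names, not the exact merged code):

def _check_ranker_eval_group(eval_set, eval_X, eval_y, eval_group):
    # validation data can now arrive via either eval_set or eval_X/eval_y,
    # so test for both before requiring eval_group
    has_validation_data = eval_set is not None or (eval_X is not None and eval_y is not None)
    if not has_validation_data:
        return
    if eval_group is None:
        raise ValueError("'eval_group' must be provided when validation data is passed to a ranker.")
    if any(group is None for group in eval_group):
        raise ValueError("Must provide 'group' for every validation set in a ranking task.")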

Comment on lines +2130 to +2131
np.testing.assert_allclose(gbm1.predict(X), gbm2.predict(X))
assert gbm1.evals_result_["valid_0"]["l2"][0] == pytest.approx(gbm2.evals_result_["valid_0"]["l2"][0])
Collaborator


(changed by me in d7e0fff)

Added this check on the validation results. Just checking the predicted values is not sufficient to test that the passed validation sets were actually used.

X_test1, X_test2 = X_test[: n // 2], X_test[n // 2 :]
y_test1, y_test2 = y_test[: n // 2], y_test[n // 2 :]
gbm1 = lgb.LGBMRegressor(**params)
with pytest.warns(LGBMDeprecationWarning, match="The argument 'eval_set' is deprecated.*"):
Collaborator


(changed by me in d7e0fff)

Added these with pytest.warns() to suppress these warnings in test logs.

It's still valuable, I think, to have the standalone test_eval_set_deprecation test.

Comment on lines 2142 to 2149
assert set(gbm2.evals_result_.keys()) == {"valid_0", "valid_1"}, (
    f"expected 2 validation sets in evals_result_, got {gbm2.evals_result_.keys()}"
)
assert gbm1.evals_result_["valid_0"]["l2"][0] == pytest.approx(gbm2.evals_result_["valid_0"]["l2"][0])
assert gbm1.evals_result_["valid_1"]["l2"][0] == pytest.approx(gbm2.evals_result_["valid_1"]["l2"][0])
assert gbm2.evals_result_["valid_0"]["l2"] != gbm2.evals_result_["valid_1"]["l2"], (
    "Evaluation results for the 2 validation sets are not different. This might mean they weren't both used."
)
Collaborator


(changed by me in d7e0fff)

Expanded these tests so they could catch more possible bugs, like:

  • one of the validation sets was ignored
  • the same validation set was referenced multiple times (instead of them being considered separately)

gbm.fit(X_train, y_train, eval_X=(X_test,) * 3, eval_y=(y_test,) * 2)


def test_ranker_eval_set_raises():
Collaborator


(changed by me in d7e0fff)

This test case checks all of the eval_group validation that I shuffled around (see https://github.com/microsoft/LightGBM/pull/6857/files#r2650232387)
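A self-contained sketch of that kind of test (synthetic data; the actual test in this PR uses different fixtures and error messages):

import numpy as np
import pytest

import lightgbm as lgb

def test_ranker_requires_eval_group():
    # small synthetic ranking problem: 20 rows split into two query groups
    rng = np.random.default_rng(42)
    X = rng.normal(size=(20, 3))
    y = rng.integers(0, 3, size=20)
    ranker = lgb.LGBMRanker(n_estimators=2)
    # passing validation data without eval_group must raise for rankers
    with pytest.raises(ValueError, match="eval_group"):
        ranker.fit(X, y, group=[10, 10], eval_X=(X,), eval_y=(y,))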

@jameslamb
Collaborator

OK, done adding inline comment threads. One other note... as you probably expect, the R-package / rchk job failures are not related to the changes here. That's tracked in #7113. If it isn't resolved soon, we'll skip that CI check so this can be merged.

@jameslamb
Collaborator

I've updated this to latest master to pull in CI fixes. I'm still hoping another maintainer will find time to review this, sorry for the delay and thanks for your patience.

Collaborator

@StrikerRUS StrikerRUS left a comment


@lorentzenchr Thank you so much for your hard work here! Generally LGTM! But please consider checking some of my minor comments below.

Comment on lines 343 to 345
eval_set : list or None, optional (default=None) (deprecated)
    A list of (X, y) tuple pairs to use as validation sets.
    This is deprecated, use `eval_X` and `eval_y` instead.

Collaborator


I did this in feafe22

And it looks like it's now .. version-deprecated::, not .. deprecated::; the Sphinx changelog says:

Changed in version 9.0: The deprecated directive was renamed to version-deprecated. The previous name is retained as an alias

Collaborator


Oh wait, re-reading that... Sphinx 9.0 is very new (November 2025).

I just pushed 2924d68 switching back to .. deprecated::.

I'll test using Sphinx 9.x in a separate PR, on a LightGBM branch where we can test how it's rendered on readthedocs.
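For reference, the directive in a numpydoc-style docstring ends up looking roughly like this (version number illustrative):

def fit(self, X, y, eval_set=None, eval_X=None, eval_y=None):
    """Build a gradient boosting model from the training set (X, y).

    Parameters
    ----------
    eval_set : list or None, optional (default=None)
        A list of (X, y) tuple pairs to use as validation sets.

        .. deprecated:: 4.7.0
            Use ``eval_X`` and ``eval_y`` instead.
    """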

@jameslamb jameslamb mentioned this pull request Jan 17, 2026
@jameslamb jameslamb requested a review from StrikerRUS January 18, 2026 03:49
@jameslamb
Collaborator

Ok @StrikerRUS I think I've addressed all comments, could you take another look and merge this if you approve?

