confusing error messages when passed tables with invalid types #154

Open
bobaronoff opened this issue Jan 5, 2023 · 7 comments

Comments

@bobaronoff

I received an unusual error message when trying to convert a DataFrame to a DMatrix. I have converted other DataFrame objects without issue, both in the past and currently, and I am not sure what is different about this one. Any clues from the error message, or suggestions on how to troubleshoot, would be helpful.

Here is the error:

julia> typeof(s)
DataFrame

julia> DMatrix(s)
ERROR: ArgumentError: DMatrix requires either an AbstractMatrix or table satisfying the Tables.jl interface
Stacktrace:
 [1] DMatrix(tbl::Matrix{Any}; feature_names::Vector{String}, kw::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ XGBoost ~/.julia/packages/XGBoost/Fyff4/src/dmatrix.jl:249
 [2] DMatrix(tbl::DataFrame; feature_names::Vector{String}, kw::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ XGBoost ~/.julia/packages/XGBoost/Fyff4/src/dmatrix.jl:251
 [3] DMatrix(tbl::DataFrame)
   @ XGBoost ~/.julia/packages/XGBoost/Fyff4/src/dmatrix.jl:244
 [4] top-level scope
   @ REPL[63]:1

@bobaronoff
Author

Additional introspection on the DataFrame object:

julia> Tables.istable(s)
true

julia> Tables.columnnames(s)
35-element Vector{Symbol}:

@ExpandingMan
Collaborator

This is expected behavior but a bad error message. The conversion to a matrix is resulting in something with eltype Any where it's expecting Real.

We probably should try to standardize the cases in which the Any elements get converted. It's certainly reasonable for it to fail in some cases, but it wouldn't surprise me if currently it fails in some cases that are not so reasonable.

Could you list all types present in your input tables? I think in your case it would be

Set(Iterators.map(typeof, Tables.matrix(s)))
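For readers following along, here is a minimal sketch of the same diagnosis using only Base Julia. It assumes a column table modeled as a NamedTuple of vectors; the data and the names `cols` and `nonreal_columns` are hypothetical and not part of XGBoost.jl or Tables.jl. It shows how a single String column promotes the element type of the combined matrix to Any, and how a per-column check points at the culprit. For a DataFrame `s`, iterating `pairs(eachcol(s))` should yield comparable name and column pairs.

```julia
# Hypothetical column table: a NamedTuple of equal-length vectors.
cols = (age = [31.0, 45.0], weight = [70, 82], site = ["A", "B"])

# Concatenating columns with unrelated element types promotes the eltype to Any.
m = hcat(values(cols)...)
println(eltype(m))               # Any

# Illustrative helper: names of the columns whose eltype is not a subtype of Real.
# Note: a column of Union{Missing, Float64} would also be reported by this check.
nonreal_columns(cols) = [name for (name, col) in pairs(cols) if !(eltype(col) <: Real)]

println(nonreal_columns(cols))   # [:site]
```

Checking column element types is cheaper than collecting `typeof` over every cell, and it names the offending column directly.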

@ExpandingMan
Collaborator

On second thought, something else fishy is happening here. This specific error should only happen when !Tables.istable(s).

Also could you please try this on the latest version of XGBoost.jl? From your stack trace it looks like this is at least a few commits old.

@bobaronoff
Author

Thank you for the prompt response! I believe you found the problem. A few of the columns in the DataFrame object contain String values. I need to figure out the source of this on my end (the object is the end result of multiple string-to-number conversions, and these columns seem to have been missed).

I will follow up if this does not correct the issue.

@ExpandingMan
Collaborator

> A few of the columns in DataFrame object contain String type

In that case this was definitely supposed to throw an error but this error message was pretty terrible and confused even me. So this is an actionable issue in that we need an improved error message here. I'd be happy if it hit a MethodError, but this is just downright confusing.
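As one illustration of what a clearer message could look like, the sketch below validates column element types up front and names the offending columns. It is a hypothetical pre-check written in Base Julia over a NamedTuple of vectors; the function name `check_real_columns` and the message wording are assumptions, not the behavior or API of XGBoost.jl.

```julia
# Hypothetical pre-check, not part of XGBoost.jl.
# `cols` is assumed to be a NamedTuple of column vectors.
function check_real_columns(cols)
    # Collect name => eltype for every column that is not Real-valued.
    # Note: columns allowing `missing` are also rejected by this simple check.
    bad = [name => eltype(col) for (name, col) in pairs(cols) if !(eltype(col) <: Real)]
    isempty(bad) && return nothing
    desc = join(("$(n)::$(T)" for (n, T) in bad), ", ")
    throw(ArgumentError("all columns must have a Real element type; offending columns: $desc"))
end

# Usage: a valid table passes silently, an invalid one reports the column by name.
check_real_columns((x = [1.0, 2.0], y = [3, 4]))
try
    check_real_columns((x = [1.0, 2.0], label = ["a", "b"]))
catch err
    println(err.msg)
end
```

A message of this shape tells the user which columns to fix, which the generic "requires either an AbstractMatrix or table" message does not.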

@bobaronoff
Author

The String columns were the issue here. I leave it to your discretion whether any changes to the error message are needed.

Thanks for helping me sort out the issue.

@ExpandingMan
Collaborator

Yes, in my opinion this message is sufficiently confusing that it warrants an open issue, so let's keep this open, though it's not really a high priority to fix it.

@ExpandingMan changed the title from "Unusual error with DMatrix conversion" to "confusing error messages when passed tables with invalid types" on Jan 5, 2023