
Performance Issues with predict Call in XGBoost (v2.1.1) #11054

Open
isslerman opened this issue Dec 4, 2024 · 1 comment

Comments

@isslerman

Hello XGBoosters!

I was looking for the official forum (https://discuss.xgboost.ai/) to discuss this issue, but it seems the link is no longer working. Does anyone know if it's still active or if there's another community hub for discussions?

Now, to the issue:

I’m running a live simulation using an XGBoost model in Python. During the simulation, I call the following code to make predictions:

# Make prediction
prediction = model.predict(dmatrix)

After profiling, I found that this predict call takes ~400ms on average, which is significantly impacting the performance of my application. I’m using XGBoost v2.1.1.

My goal: I need to reduce the prediction time to around 30ms. A 10x speed-up is essential for my use case.

What I’ve considered so far:

  • Downgrading XGBoost: Some have reported that v1.7.6 is faster.
  • Using ONNX Runtime: Converting the model and running it with ONNX Runtime.
  • Switching to Another Language: Implementing predictions in languages like Go or C for better performance.
  • Caching Solutions: Investigating caching mechanisms for repeated predictions.
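On the caching idea, a minimal sketch of what that could look like (the predict_fn and row-keying scheme here are illustrative assumptions, not anything XGBoost provides; this only pays off if identical feature rows actually recur in the simulation):

```python
# Hypothetical sketch of a naive per-row prediction cache.
# Keys the cache on the raw bytes of each feature row.
import numpy as np


class PredictionCache:
    def __init__(self, predict_fn):
        # predict_fn: any callable mapping a (1, n_features) array to predictions,
        # e.g. a bound booster.inplace_predict.
        self._predict_fn = predict_fn
        self._cache = {}

    def predict_row(self, row: np.ndarray) -> float:
        key = row.tobytes()  # rows must be exactly identical to hit the cache
        if key not in self._cache:
            self._cache[key] = float(self._predict_fn(row.reshape(1, -1))[0])
        return self._cache[key]
```

Repeated rows then skip the model call entirely; for continuously varying inputs the cache never hits and only adds overhead.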

Here's a screenshot of the profiling results for reference:

[Screenshot: profiling results, 2024-12-04]


Questions:

  • Has anyone else encountered similar performance issues with model.predict in XGBoost v2.1.1?
  • Are there specific settings or optimizations within XGBoost that could reduce prediction time?
  • Which of the options above would you recommend trying first? Or are there alternative solutions I haven’t considered?
Thanks in advance for any advice or insights!

@trivialfis
Member

trivialfis commented Dec 4, 2024

Could you please extract a snippet of your inference code and the parameters used to train the model, along with data shape and sparsity?

This needs to be resolved case by case. For example, #10882 was caused by the overhead of inspecting a pandas DataFrame, which might be the cause here as well, given that you observe 1.7 being faster.

In addition, inplace_predict can also help, along with GPU-based inputs, etc., depending on your use case.

If you prefer a dedicated inference engine, you might want to try the FIL project (disclosure: I work on the same team that develops cuML).

cc @wphicks
