
Performance Issues with predict Call in XGBoost (v2.1.1) #11054

Open
isslerman opened this issue Dec 4, 2024 · 1 comment

Comments

@isslerman

Hello XGBoosters!

I was looking for the official forum (https://discuss.xgboost.ai/) to discuss this issue, but it seems the link is no longer working. Does anyone know if it's still active or if there's another community hub for discussions?

Now, to the issue:

I’m running a live simulation using an XGBoost model in Python. During the simulation, I call the following code to make predictions:

# Make prediction
prediction = model.predict(dmatrix)

After profiling, I found that this predict call takes ~400ms on average, which is significantly impacting the performance of my application. I’m using XGBoost v2.1.1.

My goal: I need to reduce the prediction time to around 30ms. A 10x speed-up is essential for my use case.

What I’ve considered so far:

  • Downgrading XGBoost: Some have reported that v1.7.6 is faster.
  • Using ONNX Runtime: Converting the model and running it with ONNX Runtime.
  • Switching to Another Language: Implementing predictions in languages like Go or C for better performance.
  • Caching Solutions: Investigating caching mechanisms for repeated predictions.
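On the caching idea, a minimal sketch of what that could look like (the predict_fn and row-keying scheme here are illustrative assumptions, not anything XGBoost provides; this only pays off if identical feature rows actually recur in the simulation):

```python
# Hypothetical sketch of a naive per-row prediction cache.
# Keys the cache on the raw bytes of each feature row.
import numpy as np


class PredictionCache:
    def __init__(self, predict_fn):
        # predict_fn: any callable mapping a (1, n_features) array to predictions,
        # e.g. a bound booster.inplace_predict.
        self._predict_fn = predict_fn
        self._cache = {}

    def predict_row(self, row: np.ndarray) -> float:
        key = row.tobytes()  # rows must be exactly identical to hit the cache
        if key not in self._cache:
            self._cache[key] = float(self._predict_fn(row.reshape(1, -1))[0])
        return self._cache[key]
```

Repeated rows then skip the model call entirely; for continuously varying inputs the cache never hits and only adds overhead.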

Here's a screenshot of the profiling results for reference:

[Screenshot: profiling results, 2024-12-04]


Questions:

  • Has anyone else encountered similar performance issues with model.predict in XGBoost v2.1.1?
  • Are there specific settings or optimizations within XGBoost that could reduce prediction time?
  • Which of the options above would you recommend trying first? Or are there alternative solutions I haven’t considered?
Thanks in advance for any advice or insights!

@trivialfis
Member

trivialfis commented Dec 4, 2024

Could you please extract a snippet of your inference code and the parameters used to train the model, along with data shape and sparsity?

This needs to be resolved case by case. For example, #10882 was caused by the overhead of inspecting a pandas DataFrame, which might be the cause here as well, given that you observe 1.7 being faster.

In addition, inplace_predict can also help, along with GPU-based inputs, etc., depending on your use case.

If you prefer a dedicated inference engine, you might want to try the FIL project (disclosure: I work on the same team that develops cuML).

cc @wphicks
