NGBoost: Natural Gradient Boosting for Probabilistic Prediction


ngboost is a Python library that implements Natural Gradient Boosting, as described in "NGBoost: Natural Gradient Boosting for Probabilistic Prediction". It is built on top of Scikit-Learn and is designed to be scalable and modular with respect to the choice of proper scoring rule, distribution, and base learner. A didactic introduction to the methodology underlying NGBoost is available in this slide deck.

Installation

via pip

pip install --upgrade ngboost

via conda-forge

conda install -c conda-forge ngboost

Usage

Probabilistic regression example on the Boston housing dataset:

from ngboost import NGBRegressor

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Load the Boston housing dataset
data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep=r"\s+", skiprows=22, header=None)
X = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
Y = raw_df.values[1::2, 2]

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2)

ngb = NGBRegressor().fit(X_train, Y_train)
Y_preds = ngb.predict(X_test)    # point predictions
Y_dists = ngb.pred_dist(X_test)  # full predictive distributions

# test mean squared error
test_MSE = mean_squared_error(Y_preds, Y_test)
print('Test MSE', test_MSE)

# test negative log-likelihood
test_NLL = -Y_dists.logpdf(Y_test).mean()
print('Test NLL', test_NLL)

Details on available distributions, scoring rules, learners, tuning, and model interpretation are available in our user guide, which also includes numerous usage examples and information on how to add new distributions or scores to NGBoost.

License

Apache License 2.0.

Reference

Tony Duan, Anand Avati, Daisy Yi Ding, Khanh K. Thai, Sanjay Basu, Andrew Y. Ng, Alejandro Schuler. 2019. NGBoost: Natural Gradient Boosting for Probabilistic Prediction. arXiv:1910.03225
