-
@sokol11 Thanks for your interest in the project. You can sample from the fitted/predicted distribution, but this requires training a model first. You can use the time information as features. Here is a small example:
# Imports
import pandas as pd
import xgboost as xgb  # explicit import for xgb.DMatrix used below
from xgboostlss.model import *
from xgboostlss.distributions.distribution_utils import DistributionClass
from xgboostlss.distributions import *
from xgboostlss.distributions.LogNormal import *
import matplotlib.pyplot as plt
import seaborn as sns
# Read Data
airp = pd.read_csv("https://raw.githubusercontent.com/Manishms18/Air-Passengers-Time-Series-Analysis/master/Data/AirPassengers.csv")
# Extract month and year features
years = []
months = []
for item in airp["Month"]:
    year, month = item.split("-")
    years.append(int(year))
    months.append(int(month))
airp["month_feat"] = months
airp["year_feat"] = years
airp["time_feat"] = airp.index + 1
target = airp["Passengers"].values
features = airp.filter(regex="_feat")
# Create xgb.DMatrix
dtrain = xgb.DMatrix(features, label=target)
# Specify Distribution
xgblss = XGBoostLSS(LogNormal(response_fn="softplus"))
# Train Model
params = {"eta": 0.3}
xgblss.train(params,
             dtrain,
             num_boost_round=20)
# Sample from predicted distribution
dist_samples = xgblss.predict(dtrain,
                              pred_type="samples",
                              n_samples=100,
                              seed=123)
# Plot Samples
sns.lineplot(data=dist_samples, legend=None)
plt.title("Samples from fitted distribution")
plt.show()
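To make the sampling step above a bit more concrete, here is a minimal standard-library sketch of what drawing samples from a per-time-step LogNormal amounts to. It does not use XGBoostLSS and is not how the library implements it. The function `draw_lognormal_samples` and the `locs`/`scales` values are made up for illustration and stand in for the parameters a fitted model would predict for each row. It also assumes the result is laid out with one row per time step and one column per sample.

```python
import random


def draw_lognormal_samples(locs, scales, n_samples=100, seed=123):
    """Draw n_samples values per time step from LogNormal(loc, scale).

    locs and scales are hypothetical per-time-step parameters, standing in
    for what a fitted distributional model would predict for each row.
    Returns one list of n_samples draws per time step.
    """
    rng = random.Random(seed)  # local generator, so results are reproducible
    return [
        [rng.lognormvariate(mu, sigma) for _ in range(n_samples)]
        for mu, sigma in zip(locs, scales)
    ]


# Hypothetical parameters for three time steps (mean and std dev on the log scale).
locs = [4.7, 4.8, 4.9]
scales = [0.05, 0.05, 0.05]
samples = draw_lognormal_samples(locs, scales, n_samples=5)

# Taking column j across all rows gives one synthetic series of length len(locs).
synthetic_series_0 = [row[0] for row in samples]
print(len(samples), len(samples[0]), len(synthetic_series_0))
```

In this sketch each time step is drawn independently, so a single column is a plausible path only in terms of its marginal distributions. Any dependence between time steps would have to come from the features.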
-
XGBoostLSS is a wrapper around xgboost and keeps its full functionality, so you can specify GPU usage as in any other xgboost setting. However, whether it actually runs on the GPU depends on whether xgboost's GPU training supports custom loss functions.
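As a rough sketch of what "as in any other xgboost setting" could look like, the helper below adds xgboost's GPU options to a params dict. `gpu_params` is a hypothetical helper, not part of XGBoostLSS or xgboost. The keys are xgboost's own: `device="cuda"` with `tree_method="hist"` from xgboost 2.0 onwards, and `tree_method="gpu_hist"` on older releases. It is an untested assumption here that the resulting dict can be passed to `xgblss.train` in place of `params`.

```python
def gpu_params(base_params, xgboost_version):
    """Return a copy of base_params with xgboost GPU training options added.

    Hypothetical helper for illustration only. xgboost_version is a version
    string such as "2.0.3" (for example, the value of xgboost.__version__).
    """
    major = int(xgboost_version.split(".")[0])
    params = dict(base_params)  # copy, so the caller's dict is not modified
    if major >= 2:
        # xgboost 2.0+ selects the GPU through the `device` parameter.
        params["tree_method"] = "hist"
        params["device"] = "cuda"
    else:
        # Older releases used a dedicated GPU tree method.
        params["tree_method"] = "gpu_hist"
    return params


params = gpu_params({"eta": 0.3}, "2.0.3")
print(params)
```

If training fails or silently falls back to the CPU, that would point to the custom-loss limitation mentioned above.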
-
Hi. I have a multivariate time series dataset, and I'm trying to find a way to efficiently generate synthetic data based on it, i.e., I'd like to fit a multivariate time series model to my dataset and then sample from it. It sounds like XGBoostLSS can be used in that way, provided I have the associated x variables. Is there a way to use the library to fit and sample a model based on the multivariate y data only? Thank you!