⏳ time-series-forecasting-wiki

This repository contains a series of analyses, transforms and forecasting models frequently used when dealing with time series. The aim of this repository is to showcase how to model time series from scratch. For this we use a real use-case dataset (the Beijing air pollution dataset) to avoid the perfect use cases, far from reality, that are often present in this type of tutorial. If you want to rerun the notebooks, make sure you install all the necessary dependencies (see the Guide).

You can find a more detailed table of contents in the main notebook.

📂 Dataset

The dataset used is the Beijing air quality public dataset. It contains pollution data from 2014 to 2019 sampled every 10 minutes, along with extra weather features such as pressure, temperature, etc. We decided to resample the dataset to a daily frequency, both for easier data handling and to stay closer to a real use case (no one would build a model to predict pollution 10 minutes ahead; 1 day ahead looks more realistic). In this case the series is already stationary, with some small seasonalities which change every year.

In order to obtain an exact copy of the dataset used in this tutorial, please run the script datasets/download_datasets.py, which will automatically download the dataset and preprocess it for you.
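The daily resampling described above can be sketched without any dependencies. This is only an illustration, not the repository's preprocessing script: the function name `resample_daily`, the sample readings, and the choice of the mean as the daily aggregate are all assumptions.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def resample_daily(records):
    """Average (timestamp, value) readings per calendar day.

    `records` is an iterable of (datetime, float) pairs, e.g. 10-minute samples.
    Returns a list of (date, daily_mean) pairs sorted by date.
    """
    buckets = defaultdict(list)
    for timestamp, value in records:
        buckets[timestamp.date()].append(value)
    return [(day, mean(values)) for day, values in sorted(buckets.items())]


# Hypothetical 10-minute pollution readings spanning two days.
readings = [
    (datetime(2014, 1, 1, 0, 0), 10.0),
    (datetime(2014, 1, 1, 0, 10), 20.0),
    (datetime(2014, 1, 2, 0, 0), 30.0),
    (datetime(2014, 1, 2, 0, 10), 50.0),
]
daily = resample_daily(readings)
```

With pandas, the same idea is usually written as `df.resample("D").mean()` on a DataFrame that has a DatetimeIndex.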

📚 Analysis and transforms

- Time series decomposition
  - Level
  - Trend
  - Seasonality
  - Noise
- Stationarity
  - ACF and PACF plots
  - Rolling mean and std
  - Dickey-Fuller test
- Making our time series stationary
  - Difference transform
  - Log scale
  - Smoothing
  - Moving average
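The transforms listed under "Making our time series stationary" are short enough to sketch in plain Python. The helper names and the sample series below are hypothetical and chosen only for illustration; the notebooks may implement these steps differently.

```python
import math


def difference(series, lag=1):
    """Return series[t] - series[t - lag]; the result is `lag` items shorter."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def log_transform(series):
    """Natural log of each value; requires strictly positive inputs."""
    return [math.log(x) for x in series]


def moving_average(series, window):
    """Trailing mean over `window` points; the first window - 1 points are dropped."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]


# Hypothetical series with a linear trend: its first differences are constant,
# which is why differencing removes this kind of trend.
trend = [2.0, 4.0, 6.0, 8.0, 10.0]
diffed = difference(trend)
smoothed = moving_average(trend, 3)
logged = log_transform([1.0, 10.0, 100.0])
```

For the Dickey-Fuller test itself, statsmodels provides `statsmodels.tsa.stattools.adfuller`; a small p-value is evidence against a unit root, that is, in favour of stationarity.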

📐 Models tested

Autoregression (AR)
Moving Average (MA)
Autoregressive Moving Average (ARMA)
Autoregressive Integrated Moving Average (ARIMA)
Seasonal Autoregressive Integrated Moving Average (SARIMA)
Bayesian regression
Lasso
SVM
Random forest
Nearest neighbors
XGBoost
LightGBM
Prophet
Long short-term memory with TensorFlow (LSTM)
DeepAR
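To make the first entry in the list concrete, here is a hand-rolled AR(1) fit by ordinary least squares. It is a minimal sketch under stated assumptions, not the code used in the notebooks: the function names `fit_ar1` and `forecast_ar1` and the noise-free sample series are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x_prev = series[:-1]
    x_next = series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    var = sum((a - mean_prev) ** 2 for a in x_prev)
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi


def forecast_ar1(last_value, c, phi, steps):
    """Iterate the fitted recursion forward `steps` times."""
    forecasts = []
    value = last_value
    for _ in range(steps):
        value = c + phi * value
        forecasts.append(value)
    return forecasts


# Hypothetical noise-free series generated by x[t] = 2 + 0.5 * x[t-1].
series = [10.0, 7.0, 5.5, 4.75, 4.375]
c, phi = fit_ar1(series)
next_two = forecast_ar1(series[-1], c, phi, steps=2)
```

Higher-order AR, MA and ARIMA models follow the same idea with more lagged terms, and in practice are usually fitted with a dedicated library rather than by hand.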

🔍 Forecasting results

We divide our results according to whether the extra feature columns, such as temperature or pressure, were used by the model, since this makes a large difference in the metrics and represents two different scenarios. The metrics used were:

Evaluation Metrics

Mean Absolute Error (MAE)
Mean Absolute Percentage Error (MAPE)
Root Mean Squared Error (RMSE)
Coefficient of determination (R2)
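For reference, the four metrics can be written in a few lines of plain Python. This is a sketch using the standard textbook definitions; the sample values are made up, and treating MAPE as a fraction rather than a percentage is an assumption based on the magnitude of the values reported in this README.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction; undefined if a true value is 0."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observations and predictions.
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]
scores = {
    "mae": mae(y_true, y_pred),
    "rmse": rmse(y_true, y_pred),
    "mape": mape(y_true, y_pred),
    "r2": r2(y_true, y_pred),
}
```

Note that R2 is 0 for a model that always predicts the mean of the evaluated values and becomes negative for anything worse than that.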

Model	MAE	RMSE	MAPE	R2
EnsembleXG+TF	27.64	40.23	0.42	0.76
EnsembleLIGHT+TF	27.34	39.27	0.42	0.77
EnsembleXG+LIGHT+TF	27.63	39.69	0.44	0.76
EnsembleXG+LIGHT	29.95	42.7	0.52	0.73
Randomforest tuned	40.79	53.2	0.9	0.57
SVM RBF GRID SEARCH	38.57	50.34	0.78	0.62
DeepAR	71.37	103.97	0.96	-0.63
Tensorflow simple LSTM	30.13	43.08	0.42	0.72
Prophet multivariate	38.25	50.45	0.74	0.62
Kneighbors	57.05	80.39	1.08	0.03
SVM RBF	40.81	56.03	0.79	0.53
Lightgbm	30.21	42.76	0.52	0.72
XGBoost	32.13	45.59	0.56	0.69
Randomforest	45.84	59.45	1.03	0.47
Lasso	39.24	54.58	0.71	0.55
BayesianRidge	39.24	54.63	0.71	0.55
Prophet univariate	61.33	83.64	1.26	-0.05
AutoSARIMAX (1, 0, 1),(0, 0, 0, 6)	51.29	71.49	0.91	0.23
SARIMAX	51.25	71.33	0.91	0.23
AutoARIMA (0, 0, 3)	47.01	64.71	1.0	0.37
ARIMA	48.25	66.39	1.06	0.34
ARMA	47.1	64.86	1.01	0.37
MA	49.04	66.2	1.05	0.34
AR	47.24	65.32	1.02	0.36
HWES	52.96	74.67	1.11	0.16
SES	52.96	74.67	1.11	0.16
Yesterday's value	52.67	74.52	1.04	0.16
Naive mean	59.38	81.44	1.32	-0.0
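The two baseline rows at the bottom of the table and the ensemble rows at the top can be illustrated with a short sketch. The README does not say how these were computed, so everything here is an assumption: the naive mean is taken over the training data, "yesterday's value" is a one-step persistence forecast, and the ensembles are treated as an unweighted average of the member predictions. All names and numbers are hypothetical.

```python
def persistence_forecast(history, actuals):
    """Predict each day with the previous day's observed value."""
    return [history[-1]] + list(actuals[:-1])


def mean_forecast(history, n_steps):
    """Predict the training mean for every future step."""
    train_mean = sum(history) / len(history)
    return [train_mean] * n_steps


def average_ensemble(*prediction_lists):
    """Unweighted mean of several models' predictions, point by point."""
    return [sum(values) / len(values) for values in zip(*prediction_lists)]


# Hypothetical daily pollution values.
train = [50.0, 70.0, 60.0]
test = [80.0, 65.0, 90.0]

yesterday = persistence_forecast(train, test)
flat = mean_forecast(train, len(test))
blend = average_ensemble(yesterday, flat)
```

Baselines like these give a floor for comparison: a model that cannot beat them is not adding value on this dataset.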

Additional resources and literature

Models not tested but that are gaining popularity

There are several models we have not tried in this tutorial, as they come from the academic world and their implementations are not 100% reliable, but they are worth mentioning:

Neural basis expansion analysis for interpretable time series forecasting (N-BEATS)
ES-RNN


Adhikari, R., & Agrawal, R. K. (2013). An introductory study on time series modeling and forecasting	[1]
Introduction to Time Series Forecasting With Python	[2]
Deep Learning for Time Series Forecasting	[3]
The Complete Guide to Time Series Analysis and Forecasting	[4]
How to Decompose Time Series Data into Trend and Seasonality	[5]

Contributing

Want to see another model tested? Do you have anything to add or fix? I'll be happy to talk about it! Open an issue/PR :)

japjeet26/time-series-forecasting-with-python
