diff --git a/README.md b/README.md
index 60773b56..4367571c 100644
--- a/README.md
+++ b/README.md
@@ -120,6 +120,7 @@ In this repository you can find the several GAN architectures that are used to c
 
 ### Sequential data
  - [TimeGAN](https://papers.nips.cc/paper/2019/file/c9efe5f26cd17ba6216bbe2a7d26d490-Paper.pdf)
+ - [DoppelGANger](https://dl.acm.org/doi/pdf/10.1145/3419394.3423643)
 
 ## Contributing
 We are open to collaboration! If you want to start contributing you only need to:
diff --git a/docs/examples/doppelganger_example.md b/docs/examples/doppelganger_example.md
new file mode 100644
index 00000000..eedc2c53
--- /dev/null
+++ b/docs/examples/doppelganger_example.md
@@ -0,0 +1,21 @@
+# Synthesize time-series data
+
+**Using *DoppelGANger* to generate synthetic time-series data:**
+
+Although tabular data may be the most frequently discussed type of data, a great number of real-world domains, from traffic and daily trajectories to stock prices and energy consumption patterns, produce **time-series data**, which introduces several aspects of complexity to synthetic data generation.
+
+Time-series data is structured sequentially, with observations **ordered chronologically** based on their associated timestamps or time intervals. It explicitly incorporates the temporal aspect, allowing for the analysis of trends, seasonality, and other dependencies over time.
+
+DoppelGANger is a model that uses a Generative Adversarial Network (GAN) framework to generate synthetic time-series data by learning the underlying temporal dependencies and characteristics of the original data:
+
+- 📑 **Paper:** [Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions](https://dl.acm.org/doi/pdf/10.1145/3419394.3423643)
+
+Here’s an example of how to synthesize time-series data with DoppelGANger using the [Yahoo Stock Price](https://www.kaggle.com/datasets/arashnic/time-series-forecasting-with-yahoo-stock-price) dataset:
+
+
+```python
+--8<-- "examples/timeseries/stock_doppelganger.py"
+```
+
+
+
diff --git a/docs/index.md b/docs/index.md
index 7c8b28a2..4bea54ac 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -56,3 +56,4 @@ The following architectures are currently supported:
 - [CWGAN-GP](https://cameronfabbri.github.io/papers/conditionalWGAN.pdf) (Conditional Wassertein GAN with Gradient Penalty)
 - [CTGAN](https://arxiv.org/pdf/1907.00503.pdf) (Conditional Tabular GAN)
 - [TimeGAN](https://papers.nips.cc/paper/2019/file/c9efe5f26cd17ba6216bbe2a7d26d490-Paper.pdf) (specifically for *time-series* data)
+- [DoppelGANger](https://dl.acm.org/doi/pdf/10.1145/3419394.3423643) (specifically for *time-series* data)
diff --git a/docs/reference/api/synthesizers/timeseries/doppelganger.md b/docs/reference/api/synthesizers/timeseries/doppelganger.md
new file mode 100644
index 00000000..7a8c1696
--- /dev/null
+++ b/docs/reference/api/synthesizers/timeseries/doppelganger.md
@@ -0,0 +1,2 @@
+
+::: ydata_synthetic.synthesizers.timeseries.doppelganger.model.DoppelGANger
\ No newline at end of file
diff --git a/examples/timeseries/stock_doppelganger.py b/examples/timeseries/stock_doppelganger.py
index ec8d6bae..a2d47bce 100644
--- a/examples/timeseries/stock_doppelganger.py
+++ b/examples/timeseries/stock_doppelganger.py
@@ -28,6 +28,7 @@ else:
     model_dop_gan = TimeSeriesSynthesizer(modelname='doppelganger',
         model_parameters=model_args)
     model_dop_gan.fit(stock_data, train_args, num_cols=["Open", "High", "Low", "Close", "Adj_Close", "Volume"])
+    model_dop_gan.save('doppelganger_stock')
 
 # Generating new synthetic samples
 synth_data = model_dop_gan.sample(n_samples=500)
diff --git a/mkdocs.yml b/mkdocs.yml
index bb373ef9..a922cb97 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -21,6 +21,7 @@ nav:
       - CWGAN-GP: "examples/cwgangp_example.md"
     - Generate Time-Series Data:
       - TimeGAN: "examples/timegan_example.md"
+      - DoppelGANger: "examples/doppelganger_example.md"
     - Frequently Asked Questions: "examples/faqs.md"
   - Integrations:
     - Great Expectations: "integrations/gx_integration.md"
@@ -43,6 +44,7 @@ nav:
         - WGAN: 'reference/api/synthesizers/regular/wgan.md'
       - Timeseries:
         - TimeGAN: 'reference/api/synthesizers/timeseries/timegan.md'
+        - DoppelGANger: 'reference/api/synthesizers/timeseries/doppelganger.md'
     - Preprocessing:
       - BaseProcessor: 'reference/api/preprocessing/base.md'
       - Regular: