This repository contains the implementation of the paper STCN: Stochastic Temporal Convolutional Networks.
The model is implemented in Python 3.5 using the TensorFlow library. In particular, the experiments were conducted with CUDA 9, cuDNN 7.2 and TensorFlow 1.10.1. Along with the implementation, we release our pretrained models achieving state-of-the-art performance. You can reproduce the reported numbers and results by using the evaluation scripts explained below.
Python dependencies are listed in the `pip_packages` file. You can install them by running `pip install -r pip_packages`.
You should download the Blizzard, TIMIT and IAM-OnDB datasets. We compiled the preprocessing steps applied in previous works. The scripts for the Blizzard and TIMIT datasets are in the `experiments_speech` directory; we will provide the script for the IAM-OnDB handwriting dataset soon. The Deepwriting dataset is already provided. Instructions are given in the `run_blizzard_data_scripts.py` and `run_timit_data_scripts.py` files.
After you download the corresponding dataset, you only need to set the `DATA_PATH` variable in the script and run it.
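For example, for Blizzard this amounts to editing a single line before running the script (the path below is a placeholder, not a real location):

```python
# In run_blizzard_data_scripts.py (likewise for run_timit_data_scripts.py):
DATA_PATH = "[PATH-TO]/blizzard"  # point this at the downloaded raw dataset
```

and then invoking `python run_blizzard_data_scripts.py`.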
After the preprocessing steps are applied, the data is transformed into a numpy representation required by our training pipeline. It is basically a dictionary with `samples`, `targets` and data statistics such as mean and variance, and it can be inspected by loading it via `numpy.load`.
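A quick way to check the contents is a short inspection script like the one below (a sketch; `samples` and `targets` come from the description above, while the exact names of the statistics entries are assumptions):

```python
import numpy as np

# Load a preprocessed dataset and list what it contains.
# allow_pickle is only needed on newer NumPy versions if Python objects are stored.
data = np.load("[PATH-TO]/deepwriting_v2_training.npz", allow_pickle=True)

print(data.files)  # e.g., ['samples', 'targets', 'mean', 'var'] (key names assumed)
for key in data.files:
    entry = data[key]
    print(key, entry.shape, entry.dtype)
```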
We use configuration files to pass model and experiment parameters. You can find them in the `experiments_x/config_y` folders, stored in JSON format. The provided configurations are the same as the ones we used. However, due to randomness, you may get slightly better or worse results.
You can train a model by passing a JSON configuration file. Most entries in the configuration are self-explanatory, and the rest are described in the code (see the model constructors in `source/tf_models.py`). You should also have a look at `source/configuration.py`, which creates the command-line arguments and parses the JSON configuration files.
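For instance, a configuration can be inspected or tweaked programmatically before training (a minimal sketch; `batch_size` is a hypothetical field name used only for illustration, check the JSON files for the actual schema):

```python
import json

# Load an existing configuration, change one field and save a variant.
with open("./config_deepwriting/stcn_dense_gmm.json") as f:
    config = json.load(f)

print(sorted(config.keys()))  # see which parameters are exposed
config["batch_size"] = 32     # hypothetical field name, for illustration only

with open("./config_deepwriting/stcn_dense_gmm_variant.json", "w") as f:
    json.dump(config, f, indent=2)
```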
For example, the following command trains the STCN-dense model with GMM outputs on the Deepwriting dataset:
```bash
python run_training.py \
    --experiment_name stcn_dense_gmm \
    --json_file ./config_deepwriting/stcn_dense_gmm.json \
    --training_data [PATH-TO]/deepwriting_v2_training.npz \
    --validation_data [PATH-TO]/deepwriting_v2_validation.npz \
    --save_dir [PATH-TO]/runs \
    --eval_dir [PATH-TO]/evaluation_runs \
    --pp_zero_mean_normalization \
    --run_evaluation_after_training
```
We provide example commands for the other datasets in the respective `run_training.py` files. Please check them, since the normalization flags may differ.
We apply early stopping based on the validation ELBO performance. When training ends, the evaluation script is called automatically if the `run_evaluation_after_training` flag is passed.
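In essence, the early-stopping logic boils down to the following (a self-contained sketch; the patience value and the helper functions are stand-ins, not the actual implementation in this repository):

```python
import random

def train_one_epoch():
    pass                    # stand-in for one pass over the training data

def validation_elbo():
    return random.random()  # stand-in for evaluating the ELBO on the validation set

best_elbo = float("-inf")
patience, waited = 5, 0     # the patience value used in the code may differ

for epoch in range(100):
    train_one_epoch()
    elbo = validation_elbo()
    if elbo > best_elbo:    # higher ELBO is better
        best_elbo, waited = elbo, 0
        # the real pipeline would checkpoint the model here
    else:
        waited += 1
        if waited >= patience:
            break           # stop: no improvement for `patience` consecutive epochs
```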
You can evaluate a pre-trained model by using the following command:
```bash
python run_evaluation.py \
    --seed [if not passed, the training seed is used] \
    --quantitative \
    --qualitative \
    --model_id [unique timestamp ID in the experiment folder name, e.g., 1549805923] \
    --save_dir [PATH-TO]/runs \
    --eval_dir [PATH-TO]/evaluation_runs \
    --test_data [PATH-TO]/blizzard_stcn_test.npz
```
Similarly, our pretrained models can be downloaded and evaluated by simply passing the unique model ID.
If you have questions about the code, feel free to create a GitHub issue or contact me.
If you use this code or dataset in your research, please cite us as follows:
```bibtex
@inproceedings{aksan2018stcn,
  title={{STCN}: Stochastic Temporal Convolutional Networks},
  author={Emre Aksan and Otmar Hilliges},
  booktitle={International Conference on Learning Representations},
  year={2019},
  url={https://openreview.net/forum?id=HkzSQhCcK7},
}
```