Warning: This code is UNOFFICIAL.
Paper: Attend and Diagnose: Clinical Time Series Analysis Using Attention Models
To run this code, you need to download a dataset and write your own experiment code.
from comet_ml import Experiment
from SAnD.core.model import SAnD
from SAnD.utils.trainer import NeuralNetworkClassifier
model = SAnD( ... )
clf = NeuralNetworkClassifier( ... )
clf.fit( ... )
git clone https://github.com/khirotaka/SAnD.git
- Python 3.6
- Comet.ml
- PyTorch v1.1.0 or later
Here's a brief overview of how you can use this project to solve a classification task.
First, create an empty directory.
In this example, I'll call it "playground".
Run the git init and git submodule add commands to register the SAnD project as a submodule.
$ mkdir playground/
$ cd playground/
$ git init
$ git submodule add https://github.com/khirotaka/SAnD.git
Now you're ready to use SAnD in your project.
Prepare the data set of your choice.
Remember that the SAnD model expects three-dimensional input of shape [N, seq_len, features].
This example uses torch.randn() to build a pseudo dataset.
from comet_ml import Experiment
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader
from SAnD.core.model import SAnD
from SAnD.utils.trainer import NeuralNetworkClassifier
x_train = torch.randn(1024, 256, 23) # [N, seq_len, features]
x_val = torch.randn(128, 256, 23) # [N, seq_len, features]
x_test = torch.randn(512, 256, 23) # [N, seq_len, features]
# The upper bound of torch.randint is exclusive, so 10 yields labels 0..9 (10 classes).
y_train = torch.randint(0, 10, (1024, ))
y_val = torch.randint(0, 10, (128, ))
y_test = torch.randint(0, 10, (512, ))
train_ds = TensorDataset(x_train, y_train)
val_ds = TensorDataset(x_val, y_val)
test_ds = TensorDataset(x_test, y_test)
train_loader = DataLoader(train_ds, batch_size=128)
val_loader = DataLoader(val_ds, batch_size=128)
test_loader = DataLoader(test_ds, batch_size=128)
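Real recordings rarely share one length, so they usually have to be padded or truncated to a common seq_len before they fit the [N, seq_len, features] layout. The following is a minimal pure-Python sketch of that step. The helper names pad_or_truncate and to_batch, and the choice of zero padding at the end, are assumptions made for illustration and are not part of SAnD.

```python
from typing import List

Sequence = List[List[float]]  # one recording: [time_steps, features]


def pad_or_truncate(seq: Sequence, seq_len: int, n_features: int,
                    pad_value: float = 0.0) -> Sequence:
    """Return a copy of seq with exactly seq_len time steps."""
    for step in seq:
        if len(step) != n_features:
            raise ValueError(f"expected {n_features} features, got {len(step)}")
    # Drop time steps beyond seq_len, then fill the remainder with pad_value.
    clipped = [list(step) for step in seq[:seq_len]]
    padding = [[pad_value] * n_features for _ in range(seq_len - len(clipped))]
    return clipped + padding


def to_batch(seqs: List[Sequence], seq_len: int, n_features: int) -> List[Sequence]:
    """Stack recordings into a nested list shaped [N, seq_len, features]."""
    return [pad_or_truncate(s, seq_len, n_features) for s in seqs]


# Two toy recordings with 2 features: one too short, one too long.
short = [[1.0, 2.0]]
long = [[float(i), float(i)] for i in range(5)]
batch = to_batch([short, long], seq_len=3, n_features=2)
print(len(batch), len(batch[0]), len(batch[0][0]))  # prints: 2 3 2
```

Calling torch.tensor(batch) on the result gives a float tensor of shape [2, 3, 2], which can be wrapped in a TensorDataset the same way as the random tensors above.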
Note:
In my experience, SAnD seems to work better on problems with a large number of features.
Finally, train the SAnD model using the included NeuralNetworkClassifier.
Of course, you can also use a well-known training tool such as PyTorch Lightning.
The included NeuralNetworkClassifier depends on Comet.ml's logging service.
in_feature = 23
seq_len = 256
n_heads = 32
factor = 32
num_class = 10
num_layers = 6
clf = NeuralNetworkClassifier(
    SAnD(in_feature, seq_len, n_heads, factor, num_class, num_layers),
    nn.CrossEntropyLoss(),
    optim.Adam,
    optimizer_config={"lr": 1e-5, "betas": (0.9, 0.98), "eps": 4e-09, "weight_decay": 5e-4},
    experiment=Experiment()
)
# training network
clf.fit(
    {"train": train_loader,
     "val": val_loader},
    epochs=200
)
# evaluating
clf.evaluate(test_loader)
# save
clf.save_to_file("save_params/")
For the actual task, choose the appropriate hyperparameters for your model and optimizer.
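If you would rather not use the included trainer, a plain PyTorch loop is another option. The following is a minimal sketch under stated assumptions: the model is a stand-in (a flatten plus linear layer, NOT SAnD) so the snippet runs without the submodule, and the function name train_one_epoch and all hyperparameters are illustrative only.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset


def train_one_epoch(model, loader, criterion, optimizer):
    """Run one pass over loader and return the mean training loss."""
    model.train()
    total_loss, n_samples = 0.0, 0
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        # Weight each batch loss by its size so the mean is per sample.
        total_loss += loss.item() * x.size(0)
        n_samples += x.size(0)
    return total_loss / n_samples


torch.manual_seed(0)
seq_len, in_feature, num_class = 16, 4, 3

# Stand-in classifier (an assumption for this sketch, NOT the SAnD model).
# It accepts the same [N, seq_len, features] input, so SAnD(...) can be
# swapped in here.
model = nn.Sequential(nn.Flatten(), nn.Linear(seq_len * in_feature, num_class))

x = torch.randn(32, seq_len, in_feature)
y = torch.randint(0, num_class, (32,))
loader = DataLoader(TensorDataset(x, y), batch_size=8)

optimizer = optim.Adam(model.parameters(), lr=1e-3)
losses = [train_one_epoch(model, loader, nn.CrossEntropyLoss(), optimizer)
          for _ in range(5)]
print(losses)
```

The loop only handles training. Validation, checkpointing, and logging would have to be added by hand, which is what the included NeuralNetworkClassifier or a tool like PyTorch Lightning takes care of.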
There are two ways to use SAnD in a regression task.
- Specify the number of output dimensions in num_class.
- Inherit class SAnD and overwrite ClassificationModule with RegressionModule.
The following code shows the second approach.
from SAnD.core.model import SAnD
from SAnD.core.modules import RegressionModule
class RegSAnD(SAnD):
    def __init__(self, *args, **kwargs):
        super(RegSAnD, self).__init__(*args, **kwargs)
        # These values are read from kwargs, so pass them as keyword arguments;
        # kwargs.get() returns None for anything passed positionally or omitted.
        d_model = kwargs.get("d_model")
        factor = kwargs.get("factor")
        output_size = kwargs.get("n_class")    # output_size
        self.clf = RegressionModule(d_model, factor, output_size)

model = RegSAnD(
    input_features=..., seq_len=..., n_heads=..., factor=...,
    n_class=..., n_layers=..., d_model=...
)
ClassificationModule and RegressionModule are almost identical, so the first approach is recommended.
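For the first approach, the main changes on the training side are the loss function and the shape of the targets. The following is a minimal sketch, assuming the model's final layer returns raw values of shape [N, n_class]. The network here is a stand-in (NOT SAnD) so the snippet runs on its own, and nn.MSELoss is only one possible regression loss.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
seq_len, in_feature, output_size = 16, 4, 2

# Stand-in network (an assumption for this sketch, NOT SAnD). With the first
# approach you would build SAnD with n_class=output_size instead.
model = nn.Sequential(nn.Flatten(), nn.Linear(seq_len * in_feature, output_size))

x = torch.randn(8, seq_len, in_feature)   # [N, seq_len, features]
y = torch.randn(8, output_size)           # float targets, [N, output_size]

pred = model(x)
loss = nn.MSELoss()(pred, y)              # regression loss instead of cross-entropy
loss.backward()
print(pred.shape, loss.item())
```

Note that the targets are float tensors with the same shape as the prediction, unlike the integer class labels used with nn.CrossEntropyLoss.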
Please let me know if my code has been used in a product or in published research.
It's very encouraging :)
Hirotaka Kawashima (川島 寛隆)
Copyright (c) 2019 Hirotaka Kawashima
Released under the MIT license