Temporal dependence characterizes time series data: observations close in time tend to be similar, unlike the independent rows of cross-sectional data.
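A quick way to see this: the lag-1 autocorrelation of an ordered series is high, but vanishes once the ordering is destroyed. A small NumPy sketch (toy data, names illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(size=500))   # random walk: strongly temporally dependent
x_shuffled = rng.permutation(x)       # same values, temporal ordering destroyed

def lag1_autocorr(v: np.ndarray) -> float:
    v = (v - v.mean()) / v.std()
    return float(np.mean(v[:-1] * v[1:]))

print(lag1_autocorr(x))            # close to 1
print(lag1_autocorr(x_shuffled))   # close to 0
```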
Missing mechanisms (Rubin, 1976):
- MCAR (missing completely at random)
- MAR (missing at random)
- MNAR (missing not at random)
Missing patterns:
- point missing
- subsequence missing
- block missing
Imputation / handling methods (a short sketch of a few of these follows the list):
- deletion
- constant imputation
- LOCF - last observation carried forward
- NOCB - next observation carried backward
- mean/median/mode
- rolling statistics
- linear interpolation
- spline interpolation
- KNN
- regression
- seasonal-trend decomposition using LOESS (STL)
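A minimal pandas sketch of a few of the simple methods above (toy series, illustrative only):

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, np.nan, np.nan, 5.0, np.nan, 7.0])

locf   = s.ffill()                        # last observation carried forward
nocb   = s.bfill()                        # next observation carried backward
mean_  = s.fillna(s.mean())               # mean imputation
linear = s.interpolate(method="linear")   # linear interpolation
roll   = s.fillna(s.rolling(3, min_periods=1).mean())  # rolling-statistic fill
```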
TimesNet ---> Github - TSLib
TSLib/TimesNet only supports the point-missing pattern: time points are randomly masked at ratios of {12.5%, 25%, 37.5%, 50%}.
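A minimal sketch of this kind of random point masking (my own illustration of the idea, not TSLib's exact code; shapes and names assumed):

```python
import torch

def point_mask(batch_x: torch.Tensor, mask_rate: float = 0.25):
    """Randomly hide individual time points at the given ratio.

    batch_x: (batch, seq_len, n_features). Returns the masked input and a
    binary mask (1 = observed, 0 = masked), which the imputation model
    typically receives alongside the input.
    """
    mask = (torch.rand_like(batch_x) > mask_rate).float()
    return batch_x * mask, mask
```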
Results for Autoformer (Weather dataset)
| Mask Rate | MSE | MAE |
|---|---|---|
| 12.5% | 0.3128 | 0.4111 |
| 25% | 0.3024 | 0.3879 |
| 37.5% | 0.1488 | 0.2562 |
| 50% | 0.1428 | 0.2470 |

More masking → the model sees less observed data but is forced to learn deeper temporal dependencies and structural patterns.
This leads to more robust representations, like how dropout improves generalization by preventing over-reliance on specific inputs.

Results for TimesNet (Weather dataset)
| Masking Ratio | MAE | MSE |
|---|---|---|
| 0.125 | 0.04593 | 0.02517 |
| 0.25 | 0.05506 | 0.02932 |
| 0.375 | 0.05704 | 0.03088 |
| 0.5 | 0.06148 | 0.03413 |

TimesNet performance decreases as the masking ratio increases, whereas Autoformer's (above) improves.
That said, TimesNet performs much better on the TS imputation task overall; it also tops the TSLib leaderboard for this task.
Extension of this paper - Deep TS Models
This paper uses two datasets:
- PhysioNet 2012 (clinical dataset) - since the dataset has no ground truth, 10/50/90% of the observed values in the test data are held out as ground truth, and the corresponding inputs are masked using a Bernoulli distribution.
- Beijing Air Quality - uses the block-missing pattern. Around 13% of the data is already missing. For each missing data point, the same data point from the succeeding month is taken as the ground truth; for example, if 24th Feb is missing, its ground truth is 24th March (sketch below).
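A rough pandas sketch of that next-month ground-truth rule (hypothetical helper, assumes a series with a DatetimeIndex):

```python
import pandas as pd

def ground_truth_from_next_month(series: pd.Series) -> pd.Series:
    """For each missing timestamp, use the value exactly one month later
    (if it exists and is observed) as the evaluation ground truth."""
    gt = {}
    for ts in series[series.isna()].index:
        candidate = ts + pd.DateOffset(months=1)
        if candidate in series.index and pd.notna(series.loc[candidate]):
            gt[ts] = series.loc[candidate]
    return pd.Series(gt)
```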
TSI-Bench ---> Github - Awesome Imputation
TSI-Bench supports all three missing patterns: point, subsequence, and block.

- Transformers in TS - IJCAI
- DL for TSC - MLP, CNN, RNN/ESN, FCN, ResNet, Encoder, MCNN, t-LeNet, MCDCNN, Time-CNN
Voice2Series - ICML - achieves SOTA on 19 datasets ($x_t' = \mathrm{Pad}(x_t) + \delta$) - GitHub
Padding reprogramming, where the padded portion is replaced by a trainable additive vector $\delta = M \odot \theta$ - Aeon library
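A hedged PyTorch sketch of padding reprogramming (not the official Voice2Series code; lengths and names are assumptions): the short input is zero-padded to the target length, and a trainable perturbation $\theta$ is added only on the padded region through a fixed binary mask $M$.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PaddingReprogram(nn.Module):
    def __init__(self, source_len: int, target_len: int):
        super().__init__()
        self.source_len = source_len
        self.target_len = target_len
        self.theta = nn.Parameter(torch.zeros(target_len))  # trainable θ
        mask = torch.ones(target_len)
        mask[:source_len] = 0.0                             # M = 1 only on padded positions
        self.register_buffer("mask", mask)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, source_len) -> x' = Pad(x) + δ, with δ = M ⊙ θ
        x_pad = F.pad(x, (0, self.target_len - self.source_len))
        return x_pad + self.mask * self.theta
```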
A discriminative region is the subsequence of a time series that contains the most informative features for classifying the time series into the correct class.
| Method | How it finds the discriminative region |
|---|---|
| Shapelets | Learns short subsequences that best separate classes (e.g., slant) |
| Saliency/Grad-CAM | Highlights time points where gradients w.r.t. output are strongest |
| Attention models | Learn to focus on regions (middle slant) with highest class-relevance |
| Class activation maps | Show which part of input most influences the predicted class |
| Manual inspection | Plotting and observing differences (used in early literature) |
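Of these, the gradient-based view is the easiest to sketch; a hedged example of a simple saliency map for a differentiable time-series classifier (`model` and shapes are assumptions, not tied to a specific library):

```python
import torch

def saliency(model: torch.nn.Module, x: torch.Tensor, target_class: int) -> torch.Tensor:
    """Gradient magnitude of the target-class logit w.r.t. each time point.

    x: (batch, seq_len). Time points with the largest values are candidate
    discriminative regions.
    """
    x = x.clone().requires_grad_(True)
    logits = model(x)                        # (batch, n_classes)
    logits[:, target_class].sum().backward()
    return x.grad.abs()
```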
The anomaly detection training loop often has no labels.
The model is trained to re-construct normal (non-anomalous) data.
For normal samples, reconstruction error should be low.
For anomalous samples, reconstruction error should be high (the model has only been trained on normal data, so it reconstructs what a normal sample would look like while the original contains the anomaly).
- Calculate the reconstruction error (MSE) between the original input and the model's reconstruction: `score = torch.mean(self.anomaly_criterion(batch_x, outputs), dim=-1)`
- Concatenate all errors into a single array
- Find the threshold percentile (any test sample with a reconstruction error above this threshold, i.e. in the top `anomaly_ratio`% of errors, is flagged as an anomaly): `threshold = np.percentile(combined_energy, 100 - self.args.anomaly_ratio)`
- Flag every test sample whose error exceeds the threshold: `pred = (test_energy > threshold).astype(int)`
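Putting these steps together as a self-contained sketch (NumPy only; `train_errors` / `test_errors` are assumed arrays of per-sample reconstruction errors):

```python
import numpy as np

def flag_anomalies(train_errors: np.ndarray, test_errors: np.ndarray,
                   anomaly_ratio: float = 1.0) -> np.ndarray:
    """Threshold reconstruction errors: the top `anomaly_ratio` percent of all
    observed errors are treated as anomalies (1), the rest as normal (0)."""
    combined_energy = np.concatenate([train_errors, test_errors])
    threshold = np.percentile(combined_energy, 100 - anomaly_ratio)
    return (test_errors > threshold).astype(int)
```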
- Point anomalies (point-based) refer to data points that deviate remarkably from the rest of the data.
- Contextual anomalies (point-based) refer to data points within the expected range of the distribution (in contrast to point anomalies) but deviate from the expected data distribution, given a specific context (e.g., a window).
- Collective anomalies (sequence-based) refer to sequences of points that do not repeat a typical (previously observed) pattern.
- Dive into TS AD - describes many methods
- AnomalyBERT - ICLR - GitHub - processes the time series in patches (small groups of points). Unlike the original Transformer or ViT, it does not use sinusoidal positional encodings or absolute position embeddings; instead, a 1D relative position bias is added to each attention matrix to capture the relative positions between features within a window.
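A minimal sketch of a 1D relative position bias (my own illustration, not AnomalyBERT's implementation): a learnable table indexed by the offset $i - j$ is added to the attention scores of each head.

```python
import torch
import torch.nn as nn

class RelPosBias1D(nn.Module):
    def __init__(self, window_len: int, num_heads: int):
        super().__init__()
        # one learnable bias per possible offset in [-(L-1), L-1] and per head
        self.bias_table = nn.Parameter(torch.zeros(2 * window_len - 1, num_heads))
        idx = torch.arange(window_len)
        rel = idx[None, :] - idx[:, None] + window_len - 1   # (L, L), values in [0, 2L-2]
        self.register_buffer("rel_idx", rel)

    def forward(self, attn_scores: torch.Tensor) -> torch.Tensor:
        # attn_scores: (batch, heads, L, L), pre-softmax
        bias = self.bias_table[self.rel_idx]                  # (L, L, heads)
        return attn_scores + bias.permute(2, 0, 1).unsqueeze(0)
```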
Training

| Aspect | Short-Term | Long-Term | Difference |
|---|---|---|---|
| Time features | ❌ Not used | ✅ Uses `batch_x_mark` and `batch_y_mark` | Short-term often uses the raw series only |
| Model call | `self.model(batch_x, None, dec_inp, None)` | `self.model(batch_x, batch_x_mark, dec_inp, batch_y_mark)` | Long-term uses full context |
| Loss calculation | `criterion(batch_x, freq_map, outputs, batch_y, batch_y_mark)` + optional sharpness loss | `criterion(outputs, batch_y)` | Short-term loss may include frequency/temporal sharpness terms |
| Sharpness regularization | ✅ `MSE(output diffs, target diffs)`, optional | ❌ Not applied | Unique to the short-term variant |
| Use of frequency map | ✅ Passed to the loss (frequency-aware loss function) | ❌ Not used in long-term training | Short-term focuses on frequency |
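For context, the long-term branch's `dec_inp` is commonly built by concatenating the known label segment with zeros as placeholders for the horizon; a hedged sketch of that TSLib-style pattern (argument names assumed):

```python
import torch

def build_dec_inp(batch_y: torch.Tensor, label_len: int, pred_len: int) -> torch.Tensor:
    """Decoder input = last `label_len` known steps followed by zeros for the
    `pred_len` steps to be forecast. batch_y: (batch, label_len + pred_len, n_features)."""
    zeros = torch.zeros_like(batch_y[:, -pred_len:, :])
    return torch.cat([batch_y[:, :label_len, :], zeros], dim=1)
```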
Validation

| Feature | Short-Term | Long-Term |
|---|---|---|
| # of test samples | 1 (last training slice) | Many (rolling across the test set) |
| Loop over batches | ❌ | ✅ |
| Decoder input | Single sample | Reconstructed per batch |
| Time marks used | ❌ | ✅ |
| Inverse scaling | Optional, less common | Common in scaled datasets |
| Evaluation metrics | Often skipped | Full set + DTW |
Statistical methods for forecasting - Paper
- Simple Exponential Smoothing
- Holt's method (Double Exponential Smoothing)
- Holt-Winters' method (Triple Exponential Smoothing)
- Holt-Winters' method with multiplicative seasonality
- Holt-Winters' method with additive seasonality (see the statsmodels sketch below)
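A small statsmodels sketch of these variants (toy positive series; `seasonal_periods=24` is an assumed seasonality):

```python
import numpy as np
from statsmodels.tsa.holtwinters import SimpleExpSmoothing, ExponentialSmoothing

t = np.arange(240)
y = 10 + 0.05 * t + 2 * np.sin(2 * np.pi * t / 24)   # positive series with trend + seasonality

ses  = SimpleExpSmoothing(y).fit()                     # simple exponential smoothing
holt = ExponentialSmoothing(y, trend="add").fit()      # Holt's method (double)
hw_a = ExponentialSmoothing(y, trend="add", seasonal="add",
                            seasonal_periods=24).fit() # Holt-Winters', additive seasonality
hw_m = ExponentialSmoothing(y, trend="add", seasonal="mul",
                            seasonal_periods=24).fit() # Holt-Winters', multiplicative seasonality

print(hw_a.forecast(12))   # forecast the next 12 steps
```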





