**Please note that this routine is in the alpha testing stage, and its functionality, including inputs and outputs, is expected to change (as of 30/10/2023).**
Anomaly detection of parallel measurements with anticipated gaps and irregular sampling in the time domain. Patterns in the data are found and reconstructed as frequencies using singular value decomposition, and statistical distances are measured relative to the transformed trigonometric polynomial in the time domain.
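As a rough illustration of the low-rank step described above, the sketch below reconstructs a complex matrix from its leading singular components using plain NumPy. The function name `low_rank_reconstruct`, the matrix sizes, and the synthetic data are assumptions made for this example only; this is not the routine's own implementation.

```python
import numpy as np

def low_rank_reconstruct(F, k=2):
    """Return the rank-k truncated-SVD approximation of a (complex) matrix F."""
    U, s, Vh = np.linalg.svd(F, full_matrices=False)
    # Keep only the k leading singular triplets.
    return (U[:, :k] * s[:k]) @ Vh[:k, :]

rng = np.random.default_rng(0)
N, L = 16, 5  # hypothetical sizes: N frequencies, L time-series variables

# Build an exactly rank-2 complex matrix, then perturb it with small noise.
A = rng.standard_normal((N, 2)) + 1j * rng.standard_normal((N, 2))
B = rng.standard_normal((2, L)) + 1j * rng.standard_normal((2, L))
F_clean = A @ B
noise = rng.standard_normal((N, L)) + 1j * rng.standard_normal((N, L))
F_noisy = F_clean + 0.01 * noise

F_k = low_rank_reconstruct(F_noisy, k=2)
print(F_k.shape)
```

Truncating to `k` components keeps the patterns shared across the columns and discards most of the column-specific noise, which is why a small `k` is a natural choice when the parallel measurements are expected to behave similarly.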
For a monotonically increasing time series
Within a matrix of complex values representing the frequencies for each time series measurement, as
Where
There are two main functions, `digest_csv` and `validate_data`. The `plot_results` function offers a basic usage example.
Inputs to `digest_csv`:

- `file.csv`: a second-order (two-dimensional) array of $M$ monotonically increasing time measurements and $O$ stations. The first column is expected to hold the time measurements, and the first row of the CSV typically contains the names of the time-series variables.
- `nan_maker` (default=-9999): an `int` value giving the placeholder used to indicate missing data.
Outputs of `digest_csv`:

- `X`: an $M \times L$ numpy array of $M$ measurements and $L$ time-series variables. Missing data is labelled as `np.nan`.
- `Y`: an $M \times L$ numpy array of $M$ time points that correspond to the $L$ time-series variables. Missing data is indicated as `np.nan`.
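The input and output conventions listed here can be sketched with NumPy alone. The helper below, `load_with_placeholder`, is a hypothetical stand-in written for illustration and is not the package's `digest_csv`. It assumes a comma-delimited file with one header row, and it assumes that the time array repeats the time column once per variable and is set to `np.nan` wherever the measurement is missing.

```python
import io
import numpy as np

def load_with_placeholder(source, placeholder=-9999):
    """Read a CSV (time in column 0, header in row 0) into NaN-masked arrays."""
    data = np.atleast_2d(
        np.genfromtxt(source, delimiter=",", skip_header=1, dtype=float)
    )
    t = data[:, 0]
    X = data[:, 1:].copy()
    X[X == placeholder] = np.nan          # mark missing measurements
    Y = np.tile(t[:, None], (1, X.shape[1]))
    Y[np.isnan(X)] = np.nan               # assumption: mask times the same way
    return X, Y

# Small in-memory example with irregular sampling and gaps.
csv_text = io.StringIO(
    "time,station_a,station_b\n"
    "0.0,1.5,-9999\n"
    "1.0,-9999,2.5\n"
    "2.5,3.0,4.0\n"
)
X_demo, Y_demo = load_with_placeholder(csv_text)
print(X_demo)
print(Y_demo)
```

Keeping the measurements and their time stamps in two arrays of the same shape means a single NaN mask describes which samples exist for every variable.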
Inputs to `validate_data`:

- `X`: from the output of `digest_csv`.
- `Y`: from the output of `digest_csv`.
- `k` (default=2): the number of components in the SVD model.
- `kernel` (default=sobolev): the type of kernel used in the weighted non-uniform interpolative inverse fast Fourier transform.
- `verbose` (default=True): controls the terminal output of the model.
Outputs of `validate_data`:

- `Xpred`: an $M \times L$ numpy array containing the interpolative function for the $o^{th}$ time-series variable.
- `Xpvls`: an $M \times L$ numpy array containing the p-values for the $o^{th}$ time-series variable where the data was originally measured. Missing data is indicated as `np.nan`.
- `Fkr`: the reconstructed $N \times L$ complex numpy array containing the calculated frequencies for the $o \in L$ time-series variables.
- `X`: the original data from the output of `digest_csv`.
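One way to act on an array of p-values that contains `np.nan` for unmeasured points is sketched below. The helper `flag_anomalies`, the threshold `alpha=0.05`, and the demo array are assumptions for illustration. The sketch also assumes that small p-values mark anomalous measurements, which should be checked against the routine's actual behaviour.

```python
import numpy as np

def flag_anomalies(pvals, alpha=0.05):
    """Return a boolean mask that is True where an observed p-value is below alpha."""
    observed = ~np.isnan(pvals)
    flags = np.zeros(pvals.shape, dtype=bool)
    # Compare only observed entries so missing data is never flagged.
    flags[observed] = pvals[observed] < alpha
    return flags

# Hypothetical p-values for 3 time points and 2 variables.
pvals_demo = np.array([
    [0.50, np.nan],
    [0.01, 0.20],
    [np.nan, 0.03],
])
flags_demo = flag_anomalies(pvals_demo)
print(flags_demo)
```

Masking before comparing keeps missing entries out of the result explicitly, rather than relying on how NaN comparisons evaluate.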
Michael Sorochan Armstrong ([email protected]) and José Camacho Páez ([email protected]) from the Computational Data Science Lab (CoDaS) at the University of Granada. Please note that the software is provided "as is" and we do not accept any responsibility or liability. Should you find any bugs or have suggestions, please contact the authors. For copyright information, please see the license file.
In progress: please see `requirements.txt` for a list of dependencies. Requires the `intrp_infft_1d` package, available at https://github.com/mdarmstr/intrp_infft_1d