tscfat.Analysis package
Submodules
tscfat.Analysis.calculate_novelty module
Created on Thu Jul 2 13:54:12 2020
Function for time series novelty score calculation. The function requires a similarity matrix and calculates the novelty score with a sliding-window Gaussian kernel. For further reference, see: https://www.audiolabs-erlangen.de/resources/MIR/FMP/C4/C4S4_NoveltySegmentation.html
-
tscfat.Analysis.calculate_novelty.compute_novelty(simmat, edge=7, sigma=1.0, mu=0.0)
Compute the novelty score by convolving a Gaussian checkerboard kernel along the diagonal of the self-similarity matrix.
- Parameters
simmat (numpy ndarray) – N x N self similarity matrix.
edge (float, optional) – Half of the Gaussian kernel window length. The default is 7.
sigma (float, optional) – Variance for the Gaussian kernel construction. The default is 1.0.
mu (float, optional) – Mean for the Gaussian kernel construction. The default is 0.0.
- Returns
nov (numpy ndarray) – 1D novelty score vector.
kernel (numpy ndarray) – 2D gaussian convolution kernel.
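The checkerboard convolution described above can be sketched in plain NumPy. This is an illustrative sketch in the spirit of the referenced FMP notebook, not the tscfat implementation: the names gaussian_checkerboard_kernel and novelty_sketch are hypothetical, edge is assumed to be a positive integer, sigma is assumed to scale the Gaussian taper relative to the kernel half-width, and the mu parameter is left out.

```python
import numpy as np


def gaussian_checkerboard_kernel(edge=7, sigma=1.0):
    """Build a (2*edge+1) x (2*edge+1) checkerboard kernel with a Gaussian taper."""
    axis = np.arange(-edge, edge + 1)
    # Gaussian taper on an axis normalised by the half-width (assumption of this sketch).
    taper = np.exp(-0.5 * (axis / (edge * sigma)) ** 2)
    gauss_2d = np.outer(taper, taper)
    # +1 in the two quadrants on the main diagonal, -1 in the other two,
    # 0 on the central row and column.
    checkerboard = np.outer(np.sign(axis), np.sign(axis))
    kernel = checkerboard * gauss_2d
    # Normalise so that the absolute kernel weights sum to one.
    return kernel / np.sum(np.abs(kernel))


def novelty_sketch(simmat, edge=7, sigma=1.0):
    """Slide the kernel along the diagonal of an N x N self-similarity matrix."""
    n = simmat.shape[0]
    kernel = gaussian_checkerboard_kernel(edge, sigma)
    # Zero-pad so the kernel can be centred on every diagonal element.
    padded = np.pad(simmat, edge, mode="constant")
    nov = np.zeros(n)
    for i in range(n):
        window = padded[i:i + 2 * edge + 1, i:i + 2 * edge + 1]
        nov[i] = np.sum(window * kernel)
    return nov, kernel
```

With a block-diagonal similarity matrix the score peaks at the boundary between the blocks and stays near zero inside a homogeneous block. Because of the zero padding, the first and last edge positions also receive a nonzero score, so values near the ends of the series should be read with care.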
tscfat.Analysis.calculate_similarity module
Created on Thu Jul 2 12:28:09 2020
@author: arsi
Functions for distance matrix and similarity matrix calculation. SciPy's pdist function is used for the calculation. Full reference: https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.pdist.html
-
tscfat.Analysis.calculate_similarity.calculate_distance(X, metric='Euclidean')
Calculate a distance matrix.
- Parameters
X (Numpy ndarray) – An m by n array of m original observations in an n-dimensional space.
metric (str or function, optional) – Distance metric. The default is “Euclidean”.
- Returns
Y_square – Returns a condensed distance matrix Y.
- Return type
Numpy ndarray
-
tscfat.Analysis.calculate_similarity.calculate_similarity(X, metric='Euclidean')
Calculate a similarity matrix.
- Parameters
X (Numpy ndarray) – An m by n array of m original observations in an n-dimensional space.
metric (str or function, optional) – Distance metric. The default is “Euclidean”.
- Returns
Y_sim – Returns a similarity matrix Y.
- Return type
Numpy ndarray
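As a rough illustration of the pdist-based workflow, the sketch below computes a condensed distance vector, expands it to a square matrix, and converts it to similarities. The conversion used here (one minus the distance scaled by its maximum) is an assumption chosen for the example; this page does not state which transformation tscfat applies.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Three observations in a two-dimensional space (m = 3, n = 2).
X = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])

# pdist returns the condensed form: m * (m - 1) / 2 pairwise distances.
condensed = pdist(X, metric="euclidean")

# squareform expands the condensed vector into a symmetric m x m matrix
# with zeros on the diagonal.
dist = squareform(condensed)

# Assumed conversion for this sketch: scale distances to [0, 1] and invert,
# so identical observations get similarity 1 and the farthest pair gets 0.
sim = 1.0 - dist / dist.max()
```

The square form is the one that a diagonal-based analysis such as a novelty score needs, since it indexes observations along both axes.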
tscfat.Analysis.calculate_stability module
Created on Fri Feb 26 09:57:53 2021
@author: arsi
Calculate a rolling window stability index value for a given time series. The function requires a similarity matrix for the calculation. Reference for the stability index: https://www.nature.com/articles/s41537-020-00123-2
-
tscfat.Analysis.calculate_stability.compute_stability(simmat, edge=7)
Calculate the stability index for a given similarity matrix that represents a time series.
- Parameters
simmat (np.array) – Similarity / self-similarity matrix
edge (int, optional) – Window size used for stability calculation. The default is 7.
- Returns
stability – An array containing the rolling stability index values.
- Return type
np.array
tscfat.Analysis.cluster_timeseries module
Created on Thu Dec 17 12:29:57 2020
@author: arsii
Functions for time series clustering and for cluster visualization. A plot decorator is used to handle image saving.
-
tscfat.Analysis.cluster_timeseries.cluster_timeseries(ts, FIGNAME, FIGPATH, title='Clustered timeseries', n=3, mi=5, mib=5, rs=0, metric='dtw', highlight=None, ylim_=None)
Cluster time series given as a NumPy array. The function uses tslearn TimeSeriesKMeans. For full reference, see: https://tslearn.readthedocs.io/en/stable/gen_modules/clustering/tslearn.clustering.TimeSeriesKMeans.html
- Parameters
ts (numpy array) – An m x n matrix containing the data points
FIGNAME (str) – Figure savename
FIGPATH (path object) – Figure savepath
title (str) – Figure title
n (int, optional) – Number of clusters. The default is 3.
mi (int, optional) – Maximum number of iterations for the algorithm. The default is 5.
mib (int, optional) – Number of iterations used for the barycenter calculation. The default is 5.
rs (int, optional) – A random state used to initialize the centers. The default is 0.
metric (str, optional) – Metric used for the cluster assignment. The default is “dtw”.
highlight (TYPE, optional) – DESCRIPTION. The default is None
ylim_ (tuple, optional) – Tuple containing the y-limit values. The default is None.
- Returns
labels – An array containing the assigned cluster labels.
- Return type
numpy array
tscfat.Analysis.decompose_timeseries module
Created on Wed Jul 1 14:40:46 2020
@author: arsi
Calculate the STL decomposition for a given time series and plot the components. The decomposition is based on the statsmodels STL decomposition. Full reference: https://www.statsmodels.org/devel/generated/statsmodels.tsa.seasonal.STL.html
-
tscfat.Analysis.decompose_timeseries.STL_decomposition(series, title, test=False, savepath=False, savename=False, ylabel='Battery Level (%)', xlabel='Date', dates=False)
Decompose a time series into Model, Trend, Seasonal and Residual parts. Plot the components and their distributions. Optionally save the figure.
- Parameters
series (Numpy ndarray) – Time series to be decomposed
title (str) – Figure title.
savepath (Path object, optional) – Figure save path. The default is False.
savename (str, optional) – Figure save name. The default is False.
ylabel (str, optional) – Figure ylabel. The default is “Battery Level (%)”.
xlabel (str, optional) – Figure xlabel. The default is “Date”.
dates (array, optional) – List of dates to be highlighted in the figure. The default is False.
- Raises
Exception – Raised if the given series is not a NumPy array.
- Returns
Result – Object containing the decomposition results.
- Return type
statsmodels.tsa.seasonal.DecomposeResult object
tscfat.Analysis.degree_of_distribution module
Created on Fri Oct 9 13:57:14 2020
@author: arsii
Calculate a distribution degree D for given timeseries. D measures the scattering of the time series values within the range of possible values. For the reference: Schiepek, Günter, and Guido Strunk. “The identification of critical fluctuations and phase transitions in short term and coarse-grained time series—a method for the real-time monitoring of human change processes.” Biological cybernetics 102.3 (2010): 197-207.
-
tscfat.Analysis.degree_of_distribution.distribution_degree(y, scale, window)
Calculate the distribution degree for a given time series.
- Parameters
y (numpy array) – A time series
scale (int) – Fluctuation scale: abs(max value - min value)
window (int) – A window for calculation
- Returns
D – Calculated distribution degree.
- Return type
float
tscfat.Analysis.fluctuation_intensity module
Created on Thu Oct 8 13:02:27 2020
@author: arsii
Calculate a fluctuation intensity F for given timeseries. F is sensitive to amplitude and frequency changes in time signal. For the reference: Schiepek, Günter, and Guido Strunk. “The identification of critical fluctuations and phase transitions in short term and coarse-grained time series—a method for the real-time monitoring of human change processes.” Biological cybernetics 102.3 (2010): 197-207.
-
tscfat.Analysis.fluctuation_intensity.fluctuation_intensity(y, scale, window)
Calculate the fluctuation intensity for the given time series.
- Parameters
y (numpy array) – A time series
scale (int) – Fluctuation scale: abs(max value - min value)
window (int) – A window for calculation
- Returns
F – Calculated fluctuation intensity.
- Return type
float
tscfat.Analysis.plot_similarity module
Created on Mon Jul 6 14:07:02 2020
@author: arsi
Plot and save self similarity matrix, convolution kernel and novelty score.
-
tscfat.Analysis.plot_similarity.plot_similarity(sim, nov, stab, title='Similarity and novelty', doi=None, savepath=False, savename=False, ylim=(0, 0.05), threshold=0, axis=None, kernel=False, test=False)
Plot the similarity matrix. Optionally save the figure, plot the kernel, and plot the similarity score.
- Parameters
sim (Numpy ndarray) – m x m array containing similarity values
nov (Numpy ndarray) – m x 1 array containing novelty scores
stab (Numpy ndarray) – m x 1 array containing stability scores
doi (tuple) – (float, float) values used to highlight a region of interest.
title (str, optional) – Similarity plot title. The default is “Similarity and novelty”.
savepath (Path object, optional) – Path for figure saving. The default is False.
savename (str object, optional) – Savename for the figure. The default is False.
ylim (tuple, optional) – (float,float) ylimits for the plot. The default is (0,0.05).
threshold (float, optional) – Similarity score threshold for showing in the plot. The default is 0.
axis (pandas.core.indexes.base.Index, optional) – Date range used in the novelty score plot. The default is None.
kernel (Numpy ndarray, optional) – m x m convolution kernel used for novelty score calculation. The default is False.
test (boolean) – Indicates whether the function is tested by pytest. The default is False.
- Raises
Exception – Raised in any of the following cases:
Requested save folder does not exist
Savepath and/or savename are not given
Novelty score is not a numpy array
Stability score is not a numpy array
- Returns
- Return type
None.
tscfat.Analysis.plot_timeseries module
Created on Thu Mar 18 15:44:59 2021
@author: arsi
Function for plotting dataframe columns containing the timeseries.
-
tscfat.Analysis.plot_timeseries.plot_timeseries(data, columns, title, roll=False, xlab='Time', ylab='Value', ylim=False, savename=False, savepath=False, highlight=False, test=False)
Plot the selected columns of the given dataframe. The dataframe index should be a datetime index.
- Parameters
data (pandas dataframe) – Pandas dataframe containing the timeseries.
columns (list) – A list of strings, containing the column names.
title (str) – Figure name.
roll (int, optional) – Rolling window length. The default is False.
xlab (str, optional) – Figure x-label. The default is “Time”.
ylab (str, optional) – Figure y-label. The default is “Value”.
ylim (tuple, optional) – (float, float) ylimit for the figure. The default is False.
savename (str, optional) – Figure savename. The default is False.
savepath (path, optional) – Figure savepath. The default is False.
highlight (tuple, optional) – Tuple containing the start and end point for the region highlighting. The default is False.
test (bool, optional) – Indicates whether the function is tested by pytest. The default is False.
- Returns
fig – A figure containing the plotted timeseries.
- Return type
matplotlib figure
tscfat.Analysis.rolling_statistics module
Created on Fri Dec 18 13:45:10 2020
@author: arsii
Calculate rolling windows statistics for the given time series and plot them.
- The following are calculated using a rolling window of length (n):
Average
Variance
Autocorrelation
Mean square of successive differences (MSSD)
Probability of acute change (PAC)
-
tscfat.Analysis.rolling_statistics.rolling_statistics(ts, w, doi=None, savename=False, savepath=False, test=False)
Calculate and plot several rolling statistics.
- Parameters
ts (pandas dataframe) – A dataframe containing time as index and one column of data
w (int) – Rolling statistics window size
doi (tuple) – A tuple containing tuples of dates. The default is None.
savename (str, optional) – Name used as the plot save name. Must be a str. The default is False.
savepath (Path object, optional) – Path where the plot is saved. The path has to exist before calling this function. The default is False.
test (Boolean, optional) – Flag for test function. The default is False.
- Raises
Exception – Raised in any of the following cases:
The given time series is not a pandas dataframe
The given window size is not an integer
The given window length is larger than the time series length
- Returns
- Return type
None, or a matplotlib.pyplot figure if test is True.
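The five rolling statistics listed for this module can be sketched with NumPy alone. This is a minimal illustration, not the tscfat implementation: the function name rolling_stats_sketch and the pac_threshold argument are hypothetical, the autocorrelation is taken at lag 1, and PAC is assumed to be the share of successive absolute changes that exceed the threshold.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_stats_sketch(values, w, pac_threshold=1.0):
    """Return rolling statistics for every full window of length w."""
    values = np.asarray(values, dtype=float)
    if not isinstance(w, int):
        raise TypeError("window size must be an integer")
    if w < 2 or w > values.size:
        raise ValueError("window size must be between 2 and the series length")

    # One row per window position: shape (len(values) - w + 1, w).
    windows = sliding_window_view(values, w)
    # Successive differences inside each window.
    diffs = np.diff(windows, axis=1)

    # Lag-1 autocorrelation per window (NaN for constant windows).
    centred = windows - windows.mean(axis=1, keepdims=True)
    num = np.sum(centred[:, 1:] * centred[:, :-1], axis=1)
    den = np.sum(centred ** 2, axis=1)
    with np.errstate(invalid="ignore", divide="ignore"):
        autocorr = num / den

    return {
        "average": windows.mean(axis=1),
        "variance": windows.var(axis=1, ddof=1),
        "autocorrelation": autocorr,
        # Mean square of successive differences.
        "mssd": np.mean(diffs ** 2, axis=1),
        # Share of successive changes larger than the threshold (assumed definition).
        "pac": np.mean(np.abs(diffs) > pac_threshold, axis=1),
    }
```

Each returned array has one value per full window, so it is w - 1 elements shorter than the input; aligning the values with dates is left to the caller.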
tscfat.Analysis.summary_statistics module
Created on Mon Dec 21 13:56:55 2020
@author: ikaheia1
Calculate the following summary statistics for the given timeseries and plot the results:
Histogram
Lag plot with lag 1
Autocorrelation
Partial autocorrelation function
Autocorrelation function
-
tscfat.Analysis.summary_statistics.summary_statistics(series, title='Time series summary', window=14, savepath=False, savename=False, test=False)
Calculate summary statistics for the given time series.
- Parameters
series (Pandas Series) – A time series for which the summary is calculated
title (str, optional) – Summary plot title. The default is “Time series summary”.
window (int) – Rolling window size. The default is 14.
savepath (Path object, optional) – Figure save path. The default is False.
savename (Path object, optional) – Figure save name. The default is False.
test (Boolean, optional) – Flag for test function. The default is False.
- Returns
- Return type
None.
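Two of the quantities behind the summary plots, the lag-1 pairs of a lag plot and the sample autocorrelation function, can be sketched as below. The helper names lag_pairs and acf_sketch are hypothetical and the code only illustrates the definitions; it is not taken from tscfat, which may rely on library routines instead.

```python
import numpy as np


def lag_pairs(series, lag=1):
    """Return (x_t, x_{t+lag}) pairs, i.e. the points drawn in a lag plot."""
    x = np.asarray(series, dtype=float)
    return x[:-lag], x[lag:]


def acf_sketch(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (biased estimator)."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.sum(x ** 2)
    # Each lag uses the overlap of the series with a shifted copy of itself.
    return np.array(
        [np.sum(x[lag:] * x[: x.size - lag]) / denom for lag in range(max_lag + 1)]
    )
```

The value at lag 0 is always 1, and a slowly decaying sequence of positive values at higher lags points to a trend or strong persistence in the series.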