Introduction¶

A python Toolbox for exploratory time series data analysis

What is it for?¶

The toolbox is designed for exploratory data analysis of Ecological Momentary Assesment (EMA), Experience Sampling Methods (ESM), and Digital Phenotyping data. The toolbox enables the inspection of time series interaction and dynamics by providing methods for summary statistics, trend and periodicity evaluation, and linear and nonlinear dependency assessment. The emphasis of the analysis is on the visualizations.

The toolbox has simple user interface, having a single configuration file for filenames, paths, and variables used in the analysis.

For the toolbox testing purposes, anomymized real-life test datasets are availabe at the GitHub repository.

Main features¶

The toolbox features include:

Summary statistics
Rolling windows statistics
Time series decomposition
Similarity analysis
- Similarity plot
- Novelty score
- Stability index
Clustering

Installation¶

Pip install:

pip install tscfat

You may also clone the project.

The source code is currently hosted on GitHub at: <https://github.com/ArgonSilicon/tscfat/>

Dependencies¶

The project dependencies:

Pandas
Numpy
Matplotlib
Statsmodels
sklearn
tslearn
nolds
pytest
seaborn

Usage example¶

After cloning, make sure that pipenv is installed:

pip install pipenv

CD into tscfat root folder and activate the virtual environment:

pipenv install

tscfat/Examples folder contain a file conf.py. Open it in a text editor and replace following paths.

Fill in the correct path for output directory:

# The directory where output figures are stored
OUTPUT_DIR = Path(' ... /tscfat/Results') # <- replace with correct path!

Fill in the correct path for data loading:

# Path to the data file to be imported
CSV_PATH = Path(' ... /tscfat/Data/one_subject_data.csv') # <- replace with the correct path!

Make sure that the tscfat/Results folder contains the following subfolders:

tscfat
└── Results
    ├── Clustering
    ├── Decomposition
    ├── RollingStatistics
    ├── Similarity
    ├── Summary
    ├── Timeseries
    └── setup.py

While in tscfat root folder the example file:

pipenv run python ./Examples/example_one_subject.py

Each analysis function can be used independently. Functions assume that the input data is expected to be in a CSV file, using the following format:

Time	Y_1	Y_2	X_1	X_2	…	X_n
1472677200	3	5	56	0.1	…	0.56
1472763600	4	3	47	0.1	…	0.41
:	:	:	:	:	:	:
1478037600	4	2	99	0.2	…	0.71

The Time column contains timestamps in [unix] format.
Rest of the columns contain observations, which should be in numerical format. Each column represents one variable, rows correspond to the sampling timepoint.

For more examples and usage, please refer to the [Docs].

Release history¶

0.0.1
- Initial version, WIP

Contributing¶

Fork it (<https://github.com/ArgonSilicon/tscfat/fork>)
Create your feature branch (git checkout -b feature/fooBar)
Commit your changes (git commit -am ‘Add some fooBar’)
Push to the branch (git push origin feature/fooBar)
Create a new Pull Request