API#

This page documents the classes and methods contained in the tsx package.

Models#

class tsx.models.base.NeuralNetRegressor[source]#

Regression wrapper for scikit-learn-like PyTorch training

Parameters:

module – A PyTorch model of type torch.nn.Module
random_state (optional) – Seed training with either a fixed seed or None. Defaults to None.
max_epochs (optional) – How long to train for. Defaults to 10.
device (optional) – Indicate where the model should be trained. If None, it chooses the fastest available option automatically. Defaults to None.
lr (optional) – Set learning rate. Defaults to 2e-3.
batch_size (optional) – Training batch size. Defaults to 32.
verbose (optional) – Print status updates to the console. Defaults to False.
callbacks (optional) – Skorch callback list used for training. Defaults to None.
**kwargs – Optional keyword arguments for skorch.NeuralNetRegressor

class tsx.models.base.NeuralNetClassifier[source]#

Classification wrapper for scikit-learn-like PyTorch training

Parameters:

module – A PyTorch model of type torch.nn.Module
random_state (optional) – Seed training with either a fixed seed or None. Defaults to None.
max_epochs (optional) – How long to train for. Defaults to 10.
device (optional) – Indicate where the model should be trained. If None, it chooses the fastest available option automatically. Defaults to None.
lr (optional) – Set learning rate. Defaults to 2e-3.
batch_size (optional) – Training batch size. Defaults to 32.
verbose (optional) – Print status updates to the console. Defaults to False.
callbacks (optional) – Skorch callback list used for training. Defaults to None.
**kwargs – Optional keyword arguments for skorch.NeuralNetClassifier

class tsx.models.sdt.SoftDecisionTreeClassifier[source]#

Soft Decision Tree, configured as a classifier

Parameters:

n_features – Number of input features
depth – Fixed depth of the tree

predict(X)#

Predict function

Parameters: X (torch.Tensor) – Input tensor of size (batch_size, n_features)

class tsx.models.sdt.SoftEnsembleClassifier[source]#

Ensemble of Soft Decision Trees for classification

Parameters:

n_trees – Number of ensemble members
n_features – Number of input features
depth – Fixed depth of the tree

predict(X)[source]#

Predict function

Parameters: X (torch.Tensor) – Input tensor of size (batch_size, n_features)

class tsx.models.sdt.SoftDecisionTreeRegressor[source]#

Soft Decision Tree, configured as a regressor

Parameters:

n_features – Number of input features
depth – Fixed depth of the tree

predict(X)#

Predict function

Parameters: X (torch.Tensor) – Input tensor of size (batch_size, n_features)

class tsx.models.sdt.SoftEnsembleRegressor[source]#

Ensemble of Soft Decision Trees for regression

Parameters:

n_trees – Number of ensemble members
n_features – Number of input features
depth – Fixed depth of the tree

predict(X)[source]#

Predict function

Parameters: X (torch.Tensor) – Input tensor of size (batch_size, n_features)

class tsx.models.forecaster.ospgsm.OS_PGSM(pool, L, context_size, detect_concept_drift=True, threshold=0.5, min_roc_size=2, random_state=0)[source]#

Improved version of the OS-PGSM algorithm originally presented in Saadallah, A., Jakobs, M., Morik, K. (2021). Explainable Online Deep Neural Network Selection Using Adaptive Saliency Maps for Time Series Forecasting. In: Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2021. https://doi.org/10.1007/978-3-030-86486-6_25

Parameters:

pool (List of nn.Module) – List of pretrained neural network models. Each model must contain a feature_extractor and forecaster submodule for OS-PGSM to work.
L (int) – The amount of lag used to train the pool models
context_size (int) – Size of the chunks from which Regions of Competence are created
detect_concept_drift (bool) – Whether or not to enable concept drift detection
threshold (float) – Minimum value indicating when a step is salient enough to be part of the Region of Competence
min_roc_size (int) – RoC members smaller than this value are not added to the RoC
random_state (int) – Random state seeding the run method

run(X_val, X_test)[source]#

Main method for running the prediction

Parameters:

X_val (torch.Tensor) – Univariate validation time series from which the Regions of Competence are created
X_test (torch.Tensor) – Univariate test time series which should be forecasted

Returns:

Prediction tensor with the same length as X_test

class tsx.models.forecaster.ospgsm.OEP_ROC(pool, L, context_size, context_step, nr_clusters_ensemble=15, dist_fn=<function euclidean>, detect_concept_drift=True, threshold=0.1, min_roc_size=2, random_state=0)[source]#

Improved version of the OEP-ROC algorithm originally presented in Saadallah, A., Jakobs, M. & Morik, K. Explainable online ensemble of deep neural network pruning for time series forecasting. Mach Learn 111, 3459–3487 (2022)

Parameters:

pool (List of nn.Module) – List of pretrained neural network models. Each model must contain a feature_extractor and forecaster submodule for OEP-ROC to work.
L (int) – The amount of lag used to train the pool models
context_size (int) – Size of the chunks from which Regions of Competence are created
context_step (int) – Step size between the context_size chunks
nr_clusters_ensemble (int) – How many cluster centers to use
dist_fn (callable) – A distance function defined on two time series windows
detect_concept_drift (bool) – Whether or not to enable concept drift detection
threshold (float) – Minimum value indicating when a step is salient enough to be part of the Region of Competence
min_roc_size (int) – RoC members smaller than this value are not added to the RoC
random_state (int) – Random state seeding the run method

run(X_val, X_test)[source]#

Main method for running the prediction

Parameters:

X_val (torch.Tensor) – Univariate validation time series from which the Regions of Competence are created
X_test (torch.Tensor) – Univariate test time series which should be forecasted

Returns:

Prediction tensor with the same length as X_test

Datasets#

Monash Forecasting Repository#

tsx.datasets.monash.possible_datasets()[source]#

Returns a list of possible dataset names

tsx.datasets.monash.load_monash(dataset: str, return_pytorch: bool = False, return_numpy: bool = False, return_horizon: bool = False)[source]#

Loads datasets from Monash Time Series Forecasting Repository.

Parameters:

dataset – Name of the dataset to be downloaded. Consists of the name of the dataset as well as the “version” of the dataset separated by an underscore.
return_horizon – Each dataset has a specific forecast horizon. If True, the horizon is returned as well.
return_pytorch – Returns dataset as a PyTorch tensor. Throws error if not possible.
return_numpy – Returns dataset as a numpy array. Throws error if not possible.

Jena Climate Dataset#

tsx.datasets.jena.load_jena(full_features: bool = False, resample: str = '60T', return_numpy: bool = False, return_pytorch: bool = False)[source]#

Returns the Jena Climate 2009 - 2016 dataset

Parameters:

full_features – return all features (True) or only a selection of informative features (False)
resample – string in pandas resample notation
return_numpy – returns dataset as a numpy array
return_pytorch – returns dataset as a pytorch tensor

Utilities#

tsx.datasets.utils.windowing(x: numpy.ndarray | torch.Tensor, L: int, z: int = 1, H: int = 1, use_torch: bool = False)[source]#

Create sliding windows from input x

Parameters:

x – Input time series
L – Amount of lag to use
H – Forecast horizon
z – Step length
use_torch – Whether to return np.ndarray or torch.Tensor

Returns:

Windowed X and y, either as a Numpy array or PyTorch tensor
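
The sliding-window idea behind the L, H and z parameters can be sketched in plain Python. The helper `sliding_windows` below is hypothetical and not part of tsx; the exact alignment and return shapes of `tsx.datasets.utils.windowing` may differ, so treat this as an illustration only.

```python
def sliding_windows(x, L, z=1, H=1):
    """Hypothetical helper (not part of tsx): cut x into (lag, horizon) pairs."""
    X, y = [], []
    # The last valid start index must leave room for L inputs plus H targets.
    for start in range(0, len(x) - L - H + 1, z):
        X.append(x[start:start + L])             # L lagged input values
        y.append(x[start + L:start + L + H])     # the H values that follow them
    return X, y


series = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
X, y = sliding_windows(series, L=3, z=2, H=2)
print(X)  # input windows
print(y)  # matching forecast targets
```

With a step length of z=2, consecutive windows start two positions apart, so neighbouring windows overlap whenever z is smaller than L.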

tsx.datasets.utils.split_horizon(x: numpy.ndarray | torch.Tensor, H: int, L: None | int = None)[source]#

Split a time series into two parts, given a forecasting horizon

Parameters:

x – Input time series
H – Forecast horizon
L (optional) – Amount of lag to use

Returns:

Two arrays (type depends on the type of x), the first one corresponding to everything before H
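
A minimal sketch of a horizon split, assuming the second part consists of the final H values. The helper `split_last_h` is hypothetical; it deliberately ignores the optional L argument because the text above does not say how L changes the result.

```python
def split_last_h(x, H):
    """Hypothetical helper (not part of tsx): everything before the last H values, then those H values."""
    if not 0 < H < len(x):
        raise ValueError("H must be between 1 and len(x) - 1")
    return x[:-H], x[-H:]


history, future = split_last_h([10, 11, 12, 13, 14], H=2)
print(history, future)
```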

tsx.datasets.utils.split_proportion(X, proportions)[source]#

Split a time series into |proportions| pieces, given the fractions in proportions

Parameters:

X – Input time series
proportions – List of fractions for each split. Must sum up to one and be of at least size 2

Returns:

List of splits of X
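
One way to turn a list of fractions into contiguous splits is to cut at the cumulative fractions. The sketch below is an independent illustration under that assumption; `split_by_proportions` is a hypothetical name, and the rounding behaviour of `tsx.datasets.utils.split_proportion` may differ.

```python
import math


def split_by_proportions(X, proportions):
    """Hypothetical helper (not part of tsx): contiguous splits sized by the given fractions."""
    if len(proportions) < 2:
        raise ValueError("need at least two proportions")
    if not math.isclose(sum(proportions), 1.0):
        raise ValueError("proportions must sum to one")
    splits, start, cumulative = [], 0, 0.0
    for i, p in enumerate(proportions):
        cumulative += p
        # The final piece takes whatever is left, so rounding never drops a point.
        is_last = i == len(proportions) - 1
        end = len(X) if is_last else int(round(cumulative * len(X)))
        splits.append(X[start:end])
        start = end
    return splits


train, val, holdout = split_by_proportions(list(range(10)), [0.6, 0.2, 0.2])
print(train, val, holdout)
```

Because the pieces are contiguous and ordered, the temporal order of the series is preserved, which matters for train/validation/test splits of time series.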

Model selection and ensembling#

class tsx.model_selection.ROC_Member(x, y, indices, squared_error)[source]#

Object representing a member of a Region of Competence

Parameters:

x (np.ndarray) – Original time series values
y (np.ndarray) – Corresponding true forecasting values
indices (np.ndarray) – Indices indicating the salient region
squared_error (float) – Squared error achieved by the model

Attributes:

r (np.ndarray) – Most salient subseries of x
x (np.ndarray) – Original time series values
y (np.ndarray) – Corresponding true forecasting values
indices (np.ndarray) – Indices indicating the salient region

dtw_distance(x)[source]#

Return DTW distances of self to x

Parameters: x – Input array to compare against
Returns: A list of DTW distances between salient parts of self.x and x

euclidean_distance(x)[source]#

Return Euclidean distances of self to x

Parameters: x – Input array to compare against
Returns: A list of Euclidean distances between salient parts of self.x and x

tsx.model_selection.roc_tools.find_best_forecaster(x: torch.Tensor, rocs: List[List[ROC_Member]], pool: List[torch.nn.Module], dist_fn: callable, topm: int = 1)[source]#

Given an input x, a pool of pretrained models pool, and the corresponding RoCs rocs, return the topm best forecasters according to the distance measure dist_fn

Parameters:

x – Input time series window
pool – List of pretrained models
rocs – List of Regions of Competence
dist_fn – Distance function applicable for two time series windows
topm – How many models to return

Returns:

A np.ndarray of the topm best model indices from pool
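
The selection rule described above (pick the models whose Region of Competence lies closest to the current window) can be sketched without any dependencies. This is an assumption-laden illustration: `rank_models_by_roc` and `euclid` are hypothetical helpers, RoC members are plain lists rather than ROC_Member objects, and a model's score is taken to be the distance to its nearest RoC member.

```python
import math


def euclid(a, b):
    """Euclidean distance between two equally long windows."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def rank_models_by_roc(x, rocs, dist_fn, topm=1):
    """Hypothetical helper (not part of tsx): indices of the topm models with the closest RoC member."""
    scored = []
    for model_idx, roc in enumerate(rocs):
        if roc:  # a model with an empty RoC cannot be selected
            scored.append((min(dist_fn(x, member) for member in roc), model_idx))
    scored.sort()  # smallest distance first
    return [model_idx for _, model_idx in scored[:topm]]


rocs = [
    [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]],  # RoC of model 0
    [[5.0, 5.0, 5.0]],                   # RoC of model 1
    [],                                  # model 2 has no RoC members
    [[1.0, 2.0, 3.0]],                   # RoC of model 3
]
selected = rank_models_by_roc([1.0, 2.0, 2.5], rocs, euclid, topm=2)
print(selected)
```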

tsx.model_selection.roc_tools.find_closest_rocs(x: torch.Tensor, rocs: List[List[torch.Tensor]], dist_fn: callable)[source]#

Given an input x and RoCs rocs, return the closest RoC member for each model w.r.t. dist_fn

Parameters:

x – Input time series window
rocs – List of Regions of Competence
dist_fn – Distance function applicable for two time series windows

Returns:

A list of model indices and a list of corresponding closest ROC_Member objects

class tsx.model_selection.ADE(random_state=None)[source]#

Reimplementation of ADE from https://link.springer.com/article/10.1007/s10994-018-05774-y

Parameters: random_state – Input to to_random_state

run(X_train, y_train, train_preds, X_test, y_test, test_preds, _omega=0.5, _lambda=50, only_best=False)[source]#

Compute model selection and prediction

Parameters:

X_train – Input for training meta learners
y_train – Label for training meta learners
train_preds – Predictions on training data for each model, of shape (n_learner, T_train)
X_test – Test inputs
y_test – Test labels
test_preds – shape (n_learner, T_test) predictions on test data for each model
_omega – Committee ratio
_lambda – Window size (how many past timesteps to include for the penalty)
only_best – If True, return only best model. Otherwise, return ensemble weights (default: False)

Returns:

Tuple of predictions and weights. weights is a list of indices if only_best==True

class tsx.model_selection.DETS[source]#

Reimplementation of DETS from https://ieeexplore.ieee.org/abstract/document/8259783?casa_token=yA69YjHH3OEAAAAA:KSJg6CPyOOOC2KkbypuUA0BEPjuUNsqcgHVHDCM3sxHH4p0jMfnq8Ev1-JYGEHy56x7CI1gCZQ

run(X_train, y_train, train_preds, X_test, y_test, test_preds, P=50, _lambda=0.5, only_best=False)[source]#

Compute model selection and prediction

Parameters:

X_train – Input for training meta learners
y_train – Label for training meta learners
train_preds – Predictions on training data for each model, of shape (n_learner, T_train)
X_test – Test input
y_test – Test labels
test_preds – shape (n_learner, T_test) predictions on test data for each model
_lambda – Committee ratio
P – Window size (how many past timesteps to include for the penalty)
only_best – If True, return only best model. Otherwise, return ensemble weights (default: False)

Returns:

Tuple of predictions and weights. weights is a list of indices if only_best==True

class tsx.model_selection.KNNRoC(pool, random_state=None)[source]#

Train KNN classifier based on Regions of Competence

Parameters:

pool – Pool of pretrained models to do forecasting
random_state – Valid input to to_random_state

run(x_val, y_val, x_test)[source]#

Compute model selection and prediction

Parameters:

x_val – Input for training KNN
y_val – Label for training KNN
x_test – Input to forecast

Returns:

Tuple of predictions and selection

class tsx.model_selection.OMS_ROC(pool, random_state=None)[source]#

RoC-based model-agnostic selection method utilizing K-Means clustering of validation data to build Regions of Competence

Parameters:

pool – Pool of pretrained models to do forecasting
random_state – Valid input to to_random_state

run(x_val, y_val, x_test)[source]#

Compute model selection and prediction

Parameters:

x_val – Input for training KNN
y_val – Label for training KNN
x_test – Input to forecast

Returns:

Tuple of predictions and selection

Concepts#

Base functions#

tsx.concepts.n_uniques(A, L)[source]#

Get the number of unique scale-invariant time series concepts of length L and alphabet size A

Parameters:

A – Alphabet size
L – Length of time series

Returns:

Number of unique scale-invariant concepts

tsx.concepts.generate_unique_concepts(L, A)[source]#

Generate all unique scale-invariant time series concepts as string representation given L and A

Parameters:

A – Alphabet size
L – Length of time series

Returns:

List of unique scale-invariant concepts as string representations

tsx.concepts.generate_all_concepts(L, A)[source]#

Generate all time series concepts as string representation given L and A

Parameters:

A – Alphabet size
L – Length of time series

Returns:

List of all possible concepts as string representations

tsx.concepts.generate_samples(concept_key, size, A, random_state=None)[source]#

Generate size random samples for given concept concept_key

Parameters:

concept_key – String representation of desired concept
size – Number of desired samples
A – Alphabet size
random_state – Valid input to to_random_state

Returns:

Numpy array of size size of samples from concept concept_key

tsx.concepts.find_closest_concepts(X, concepts)[source]#

Find closest concepts in concepts for each time series in X

Parameters:

X – 2d numpy array of time series datapoints
concepts – 2d numpy array of concept samples

Returns:

List of indices into concepts of size len(X), indicating closest concept for each entry

tsx.concepts.get_concept_distributions(X, concepts, normalize=True)[source]#

Calculate the empirical distribution over X, given concepts

Parameters:

X – 2d numpy array of time series datapoints
concepts – 2d numpy array of concept samples
normalize – Whether or not the distribution should be normalized or should encode total counts

Returns:

Empirical distribution over concepts for X, either normalized or as total counts depending on normalize
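
The relationship between closest-concept assignment and the resulting empirical distribution can be sketched as follows. Both helpers are hypothetical and written in plain Python; squared Euclidean distance is assumed as the notion of "closest", which the text above does not confirm.

```python
from collections import Counter


def closest_concept_indices(X, concepts):
    """Hypothetical helper (not part of tsx): index of the nearest concept for each row of X."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [
        min(range(len(concepts)), key=lambda j: sq_dist(row, concepts[j]))
        for row in X
    ]


def concept_distribution(X, concepts, normalize=True):
    """Hypothetical helper: how often each concept is the nearest one, as counts or fractions."""
    counts = Counter(closest_concept_indices(X, concepts))
    hist = [counts.get(j, 0) for j in range(len(concepts))]
    if normalize:
        total = sum(hist)
        return [c / total for c in hist]
    return hist


concepts = [[0, 0, 0], [0, 1, 2], [2, 1, 0]]  # flat, rising, falling
X = [[0.1, 0.9, 2.2], [1.9, 1.0, 0.2], [0.0, 1.1, 1.8], [0.1, 0.0, 0.1]]
print(closest_concept_indices(X, concepts))
print(concept_distribution(X, concepts))
print(concept_distribution(X, concepts, normalize=False))
```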

TCAV#

tsx.concepts.get_cavs(model, concepts, A, size_per_concept=20, return_lms=False, verbose=False, random_state=None)[source]#

Get Concept Activation Vectors (CAVs) of concepts for model

Parameters:

model – PyTorch model with method get_activation, returning a latent representation
concepts – 2d numpy array of concept samples
A – Alphabet size
size_per_concept – How many samples per concept to generate for training linear models
return_lms – If True, return list of sklearn.linear_model.LogisticRegression instead of just CAVs (default: False)
verbose – If True, print linear model F1 score per concept (default: False)
random_state – Input to to_random_state

Returns:

Either a list of Concept Activation Vectors or a list of sklearn.linear_model.LogisticRegression models

tsx.concepts.get_tcav(model, cavs, X, y=None, aggregate='tcav_original')[source]#

Get TCAV values of cavs for model

Parameters:

model – PyTorch model with method get_activation, returning a latent representation, and model.predictor, returning a point forecast
cavs – List of Concept Activation Vectors from get_cavs
X – 2D numpy array of input data
y – If not None, return TCAV values for squared error (default: None)
aggregate – How to aggregate TCAV values. Possible values: [tcav_original, none] (default: tcav_original, which counts number of positive TCAV values)

Returns:

(Aggregated) TCAV scores
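
The tcav_original aggregation counts positive TCAV values. A minimal sketch of that idea, assuming the per-input gradients with respect to the latent activations are already available as plain lists: `tcav_score` is a hypothetical helper that reports the fraction of positive directional derivatives, as in the original TCAV formulation, whereas tsx may return a raw count instead.

```python
def tcav_score(gradients, cav):
    """Hypothetical helper (not part of tsx): share of inputs whose gradient points along the CAV."""
    positives = sum(
        1
        for g in gradients
        if sum(gi * ci for gi, ci in zip(g, cav)) > 0  # directional derivative along the CAV
    )
    return positives / len(gradients)


cav = [1.0, 0.0]  # toy concept direction in a 2-d latent space
gradients = [[0.5, 2.0], [-0.1, 1.0], [0.3, -4.0], [2.0, 0.0]]
score = tcav_score(gradients, cav)
print(score)
```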

Distances#

tsx.distances.dtw(s: numpy.ndarray | torch.Tensor, t: numpy.ndarray | torch.Tensor)[source]#

Dynamic Time Warping from fastdtw package

Parameters:

s – First input
t – Second input

Returns:

Calculated distance (float)
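
For intuition, the classic dynamic-programming formulation of Dynamic Time Warping is sketched below. This is not the fastdtw implementation that tsx wraps (fastdtw computes an approximation), and the absolute difference used as the local cost is an assumption; `dtw_distance` is a hypothetical name.

```python
def dtw_distance(s, t):
    """Hypothetical helper (not part of tsx): exact DTW via the O(len(s) * len(t)) recurrence."""
    n, m = len(s), len(t)
    inf = float("inf")
    # cost[i][j] is the cheapest alignment of the first i points of s with the first j points of t.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(s[i - 1] - t[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # s advances, t repeats its point
                cost[i][j - 1],      # t advances, s repeats its point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


print(dtw_distance([1, 2, 3, 4], [1, 1, 2, 3, 4]))  # same shape, one repeated point
print(dtw_distance([1, 2, 3], [2, 3, 4]))
```

Unlike the Euclidean distance, this recurrence accepts inputs of different lengths, since one point may be matched to several points of the other series.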

tsx.distances.euclidean(s: numpy.ndarray | torch.Tensor, t: numpy.ndarray | torch.Tensor)[source]#

Parameters:

s – First input
t – Second input

Returns:

Calculated distance (float)

Metrics#

tsx.metrics.mase(y_pred, y_true, X)[source]#

Compute MASE value

Parameters:

y_pred – Predicted values
y_true – True values
X – Background time series to compute one-step-ahead repeated forecasts

Returns:

MASE value that is 1 if forecast is equal to repeated value baseline, <1 if better. Always >= 0.
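
The scaling described above can be sketched as the mean absolute error of the forecast divided by the mean absolute error of a repeat-the-last-value forecast on the background series. `mase_sketch` is a hypothetical helper in plain Python; the tsx implementation may handle shapes and edge cases (such as a constant background series) differently.

```python
def mase_sketch(y_pred, y_true, X):
    """Hypothetical helper (not part of tsx): forecast MAE scaled by the naive one-step MAE on X."""
    mae_model = sum(abs(p - t) for p, t in zip(y_pred, y_true)) / len(y_true)
    # The naive baseline predicts X[i - 1] for X[i]; a constant X would make this zero.
    mae_naive = sum(abs(X[i] - X[i - 1]) for i in range(1, len(X))) / (len(X) - 1)
    return mae_model / mae_naive


background = [1.0, 2.0, 4.0, 5.0, 7.0]  # naive one-step errors: 1, 2, 1, 2
y_true = [8.0, 10.0]
y_pred = [7.0, 8.0]                     # forecast errors: 1, 2
print(mase_sketch(y_pred, y_true, background))
```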

tsx.metrics.entropy(P, scale=True)[source]#

Compute (scaled) entropy

Parameters:

P – Input to entropy
scale – If True, scale entropy to be in [0,1] (default: True)

Returns:

(Scaled) entropy
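
A minimal sketch of scaled entropy, assuming P is a discrete probability vector and that scaling means dividing by the maximum entropy log(len(P)). `entropy_sketch` is a hypothetical helper; whether tsx normalizes P first or uses a different logarithm base is not stated above.

```python
import math


def entropy_sketch(P, scale=True):
    """Hypothetical helper (not part of tsx): Shannon entropy, optionally scaled to [0, 1]."""
    h = -sum(p * math.log(p) for p in P if p > 0)  # zero entries contribute nothing
    if scale and len(P) > 1:
        h /= math.log(len(P))  # entropy of the uniform distribution is the maximum
    return h


print(entropy_sketch([0.25, 0.25, 0.25, 0.25]))  # uniform distribution
print(entropy_sketch([1.0, 0.0, 0.0]))           # all mass on one outcome
```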

Utilities#

tsx.utils.to_random_state(rs: int | None | numpy.random.Generator)[source]#

Return np.random.Generator object from input

Parameters: rs – Something that np.random.default_rng can process.
Returns: A np.random.Generator object

tsx.utils.get_device()[source]#

Return the “best” device in the following order: if a GPU is available, return “cuda”; if Metal is available, return “mps”; otherwise, return “cpu”.

Returns: String indicating the best possible device for Torch tensors
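
The priority order above can be sketched with standard PyTorch availability checks. `pick_device` is a hypothetical stand-in, not the tsx implementation; falling back to "cpu" when PyTorch is not installed is an assumption made only so that the snippet runs anywhere.

```python
def pick_device():
    """Hypothetical helper (not part of tsx): prefer CUDA, then Apple Metal (MPS), then CPU."""
    try:
        import torch
    except ImportError:
        return "cpu"  # assumption: without PyTorch there is nothing to accelerate
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch releases have no MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


device = pick_device()
print(device)
```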
