API#

This page documents the classes and methods contained in the tsx package.

Models#

class tsx.models.base.NeuralNetRegressor[source]#

Regression wrapper for scikit-learn-like PyTorch training

Parameters:
  • module – A PyTorch model of type torch.nn.Module

  • random_state (optional) – Seed training with either a fixed seed or None. Defaults to None.

  • max_epochs (optional) – How long to train for. Defaults to 10.

  • device (optional) – Indicate where the model should be trained. If None, it chooses the fastest available option automatically. Defaults to None.

  • lr (optional) – Set learning rate. Defaults to 2e-3.

  • batch_size (optional) – Training batch size. Defaults to 32.

  • verbose (optional) – Print status updates to the console. Defaults to False.

  • callbacks (optional) – Skorch callback list used for training. Defaults to None.

  • **kwargs – Optional keyword arguments for skorch.NeuralNetRegressor

class tsx.models.base.NeuralNetClassifier[source]#

Classification wrapper for scikit-learn-like PyTorch training

Parameters:
  • module – A PyTorch model of type torch.nn.Module

  • random_state (optional) – Seed training with either a fixed seed or None. Defaults to None.

  • max_epochs (optional) – How long to train for. Defaults to 10.

  • device (optional) – Indicate where the model should be trained. If None, it chooses the fastest available option automatically. Defaults to None.

  • lr (optional) – Set learning rate. Defaults to 2e-3.

  • batch_size (optional) – Training batch size. Defaults to 32.

  • verbose (optional) – Print status updates to the console. Defaults to False.

  • callbacks (optional) – Skorch callback list used for training. Defaults to None.

  • **kwargs – Optional keyword arguments for skorch.NeuralNetClassifier

class tsx.models.sdt.SoftDecisionTreeClassifier[source]#

Soft Decision Tree, configured as a classifier

Parameters:
  • n_features – Number of input features

  • depth – Fixed depth of the tree

predict(X)#

Predict function

Parameters:

X (torch.Tensor) – Input tensor of size (batch_size, n_features)

class tsx.models.sdt.SoftEnsembleClassifier[source]#

Ensemble of Soft Decision Trees for classification

Parameters:
  • n_trees – Number of ensemble members

  • n_features – Number of input features

  • depth – Fixed depth of the tree

predict(X)[source]#

Predict function

Parameters:

X (torch.Tensor) – Input tensor of size (batch_size, n_features)

class tsx.models.sdt.SoftDecisionTreeRegressor[source]#

Soft Decision Tree, configured as a regressor

Parameters:
  • n_features – Number of input features

  • depth – Fixed depth of the tree

predict(X)#

Predict function

Parameters:

X (torch.Tensor) – Input tensor of size (batch_size, n_features)

class tsx.models.sdt.SoftEnsembleRegressor[source]#

Ensemble of Soft Decision Trees for regression

Parameters:
  • n_trees – Number of ensemble members

  • n_features – Number of input features

  • depth – Fixed depth of the tree

predict(X)[source]#

Predict function

Parameters:

X (torch.Tensor) – Input tensor of size (batch_size, n_features)

class tsx.models.forecaster.ospgsm.OS_PGSM(pool, L, context_size, detect_concept_drift=True, threshold=0.5, min_roc_size=2, random_state=0)[source]#

Improved version of the OS-PGSM algorithm originally presented in Saadallah, A., Jakobs, M., Morik, K. (2021). Explainable Online Deep Neural Network Selection Using Adaptive Saliency Maps for Time Series Forecasting. In: Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2021. https://doi.org/10.1007/978-3-030-86486-6_25

Parameters:
  • pool (List of nn.Module) – List of pretrained neural network models. Each model must contain a feature_extractor and forecaster submodule for OS-PGSM to work.

  • L (int) – The amount of lag used to train the pool models

  • context_size (int) – Size of the chunks from which Regions of Competence are created

  • detect_concept_drift (bool) – Whether or not to enable concept drift detection

  • threshold (float) – Minimum value indicating when a step is salient enough to be part of the Region of Competence

  • min_roc_size (int) – RoC members smaller than this value are not added to the RoC

  • random_state (int) – Random state seeding the run method

run(X_val, X_test)[source]#

Main method for running the prediction

Parameters:
  • X_val (torch.Tensor) – Univariate validation time series from which the Regions of Competence are created

  • X_test (torch.Tensor) – Univariate test time series which should be forecasted

Returns:

Prediction tensor with the same length as X_test

class tsx.models.forecaster.ospgsm.OEP_ROC(pool, L, context_size, context_step, nr_clusters_ensemble=15, dist_fn=<function euclidean>, detect_concept_drift=True, threshold=0.1, min_roc_size=2, random_state=0)[source]#

Improved version of the OEP-ROC algorithm originally presented in Saadallah, A., Jakobs, M. & Morik, K. Explainable online ensemble of deep neural network pruning for time series forecasting. Mach Learn 111, 3459–3487 (2022)

Parameters:
  • pool (List of nn.Module) – List of pretrained neural network models. Each model must contain a feature_extractor and forecaster submodule for OEP-ROC to work.

  • L (int) – The amount of lag used to train the pool models

  • context_size (int) – Size of the chunks from which Regions of Competence are created

  • context_step (int) – Step size between the context_size chunks

  • nr_clusters_ensemble (int) – How many cluster centers to use

  • dist_fn (callable) – A distance function defined on two time series windows

  • detect_concept_drift (bool) – Whether or not to enable concept drift detection

  • threshold (float) – Minimum value indicating when a step is salient enough to be part of the Region of Competence

  • min_roc_size (int) – RoC members smaller than this value are not added to the RoC

  • random_state (int) – Random state seeding the run method

run(X_val, X_test)[source]#

Main method for running the prediction

Parameters:
  • X_val (torch.Tensor) – Univariate validation time series from which the Regions of Competence are created

  • X_test (torch.Tensor) – Univariate test time series which should be forecasted

Returns:

Prediction tensor with the same length as X_test

Datasets#

Monash Forecasting Repository#

tsx.datasets.monash.possible_datasets()[source]#

Returns list of possible dataset names

tsx.datasets.monash.load_monash(dataset: str, return_pytorch: bool = False, return_numpy: bool = False, return_horizon: bool = False)[source]#

Loads datasets from Monash Time Series Forecasting Repository.

Parameters:
  • dataset – Name of the dataset to be downloaded. Consists of the name of the dataset as well as the “version” of the dataset separated by an underscore.

  • return_horizon – Datasets have a specific forecast horizon. True if they should be returned as well.

  • return_pytorch – Returns dataset as a PyTorch tensor. Throws error if not possible.

  • return_numpy – Returns dataset as a numpy array. Throws error if not possible.

Jena Climate Dataset#

tsx.datasets.jena.load_jena(full_features: bool = False, resample: str = '60T', return_numpy: bool = False, return_pytorch: bool = False)[source]#

Returns the Jena Climate 2009 - 2016 dataset

Parameters:
  • full_features – If True, return all features. Otherwise, return a selection of informative features.

  • resample – string in pandas resample notation

  • return_numpy – returns dataset as a numpy array

  • return_pytorch – returns dataset as a PyTorch tensor

Utilities#

tsx.datasets.utils.windowing(x: numpy.ndarray | torch.Tensor, L: int, z: int = 1, H: int = 1, use_torch: bool = False)[source]#

Create sliding windows from input x

Parameters:
  • x – Input time series

  • L – Amount of lag to use

  • H – Forecast horizon

  • z – Step length

  • use_torch – Whether to return np.ndarray or torch.Tensor

Returns:

Windowed X and y, either as a Numpy array or PyTorch tensor
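
The following is a minimal sketch of the sliding-window idea, not the tsx implementation. It assumes that each window holds L consecutive values, that the target is the next H values, and that consecutive windows start z steps apart. The function name sliding_windows is hypothetical, and plain Python lists stand in for arrays or tensors.

```python
def sliding_windows(x, L, z=1, H=1):
    """Cut x into (input, target) pairs: L lagged values followed by H targets."""
    X, y = [], []
    # The last valid start index leaves room for L inputs plus H targets.
    for start in range(0, len(x) - L - H + 1, z):
        X.append(list(x[start:start + L]))
        y.append(list(x[start + L:start + L + H]))
    return X, y


series = [1, 2, 3, 4, 5, 6]
X, y = sliding_windows(series, L=3, z=1, H=1)
# X holds the lagged inputs, y the value(s) that follow each window.
```

Whether the library squeezes the target dimension when H is 1 is not stated here, so the sketch keeps each target as a list of length H.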

tsx.datasets.utils.split_horizon(x: numpy.ndarray | torch.Tensor, H: int, L: None | int = None)[source]#

Split a time series into two parts, given a forecasting horizon

Parameters:
  • x – Input time series

  • H – Forecast horizon

  • L (optional) – Amount of lag to use

Returns:

Two arrays (type depends on the type of x), the first one corresponding to everything before H
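
As a rough illustration of the split described above, the sketch below separates the last H values from everything before them. It deliberately ignores the optional L argument, whose exact effect is not documented here. The name split_last_h is hypothetical.

```python
def split_last_h(x, H):
    """Return (everything before the last H values, the last H values)."""
    if not 0 < H < len(x):
        raise ValueError("H must be positive and smaller than len(x)")
    return x[:-H], x[-H:]


history, horizon = split_last_h([1, 2, 3, 4, 5], H=2)
```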

tsx.datasets.utils.split_proportion(X, proportions)[source]#

Split a time series into len(proportions) pieces, given the fractions in proportions

Parameters:
  • X – Input time series

  • proportions – List of fractions for each split. Must sum up to one and be of at least size 2

Returns:

List of splits of X
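
One way such a proportional split could work is sketched below. It is an assumption-laden illustration rather than the library's code: cut points are computed by flooring the cumulative fractions, and the last piece takes whatever remains. The name split_by_proportions is hypothetical.

```python
import math


def split_by_proportions(X, proportions):
    """Split X chronologically into len(proportions) consecutive pieces."""
    if len(proportions) < 2:
        raise ValueError("need at least two proportions")
    if not math.isclose(sum(proportions), 1.0):
        raise ValueError("proportions must sum to one")
    splits, start, cumulative = [], 0, 0.0
    for i, p in enumerate(proportions):
        cumulative += p
        # The last piece takes the remainder so rounding never drops samples.
        is_last = i == len(proportions) - 1
        end = len(X) if is_last else int(cumulative * len(X))
        splits.append(X[start:end])
        start = end
    return splits


train, val, test = split_by_proportions(list(range(10)), [0.5, 0.25, 0.25])
```

Keeping the pieces consecutive preserves temporal order, which matters when the later pieces are used for validation and testing.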

Model selection and ensembling#

class tsx.model_selection.ROC_Member(x, y, indices, squared_error)[source]#

Object representing a member of a Region of Competence

Parameters:
  • x (np.ndarray) – Original time series values

  • y (np.ndarray) – Corresponding true forecasting values

  • indices (np.ndarray) – Indices indicating the salient region

  • squared_error (float) – Squared error achieved by the model

Attributes:
  • r (np.ndarray) – Most salient subseries of x

  • x (np.ndarray) – Original time series values

  • y (np.ndarray) – Corresponding true forecasting values

  • indices (np.ndarray) – Indices indicating the salient region

dtw_distance(x)[source]#

Return DTW distances of self to x

Parameters:

x – Input array to compare against

Returns:

A list of DTW distances between salient parts of self.x and x

euclidean_distance(x)[source]#

Return Euclidean distances of self to x

Parameters:

x – Input array to compare against

Returns:

A list of Euclidean distances between salient parts of self.x and x

tsx.model_selection.roc_tools.find_best_forecaster(x: torch.Tensor, rocs: List[List[ROC_Member]], pool: List[torch.nn.Module], dist_fn: callable, topm: int = 1)[source]#

Given an input x, a pool of pretrained models pool and corresponding RoCs rocs return the topm best forecasters according to distance measure dist_fn

Parameters:
  • x – Input time series window

  • pool – List of pretrained models

  • rocs – List of Regions of Competence

  • dist_fn – Distance function applicable for two time series windows

  • topm – How many models to return

Returns:

A np.ndarray of the topm best model indices from pool
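
The selection rule can be sketched as follows: for each model, find the distance from x to its nearest RoC member, then return the topm models with the smallest such distance. This is a simplified illustration and not the tsx implementation. RoC members are plain lists here instead of ROC_Member objects, and the names closest_models and euclidean_sketch are hypothetical.

```python
import math


def euclidean_sketch(s, t):
    """Euclidean distance between two equally long windows."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, t)))


def closest_models(x, rocs, dist_fn, topm=1):
    """Rank models by the distance from x to their nearest RoC member."""
    best = []
    for model_idx, roc in enumerate(rocs):
        if not roc:
            continue  # a model with an empty RoC cannot be selected
        nearest = min(dist_fn(x, member) for member in roc)
        best.append((nearest, model_idx))
    best.sort()
    return [model_idx for _, model_idx in best[:topm]]


rocs = [
    [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]],  # RoC of model 0
    [[5.0, 5.0, 5.0]],                   # RoC of model 1
    [],                                  # model 2 has no RoC members
]
selected = closest_models([0.9, 1.1, 1.0], rocs, euclidean_sketch, topm=2)
```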

tsx.model_selection.roc_tools.find_closest_rocs(x: torch.Tensor, rocs: List[List[torch.Tensor]], dist_fn: callable)[source]#

Given an input x and RoCs rocs return the closest RoC member for each model w.r.t. dist_fn

Parameters:
  • x – Input time series window

  • rocs – List of Regions of Competence

  • dist_fn – Distance function applicable for two time series windows

Returns:

A list of model indices and a list of corresponding closest ROC_Member objects

class tsx.model_selection.ADE(random_state=None)[source]#

Reimplementation of ADE from https://link.springer.com/article/10.1007/s10994-018-05774-y

Parameters:

random_state – Input to to_random_state

run(X_train, y_train, train_preds, X_test, y_test, test_preds, _omega=0.5, _lambda=50, only_best=False)[source]#

Compute model selection and prediction

Parameters:
  • X_train – Input for training meta learners

  • y_train – Label for training meta learners

  • train_preds – Predictions on the training data for each model, with shape (n_learner, T_train)

  • X_test – Test inputs

  • y_test – Test labels

  • test_preds – Predictions on the test data for each model, with shape (n_learner, T_test)

  • _omega – Committee ratio

  • _lambda – Window size (how many past time steps to include for the penalty)

  • only_best – If True, return only best model. Otherwise, return ensemble weights (default: False)

Returns:

Tuple of predictions and weights. weights is a list of indices if only_best==True

class tsx.model_selection.DETS[source]#

Reimplementation of DETS from https://ieeexplore.ieee.org/abstract/document/8259783?casa_token=yA69YjHH3OEAAAAA:KSJg6CPyOOOC2KkbypuUA0BEPjuUNsqcgHVHDCM3sxHH4p0jMfnq8Ev1-JYGEHy56x7CI1gCZQ

run(X_train, y_train, train_preds, X_test, y_test, test_preds, P=50, _lambda=0.5, only_best=False)[source]#

Compute model selection and prediction

Parameters:
  • X_train – Input for training meta learners

  • y_train – Label for training meta learners

  • train_preds – Predictions on the training data for each model, with shape (n_learner, T_train)

  • X_test – Test input

  • y_test – Test labels

  • test_preds – Predictions on the test data for each model, with shape (n_learner, T_test)

  • _lambda – Committee ratio

  • P – Window size (how many past time steps to include for the penalty)

  • only_best – If True, return only best model. Otherwise, return ensemble weights (default: False)

Returns:

Tuple of predictions and weights. weights is a list of indices if only_best==True

class tsx.model_selection.KNNRoC(pool, random_state=None)[source]#

Train KNN classifier based on Regions of Competence

Parameters:
  • pool – Pool of pretrained models to do forecasting

  • random_state – Valid input to to_random_state

run(x_val, y_val, x_test)[source]#

Compute model selection and prediction

Parameters:
  • x_val – Input for training KNN

  • y_val – Label for training KNN

  • x_test – Input to forecast

Returns:

Tuple of predictions and selection

class tsx.model_selection.OMS_ROC(pool, random_state=None)[source]#

RoC-based model-agnostic selection method utilizing K-Means clustering of validation data to build Regions of Competence

Parameters:
  • pool – Pool of pretrained models to do forecasting

  • random_state – Valid input to to_random_state

run(x_val, y_val, x_test)[source]#

Compute model selection and prediction

Parameters:
  • x_val – Validation inputs used to build the Regions of Competence

  • y_val – Validation labels used to build the Regions of Competence

  • x_test – Input to forecast

Returns:

Tuple of predictions and selection

Concepts#

Base functions#

tsx.concepts.n_uniques(A, L)[source]#

Get the number of unique scale-invariant time series concepts of length L and alphabet size A

Parameters:
  • A – Alphabet size

  • L – Length of time series

Returns:

Number of unique scale-invariant concepts

tsx.concepts.generate_unique_concepts(L, A)[source]#

Generate all unique scale-invariant time series concepts as string representation given L and A

Parameters:
  • A – Alphabet size

  • L – Length of time series

Returns:

List of unique scale-invariant concepts as string representations

tsx.concepts.generate_all_concepts(L, A)[source]#

Generate all time series concepts as string representation given L and A

Parameters:
  • A – Alphabet size

  • L – Length of time series

Returns:

List of all possible concepts as string representations

tsx.concepts.generate_samples(concept_key, size, A, random_state=None)[source]#

Generate size random samples for given concept concept_key

Parameters:
  • concept_key – String representation of desired concept

  • size – Number of desired samples

  • A – Alphabet size

  • random_state – Valid input to to_random_state

Returns:

Numpy array of size size of samples from concept concept_key

tsx.concepts.find_closest_concepts(X, concepts)[source]#

Find closest concepts in concepts for each time series in X

Parameters:
  • X – 2d numpy array of time series datapoints

  • concepts – 2d numpy array of concept samples

Returns:

List of indices into concepts of size len(X), indicating closest concept for each entry

tsx.concepts.get_concept_distributions(X, concepts, normalize=True)[source]#

Calculate the empirical distribution of concepts over X

Parameters:
  • X – 2d numpy array of time series datapoints

  • concepts – 2d numpy array of concept samples

  • normalize – Whether or not the distribution should be normalized or should encode total counts

Returns:

Distribution over concepts with one entry per concept: relative frequencies if normalize is True, otherwise total counts
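
To illustrate how closest-concept assignment and the resulting empirical distribution fit together, here is a small sketch. It assumes plain Euclidean distance between each series and each concept sample, which may differ from what tsx does internally (for example, any scaling applied before comparison). The names closest_concept_indices and concept_distribution are hypothetical.

```python
import math


def closest_concept_indices(X, concepts):
    """For each series in X, return the index of the nearest concept sample."""
    def dist(s, t):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, t)))

    return [
        min(range(len(concepts)), key=lambda j: dist(x, concepts[j]))
        for x in X
    ]


def concept_distribution(X, concepts, normalize=True):
    """Count how often each concept is the closest one over all of X."""
    counts = [0] * len(concepts)
    for idx in closest_concept_indices(X, concepts):
        counts[idx] += 1
    if normalize:
        return [c / len(X) for c in counts]
    return counts


concepts = [[0, 1, 2], [2, 1, 0]]  # an increasing and a decreasing pattern
X = [[0, 1, 3], [3, 1, 0], [0, 2, 4], [1, 1, 2]]
indices = closest_concept_indices(X, concepts)
distribution = concept_distribution(X, concepts)
```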

TCAV#

tsx.concepts.get_cavs(model, concepts, A, size_per_concept=20, return_lms=False, verbose=False, random_state=None)[source]#

Get Concept Activation Vectors (CAVs) of concepts for model

Parameters:
  • model – PyTorch model with method get_activation, returning a latent representation

  • concepts – 2d numpy array of concept samples

  • A – Alphabet size

  • size_per_concept – How many samples per concept to generate for training linear models

  • return_lms – If True, return list of sklearn.linear_model.LogisticRegression instead of just CAVs (default: False)

  • verbose – If True, print linear model F1 score per concept (default: False)

  • random_state – Input to to_random_state

Returns:

Either a list of Concept Activation Vectors or a list of sklearn.linear_model.LogisticRegression models

tsx.concepts.get_tcav(model, cavs, X, y=None, aggregate='tcav_original')[source]#

Get TCAV values of cavs for model

Parameters:
  • model – PyTorch model with method get_activation, returning a latent representation, and model.predictor, returning a point forecast

  • cavs – List of Concept Activation Vectors from get_cavs

  • X – 2D numpy array of input data

  • y – If not None, return TCAV values for squared error (default: None)

  • aggregate – How to aggregate TCAV values. Possible values: [tcav_original, none] (default: tcav_original, which counts number of positive TCAV values)

Returns:

(Aggregated) TCAV scores
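
The aggregation step can be illustrated with a small sketch. In the original TCAV formulation, the score of a concept is the fraction of inputs whose directional derivative along the concept's activation vector is positive. The sketch assumes the gradients of the model output with respect to the latent activations are already available as lists. Whether tsx returns a count or a fraction for tcav_original is not stated here, so the fraction is an assumption, and the name tcav_score is hypothetical.

```python
def tcav_score(gradients, cav):
    """Fraction of inputs whose directional derivative along cav is positive."""
    directional = [
        sum(g_i * v_i for g_i, v_i in zip(g, cav)) for g in gradients
    ]
    return sum(d > 0 for d in directional) / len(directional)


# One gradient vector per input, taken w.r.t. a 2-dimensional latent space.
gradients = [[0.5, -0.1], [0.2, 0.3], [-0.4, 0.1], [0.1, 0.0]]
cav = [1.0, 0.0]
score = tcav_score(gradients, cav)
```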

Distances#

tsx.distances.dtw(s: numpy.ndarray | torch.Tensor, t: numpy.ndarray | torch.Tensor)[source]#

Dynamic Time Warping from fastdtw package

Parameters:
  • s – First input

  • t – Second input

Returns:

Calculated distance (float)
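
For intuition, the sketch below computes exact Dynamic Time Warping with the classic dynamic program. Note that fastdtw computes an approximation of this quantity, so values returned by tsx.distances.dtw may differ. The sketch assumes univariate sequences and absolute difference as the local cost, and the name dtw_exact is hypothetical.

```python
def dtw_exact(s, t):
    """Exact DTW cost between two univariate sequences, O(len(s) * len(t))."""
    n, m = len(s), len(t)
    inf = float("inf")
    # cost[i][j] is the cheapest alignment of s[:i] with t[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(s[i - 1] - t[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # s advances, t stays
                cost[i][j - 1],      # t advances, s stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


same_shape = dtw_exact([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_exact([0, 0], [1, 1])
```

Unlike the Euclidean distance, DTW can compare sequences of different lengths, which is why the first pair above aligns without cost.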

tsx.distances.euclidean(s: numpy.ndarray | torch.Tensor, t: numpy.ndarray | torch.Tensor)[source]#

Euclidean distance between s and t

Parameters:
  • s – First input

  • t – Second input

Returns:

Calculated distance (float)

Metrics#

tsx.metrics.mase(y_pred, y_true, X)[source]#

Compute MASE value

Parameters:
  • y_pred – Predicted values

  • y_true – True values

  • X – Background time series to compute one-step-ahead repeated forecasts

Returns:

MASE value that is 1 if forecast is equal to repeated value baseline, <1 if better. Always >= 0.
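
A minimal sketch of the computation, assuming the standard definition of MASE: the mean absolute error of the forecast divided by the mean absolute error of the naive forecast that repeats the previous value on the background series X. The function name mase_sketch is hypothetical, and the library may handle edge cases (such as a constant X) differently.

```python
def mase_sketch(y_pred, y_true, X):
    """Mean absolute scaled error relative to a repeat-last-value baseline."""
    mae_forecast = sum(abs(p - t) for p, t in zip(y_pred, y_true)) / len(y_true)
    # Naive baseline: predict each value of X by repeating its predecessor.
    naive_errors = [abs(X[i] - X[i - 1]) for i in range(1, len(X))]
    mae_naive = sum(naive_errors) / len(naive_errors)
    return mae_forecast / mae_naive


X = [1.0, 3.0, 5.0, 7.0]  # the naive forecast is off by 2 at every step
value = mase_sketch(y_pred=[8.0, 12.0], y_true=[9.0, 11.0], X=X)
```

Here the forecast is off by 1 on average while the baseline is off by 2, so the value is below 1 and the forecast beats the baseline.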

tsx.metrics.entropy(P, scale=True)[source]#

Compute (scaled) entropy

Parameters:
  • P – Input to entropy

  • scale – If True, scale entropy to be in [0,1] (default: True)

Returns:

(Scaled) entropy
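
The scaling can be sketched as follows, assuming P is a discrete probability distribution and that scaling means dividing the Shannon entropy by its maximum, log(len(P)). The logarithm base used by the library is not stated here; the base cancels out in the scaled case. The name scaled_entropy is hypothetical.

```python
import math


def scaled_entropy(P, scale=True):
    """Shannon entropy of P, optionally divided by log(len(P))."""
    # Zero-probability entries contribute nothing to the sum.
    h = -sum(p * math.log(p) for p in P if p > 0)
    if scale and len(P) > 1:
        h /= math.log(len(P))
    return h


uniform = scaled_entropy([0.25, 0.25, 0.25, 0.25])  # maximal uncertainty
certain = scaled_entropy([1.0, 0.0])                # no uncertainty
```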

Utilities#

tsx.utils.to_random_state(rs: int | None | numpy.random.Generator)[source]#

Return np.random.Generator object from input

Parameters:

rs – Something that np.random.default_rng can process.

Returns:

A np.random.Generator object

tsx.utils.get_device()[source]#

Return the “best” device, checked in the following order:
  • If a GPU is available, return “cuda”.

  • If Metal is available, return “mps”.

  • Otherwise, return “cpu”.

Returns:

String indicating best possible device for Torch Tensors
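
The priority order could be implemented along the following lines. This is a sketch and not the tsx source: the name pick_device is hypothetical, and the ImportError fallback is only there so the example also runs where PyTorch is not installed.

```python
def pick_device():
    """Return "cuda", "mps", or "cpu", in that order of preference."""
    try:
        import torch
    except ImportError:
        return "cpu"  # fallback so the sketch runs without PyTorch
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch versions do not ship the MPS backend at all.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


device = pick_device()
```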