API#
This page documents the classes and methods contained in the tsx
package.
Models#
- class tsx.models.base.NeuralNetRegressor[source]#
Regression wrapper for scikit-learn-like PyTorch training
- Parameters:
module – A PyTorch model of type torch.nn.Module
random_state (optional) – Seed training with either a fixed seed or None. Defaults to None.
max_epochs (optional) – Number of epochs to train for. Defaults to 10.
device (optional) – Device on which the model is trained. If None, the fastest available option is chosen automatically. Defaults to None.
lr (optional) – Learning rate. Defaults to 2e-3.
batch_size (optional) – Training batch size. Defaults to 32.
verbose (optional) – Whether to print status updates to the console. Defaults to False.
callbacks (optional) – Skorch callback list used for training. Defaults to None.
**kwargs – Optional keyword arguments for skorch.NeuralNetRegressor
- class tsx.models.base.NeuralNetClassifier[source]#
Classification wrapper for scikit-learn-like PyTorch training
- Parameters:
module – A PyTorch model of type torch.nn.Module
random_state (optional) – Seed training with either a fixed seed or None. Defaults to None.
max_epochs (optional) – Number of epochs to train for. Defaults to 10.
device (optional) – Device on which the model is trained. If None, the fastest available option is chosen automatically. Defaults to None.
lr (optional) – Learning rate. Defaults to 2e-3.
batch_size (optional) – Training batch size. Defaults to 32.
verbose (optional) – Whether to print status updates to the console. Defaults to False.
callbacks (optional) – Skorch callback list used for training. Defaults to None.
**kwargs – Optional keyword arguments for skorch.NeuralNetClassifier
- class tsx.models.sdt.SoftDecisionTreeClassifier[source]#
Soft Decision Tree, configured as a classifier
- Parameters:
n_features – Number of input features
depth – Fixed depth of the tree
- predict(X)#
Predict function
- Parameters:
X (torch.Tensor) – Input tensor of size (batch_size, n_features)
- class tsx.models.sdt.SoftEnsembleClassifier[source]#
Ensemble of Soft Decision Trees for classification
- Parameters:
n_trees – Number of ensemble members
n_features – Number of input features
depth – Fixed depth of the tree
- class tsx.models.sdt.SoftDecisionTreeRegressor[source]#
Soft Decision Tree, configured as a regressor
- Parameters:
n_features – Number of input features
depth – Fixed depth of the tree
- predict(X)#
Predict function
- Parameters:
X (torch.Tensor) – Input tensor of size (batch_size, n_features)
- class tsx.models.sdt.SoftEnsembleRegressor[source]#
Ensemble of Soft Decision Trees for regression
- Parameters:
n_trees – Number of ensemble members
n_features – Number of input features
depth – Fixed depth of the tree
- class tsx.models.forecaster.ospgsm.OS_PGSM(pool, L, context_size, detect_concept_drift=True, threshold=0.5, min_roc_size=2, random_state=0)[source]#
Improved version of the OS-PGSM algorithm originally presented in Saadallah, A., Jakobs, M., Morik, K. (2021). Explainable Online Deep Neural Network Selection Using Adaptive Saliency Maps for Time Series Forecasting. In: Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2021. https://doi.org/10.1007/978-3-030-86486-6_25
- Parameters:
pool (List of nn.Module) – List of pretrained neural network models. Each model must contain a feature_extractor and forecaster submodule for OS-PGSM to work.
L (int) – The amount of lag used to train the pool models
context_size (int) – Size of the chunks from which Regions of Competence are created
detect_concept_drift (bool) – Whether or not to enable concept drift detection
threshold (float) – Minimum value indicating when a step is salient enough to be part of the Region of Competence
min_roc_size (int) – RoC members smaller than this value are not added to the RoC
random_state (int) – Random state seeding the run method
- run(X_val, X_test)[source]#
Main method for running the prediction
- Parameters:
X_val (torch.Tensor) – Univariate validation time series from which the Regions of Competence are created
X_test (torch.Tensor) – Univariate test time series which should be forecasted
- Returns:
Prediction tensor with the same length as X_test
- class tsx.models.forecaster.ospgsm.OEP_ROC(pool, L, context_size, context_step, nr_clusters_ensemble=15, dist_fn=<function euclidean>, detect_concept_drift=True, threshold=0.1, min_roc_size=2, random_state=0)[source]#
Improved version of the OEP-ROC algorithm originally presented in Saadallah, A., Jakobs, M. & Morik, K. Explainable online ensemble of deep neural network pruning for time series forecasting. Mach Learn 111, 3459–3487 (2022)
- Parameters:
pool (List of nn.Module) – List of pretrained neural network models. Each model must contain a feature_extractor and forecaster submodule for OEP-ROC to work.
L (int) – The amount of lag used to train the pool models
context_size (int) – Size of the chunks from which Regions of Competence are created
context_step (int) – Step size between the context_size chunks
nr_clusters_ensemble (int) – How many cluster centers to use
dist_fn (callable) – A distance function defined on two time series windows
detect_concept_drift (bool) – Whether or not to enable concept drift detection
threshold (float) – Minimum value indicating when a step is salient enough to be part of the Region of Competence
min_roc_size (int) – RoC members smaller than this value are not added to the RoC
random_state (int) – Random state seeding the run method
- run(X_val, X_test)[source]#
Main method for running the prediction
- Parameters:
X_val (torch.Tensor) – Univariate validation time series from which the Regions of Competence are created
X_test (torch.Tensor) – Univariate test time series which should be forecasted
- Returns:
Prediction tensor with the same length as X_test
Datasets#
Monash Forecasting Repository#
- tsx.datasets.monash.load_monash(dataset: str, return_pytorch: bool = False, return_numpy: bool = False, return_horizon: bool = False)[source]#
Loads datasets from Monash Time Series Forecasting Repository.
- Parameters:
dataset – Name of the dataset to be downloaded. Consists of the name of the dataset as well as the “version” of the dataset separated by an underscore.
return_horizon – Each dataset has a specific forecast horizon. If True, the horizon is returned as well.
return_pytorch – If True, return the dataset as a PyTorch tensor. Raises an error if this is not possible.
return_numpy – If True, return the dataset as a NumPy array. Raises an error if this is not possible.
Jena Climate Dataset#
- tsx.datasets.jena.load_jena(full_features: bool = False, resample: str = '60T', return_numpy: bool = False, return_pytorch: bool = False)[source]#
Returns the Jena Climate 2009 - 2016 dataset
- Parameters:
full_features – If True, return all features. Otherwise, return a selection of informative features
resample – Resampling frequency as a string in pandas resample notation
return_numpy – returns dataset as a numpy array
return_pytorch – returns dataset as a pytorch tensor
Utilities#
- tsx.datasets.utils.windowing(x: numpy.ndarray | torch.Tensor, L: int, z: int = 1, H: int = 1, use_torch: bool = False)[source]#
Create sliding windows from input x
- Parameters:
x – Input time series
L – Amount of lag to use
H – Forecast horizon
z – Step length
use_torch – Whether to return np.ndarray or torch.Tensor
- Returns:
Windowed X and y, either as a Numpy array or PyTorch tensor
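The sliding-window idea behind windowing can be illustrated with a minimal plain-Python sketch. The helper sliding_windows below is hypothetical and is not the tsx implementation. It assumes that each input window holds L consecutive values, that the target holds the H values that follow, and that consecutive windows start z steps apart. The exact alignment used by tsx may differ.

```python
def sliding_windows(x, L, z=1, H=1):
    """Hypothetical helper: split x into (input, target) pairs.

    Each input holds L consecutive values and each target holds the H
    values that follow it. Consecutive windows start z steps apart.
    """
    X, y = [], []
    # The last valid start index leaves room for L inputs plus H targets.
    for start in range(0, len(x) - L - H + 1, z):
        X.append(list(x[start:start + L]))
        y.append(list(x[start + L:start + L + H]))
    return X, y


series = [1, 2, 3, 4, 5, 6]
X, y = sliding_windows(series, L=3, z=1, H=1)
print(X)  # [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(y)  # [[4], [5], [6]]
```

A larger z yields fewer, less overlapping windows. A larger H turns every target into a multi-step forecast.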
- tsx.datasets.utils.split_horizon(x: numpy.ndarray | torch.Tensor, H: int, L: None | int = None)[source]#
Split a time series into two parts, given a forecasting horizon
- Parameters:
x – Input time series
H – Forecast horizon
L (optional) – Amount of lag to use
- Returns:
Two arrays (type depends on the type of x), the first one corresponding to everything before H
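A horizon split of this kind might look like the sketch below. It assumes that the forecast horizon refers to the final H values of the series, so that the first part is everything before them. The helper split_last_horizon is hypothetical, and the optional L argument of split_horizon is not modelled here.

```python
def split_last_horizon(x, H):
    """Hypothetical helper: hold out the final H values of x."""
    if not 0 < H < len(x):
        raise ValueError("H must be between 1 and len(x) - 1")
    # Everything before the horizon, then the horizon itself.
    return x[:-H], x[-H:]


history, future = split_last_horizon([10, 11, 12, 13, 14], H=2)
print(history)  # [10, 11, 12]
print(future)   # [13, 14]
```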
Model selection and ensembling#
- class tsx.model_selection.ROC_Member(x, y, indices, squared_error)[source]#
Object representing a member of a Region of Competence
- Parameters:
x (np.ndarray) – Original time series values
y (np.ndarray) – Corresponding true forecasting values
indices (np.ndarray) – Indices indicating the salient region
squared_error (float) – Squared error achieved by the model
- Attributes:
r (np.ndarray) – Most salient subseries of x
x (np.ndarray) – Original time series values
y (np.ndarray) – Corresponding true forecasting values
indices (np.ndarray) – Indices indicating the salient region
- tsx.model_selection.roc_tools.find_best_forecaster(x: torch.Tensor, rocs: List[List[ROC_Member]], pool: List[torch.nn.Module], dist_fn: callable, topm: int = 1)[source]#
Given an input x, a pool of pretrained models pool, and the corresponding RoCs rocs, return the topm best forecasters according to the distance measure dist_fn
- Parameters:
x – Input time series window
pool – List of pretrained models
rocs – List of Regions of Competence
dist_fn – Distance function applicable for two time series windows
topm – How many models to return
- Returns:
A np.ndarray of the topm best model indices from pool
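The selection step can be sketched in plain Python as follows. This is an illustration under assumptions, not the tsx implementation. The helper names are hypothetical, RoC members are represented as plain lists of the same length as x instead of ROC_Member objects, and a model is ranked by the distance from x to its closest RoC member.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long windows."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def rank_models_by_roc_distance(x, rocs, dist_fn, topm=1):
    """Hypothetical helper: rank models by their closest RoC member to x."""
    closest = []
    for model_idx, members in enumerate(rocs):
        if not members:
            continue  # a model with an empty RoC cannot be selected
        closest.append((min(dist_fn(x, m) for m in members), model_idx))
    closest.sort()  # smallest distance first
    return [model_idx for _, model_idx in closest[:topm]]


rocs = [
    [[0.0, 0.0, 0.0]],                    # RoC of model 0
    [[1.0, 2.0, 3.0], [5.0, 5.0, 5.0]],   # RoC of model 1
    [],                                   # model 2 has no RoC members
]
best = rank_models_by_roc_distance([1.0, 2.0, 2.5], rocs, euclidean, topm=2)
print(best)  # [1, 0]
```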
- tsx.model_selection.roc_tools.find_closest_rocs(x: torch.Tensor, rocs: List[List[torch.Tensor]], dist_fn: callable)[source]#
Given an input x and RoCs rocs, return the closest RoC member for each model with respect to dist_fn
- Parameters:
x – Input time series window
rocs – List of Regions of Competence
dist_fn – Distance function applicable for two time series windows
- Returns:
A list of model indices and a list of corresponding closest ROC_Member objects
- class tsx.model_selection.ADE(random_state=None)[source]#
Reimplementation of ADE from https://link.springer.com/article/10.1007/s10994-018-05774-y
- Parameters:
random_state – Input to to_random_state
- run(X_train, y_train, train_preds, X_test, y_test, test_preds, _omega=0.5, _lambda=50, only_best=False)[source]#
Compute model selection and prediction
- Parameters:
X_train – Input for training meta learners
y_train – Label for training meta learners
train_preds – Predictions on the training data for each model, of shape (n_learner, T_train)
X_test – Test inputs
y_test – Test labels
test_preds – Predictions on the test data for each model, of shape (n_learner, T_test)
_omega – Committee ratio
_lambda – Window size (how many past time steps to include for the penalty)
only_best – If True, return only best model. Otherwise, return ensemble weights (default: False)
- Returns:
Tuple of predictions and weights. weights is a list of indices if only_best==True
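The roles of the committee ratio and the window size can be illustrated with a simplified sketch. It covers only a committee-selection step and assumes that the committee ratio keeps that fraction of models with the lowest mean error over the recent window. The meta learners and the weighting used by ADE are omitted, and the helper select_committee is hypothetical.

```python
import math


def select_committee(recent_errors, ratio):
    """Hypothetical helper: keep the fraction `ratio` of models with lowest mean error.

    recent_errors[i] holds the absolute errors of model i over the window.
    """
    mean_errors = [sum(errs) / len(errs) for errs in recent_errors]
    # Always keep at least one model in the committee.
    size = max(1, math.ceil(ratio * len(mean_errors)))
    ranked = sorted(range(len(mean_errors)), key=lambda i: mean_errors[i])
    return sorted(ranked[:size])


# Absolute errors of four models over a window of three time steps.
recent_errors = [
    [0.9, 1.1, 1.0],
    [0.2, 0.3, 0.1],
    [0.5, 0.4, 0.6],
    [2.0, 1.5, 1.8],
]
committee = select_committee(recent_errors, ratio=0.5)
print(committee)  # [1, 2]
```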
- class tsx.model_selection.DETS[source]#
Reimplementation of DETS from https://ieeexplore.ieee.org/abstract/document/8259783?casa_token=yA69YjHH3OEAAAAA:KSJg6CPyOOOC2KkbypuUA0BEPjuUNsqcgHVHDCM3sxHH4p0jMfnq8Ev1-JYGEHy56x7CI1gCZQ
- run(X_train, y_train, train_preds, X_test, y_test, test_preds, P=50, _lambda=0.5, only_best=False)[source]#
Compute model selection and prediction
- Parameters:
X_train – Input for training meta learners
y_train – Label for training meta learners
train_preds – Predictions on the training data for each model, of shape (n_learner, T_train)
X_test – Test inputs
y_test – Test labels
test_preds – Predictions on the test data for each model, of shape (n_learner, T_test)
_lambda – Committee ratio
P – Window size (how many past time steps to include for the penalty)
only_best – If True, return only best model. Otherwise, return ensemble weights (default: False)
- Returns:
Tuple of predictions and weights. weights is a list of indices if only_best==True
- class tsx.model_selection.KNNRoC(pool, random_state=None)[source]#
Train KNN classifier based on Regions of Competence
- Parameters:
pool – Pool of pretrained models to do forecasting
random_state – Valid input to to_random_state
Concepts#
Base functions#
- tsx.concepts.n_uniques(A, L)[source]#
Get the number of unique scale-invariant time series concepts of length L and alphabet size A
- Parameters:
A – Alphabet size
L – Length of time series
- Returns:
Number of unique scale-invariant concepts
- tsx.concepts.generate_unique_concepts(L, A)[source]#
Generate all unique scale-invariant time series concepts as string representation given L and A
- Parameters:
A – Alphabet size
L – Length of time series
- Returns:
List of unique scale-invariant concepts as string representations
- tsx.concepts.generate_all_concepts(L, A)[source]#
Generate all time series concepts as string representation given L and A
- Parameters:
A – Alphabet size
L – Length of time series
- Returns:
List of all possible concepts as string representations
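Enumerating every concept for a given length and alphabet size amounts to listing all A ** L symbol sequences. The sketch below shows this with itertools.product. The string encoding (digits 0 to A-1, so A must be at most 10) and the helper name all_symbol_sequences are assumptions for illustration. The representation used by tsx may differ.

```python
from itertools import product


def all_symbol_sequences(L, A):
    """Hypothetical helper: every length-L sequence over the symbols 0..A-1."""
    if A > 10:
        raise ValueError("single-digit encoding only supports A <= 10")
    symbols = [str(s) for s in range(A)]
    return ["".join(seq) for seq in product(symbols, repeat=L)]


concepts = all_symbol_sequences(L=2, A=3)
print(concepts)       # ['00', '01', '02', '10', '11', '12', '20', '21', '22']
print(len(concepts))  # 9, i.e. A ** L
```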
- tsx.concepts.generate_samples(concept_key, size, A, random_state=None)[source]#
Generate size random samples for given concept concept_key
- Parameters:
concept_key – String representation of desired concept
size – Number of desired samples
A – Alphabet size
random_state – Valid input to to_random_state
- Returns:
Numpy array of size size of samples from concept concept_key
- tsx.concepts.find_closest_concepts(X, concepts)[source]#
Find closest concepts in concepts for each time series in X
- Parameters:
X – 2d numpy array of time series datapoints
concepts – 2d numpy array of concept samples
- Returns:
List of indices into concepts of size len(X), indicating closest concept for each entry
- tsx.concepts.get_concept_distributions(X, concepts, normalize=True)[source]#
Calculate the empirical distribution of concepts over X
- Parameters:
X – 2d numpy array of time series datapoints
concepts – 2d numpy array of concept samples
normalize – Whether or not the distribution should be normalized or should encode total counts
- Returns:
Empirical distribution over concepts, either normalized or as total counts depending on normalize
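Nearest-concept assignment and the resulting empirical distribution can be sketched as follows. The distance measure is an assumption (squared Euclidean distance), the helper names are hypothetical, and plain lists stand in for the 2d arrays.

```python
from collections import Counter


def closest_concept_indices(X, concepts):
    """Hypothetical helper: index of the nearest concept for each series in X."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return [min(range(len(concepts)), key=lambda j: sq_dist(x, concepts[j]))
            for x in X]


def concept_distribution(X, concepts, normalize=True):
    """Hypothetical helper: how often each concept is the nearest one."""
    counts = Counter(closest_concept_indices(X, concepts))
    dist = [counts.get(j, 0) for j in range(len(concepts))]
    if normalize:
        # Convert total counts into relative frequencies.
        dist = [c / len(X) for c in dist]
    return dist


concepts = [[0, 1, 2], [2, 1, 0], [1, 1, 1]]   # rising, falling, flat
X = [[0, 1, 3], [3, 1, 0], [0, 2, 2], [1, 1, 1]]
print(closest_concept_indices(X, concepts))          # [0, 1, 0, 2]
print(concept_distribution(X, concepts))             # [0.5, 0.25, 0.25]
print(concept_distribution(X, concepts, False))      # [2, 1, 1]
```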
TCAV#
- tsx.concepts.get_cavs(model, concepts, A, size_per_concept=20, return_lms=False, verbose=False, random_state=None)[source]#
Get Concept Activation Vectors (CAVs) of concepts for model
- Parameters:
model – Pytorch model with method get_activation, returning a latent representation
concepts – 2d numpy array of concept samples
A – Alphabet size
size_per_concept – How many samples per concept to generate for training linear models
return_lms – If True, return list of sklearn.linear_model.LogisticRegression instead of just CAVs (default: False)
verbose – If True, print linear model F1 score per concept (default: False)
random_state – Input to to_random_state
- Returns:
Either a list of Concept Activation Vectors or a list of sklearn.linear_model.LogisticRegression models
- tsx.concepts.get_tcav(model, cavs, X, y=None, aggregate='tcav_original')[source]#
Get TCAV values of cavs for model
- Parameters:
model – Pytorch model with method get_activation, returning a latent representation and model.predictor returning a point forecast
cavs – List of Concept Activation Vectors from get_cavs
X – 2D numpy array of input data
y – If not None, return TCAV values for squared error (default: None)
aggregate – How to aggregate TCAV values. Possible values: [tcav_original, none] (default: tcav_original, which counts number of positive TCAV values)
- Returns:
(Aggregated) TCAV scores
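The tcav_original aggregation can be illustrated with a small sketch. It follows the usual TCAV definition, in which the score is the fraction of samples whose directional derivative along the CAV is positive. Whether tsx reports a fraction or a raw count is not confirmed here. The helper tcav_score is hypothetical, and the gradients are given directly instead of being computed from a model.

```python
def tcav_score(gradients, cav):
    """Hypothetical helper: fraction of samples with a positive directional derivative.

    gradients[i] is the gradient of the model output with respect to the
    latent activation for sample i. cav is the concept direction.
    """
    sensitivities = [sum(g * c for g, c in zip(grad, cav)) for grad in gradients]
    return sum(s > 0 for s in sensitivities) / len(sensitivities)


# Gradients with respect to a 2-d latent activation, one row per sample.
gradients = [[0.5, 0.1], [-0.2, 0.4], [0.3, -0.9], [1.0, 1.0]]
cav = [1.0, 0.0]
print(tcav_score(gradients, cav))  # 0.75
```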
Distances#
Metrics#
- tsx.metrics.mase(y_pred, y_true, X)[source]#
Compute MASE value
- Parameters:
y_pred – Predicted values
y_true – True values
X – Background time series to compute one-step-ahead repeated forecasts
- Returns:
MASE value. It equals 1 if the forecast performs the same as the repeated-value baseline and is below 1 if the forecast is better. Always >= 0.
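MASE is conventionally the mean absolute error of the forecast divided by the mean absolute error of a naive one-step forecast (repeating the previous value) on a background series. The sketch below assumes that tsx follows this convention with X as the background series. The helper mase_sketch is hypothetical and works on plain lists.

```python
def mase_sketch(y_pred, y_true, X):
    """Hypothetical helper: MASE relative to a one-step naive forecast on X."""
    # Mean absolute error of the forecast under evaluation.
    mae_forecast = sum(abs(p - t) for p, t in zip(y_pred, y_true)) / len(y_true)
    # Mean absolute error of repeating the previous value on the background series.
    mae_naive = sum(abs(b - a) for a, b in zip(X[:-1], X[1:])) / (len(X) - 1)
    return mae_forecast / mae_naive


background = [1.0, 2.0, 4.0, 7.0]   # naive errors 1, 2, 3 with mean 2
score = mase_sketch(y_pred=[8.0, 9.0], y_true=[9.0, 11.0], X=background)
print(score)  # 0.75
```

A constant background series would make the denominator zero, so a production implementation needs to guard against that case.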