5.4. UCTB.model package

5.4.1. UCTB.model.ARIMA module

class UCTB.model.ARIMA.ARIMA(time_sequence, order=None, seasonal_order=(0, 0, 0, 0), max_ar=6, max_ma=4, max_d=2)

Bases: object

ARIMA is a generalization of an ARMA (Autoregressive Moving Average) model, used in predicting future points in time series analysis.

Since there may be three kinds of series data (closeness, period and trend history), this class trains three different ARIMA models for each node, one for each kind of history data, and returns the average of the models' predicted values at prediction time.

Parameters
  • time_sequence (array_like) – The observed values of the time series.

  • order (iterable) – It stores the (p, d, q) orders of the model: the number of AR parameters, differences, and MA parameters. If set to None, the ARIMA class will calculate the orders for each series based on max_ar, max_ma and max_d. Default: None

  • seasonal_order (iterable) – It stores the (P, D, Q, s) order of the seasonal ARIMA model: the AR parameters, differences, MA parameters, and periodicity. s is an integer giving the periodicity (number of periods in a season).

  • max_ar (int) – Maximum number of AR lags to use. Default: 6

  • max_ma (int) – Maximum number of MA lags to use. Default: 4

  • max_d (int) – Maximum number of degrees of differencing. Default: 2

Attributes

  • order (iterable) – (p, d, q) orders of the ARIMA model.

  • seasonal_order (iterable) – (P, D, Q, s) order of the seasonal ARIMA model.

  • model_res – The results of the likelihood-based fit method.

static adf_test(time_series, max_lags=None, verbose=True)

Augmented Dickey-Fuller test, which can be used to test for a unit root in a univariate process in the presence of serial correlation.

get_order(series, order=None, max_ar=6, max_ma=2, max_d=2)

If order is not None, it simply returns order; otherwise, it calculates the (p, d, q) orders for the series data based on max_ar, max_ma and max_d.

predict(time_sequences, forecast_step=1)

Parameters
  • time_sequences – The input time-series features.

  • forecast_step (int) – The number of future steps to predict. Default: 1

Returns

Prediction results with shape (len(time_sequence)/forecast_step, forecast_step, 1).

Return type

np.ndarray
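
A minimal usage sketch (the synthetic data, the input layout expected by predict, and the hyperparameter choices are assumptions for illustration only):

    import numpy as np
    from UCTB.model.ARIMA import ARIMA

    # Hypothetical univariate series: 500 synthetic observations.
    history = np.sin(np.arange(500) * 0.1) + np.random.normal(0, 0.1, 500)

    # Leave order=None so the class searches (p, d, q) within max_ar/max_ma/max_d.
    model = ARIMA(time_sequence=history, order=None, max_ar=6, max_ma=4, max_d=2)

    # One-step-ahead forecasts; the documented output shape is
    # (len(time_sequence)/forecast_step, forecast_step, 1).
    prediction = model.predict(history[-48:], forecast_step=1)
    print(prediction.shape)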

5.4.2. UCTB.model.DCRNN module

class UCTB.model.DCRNN.DCRNN(num_nodes, num_diffusion_matrix, num_rnn_units=64, num_rnn_layers=1, max_diffusion_step=2, seq_len=6, use_curriculum_learning=False, input_dim=1, output_dim=1, cl_decay_steps=1000, target_len=1, lr=0.0001, epsilon=0.001, optimizer_name='Adam', code_version='DCRNN-QuickStart', model_dir='model_dir', gpu_device='0', **kwargs)

Bases: UCTB.model_unit.BaseModel.BaseModel

References

Parameters
  • num_nodes (int) – Number of nodes in the graph, e.g. number of stations in NYC-Bike dataset.

  • num_diffusion_matrix – Number of diffusion matrices used in the model.

  • num_rnn_units – Number of RNN units.

  • num_rnn_layers – Number of RNN layers

  • max_diffusion_step – Maximum number of diffusion steps.

  • seq_len – Input sequence length

  • use_curriculum_learning (bool) – Whether to use curriculum learning in training: if True, the decoder is gradually fed the model’s own prediction instead of the previous ground truth; if False, the previous ground truth is always used.

  • input_dim – Dimension of the input feature

  • output_dim – Dimension of the output feature

  • cl_decay_steps – When use_curriculum_learning=True, cl_decay_steps adjusts the ratio of using ground-truth labels; the ratio drops as the number of training steps grows.

  • target_len (int) – Output sequence length.

  • lr (float) – Learning rate

  • epsilon – epsilon in Adam

  • optimizer_name (str) – ‘sgd’ or ‘Adam’ optimizer

  • code_version (str) – Current version of this model code, which will be used as filename for saving the model

  • model_dir (str) – The directory to store model files. Default: ’model_dir’.

  • gpu_device (str) – To specify the GPU to use. Default: ‘0’.

build(init_vars=True, max_to_keep=5)

Parameters
  • init_vars (bool) – Automatically initialize the parameters if set to True; otherwise no parameters will be initialized.

  • max_to_keep (int) – Maximum number of model files to keep, which equals max_to_keep in tf.train.Saver.
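
A minimal construction sketch (the station count and hyperparameter values are illustrative; training and prediction methods inherited from BaseModel are not shown here):

    from UCTB.model.DCRNN import DCRNN

    # Hypothetical setup: 200 stations and a single diffusion (graph) matrix.
    model = DCRNN(num_nodes=200,
                  num_diffusion_matrix=1,
                  num_rnn_units=64,
                  num_rnn_layers=1,
                  max_diffusion_step=2,
                  seq_len=6,
                  input_dim=1,
                  output_dim=1,
                  target_len=1,
                  lr=1e-4,
                  code_version='DCRNN-QuickStart',
                  model_dir='model_dir',
                  gpu_device='0')

    # Build the computation graph and initialize its variables.
    model.build(init_vars=True, max_to_keep=5)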

5.4.3. UCTB.model.DeepST module

class UCTB.model.DeepST.DeepST(closeness_len, period_len, trend_len, width, height, external_dim, kernel_size=3, num_conv_filters=64, lr=1e-05, code_version='QuickStart-DeepST', model_dir='model_dir', gpu_device='0')

Bases: UCTB.model_unit.BaseModel.BaseModel

Deep learning-based prediction model for Spatial-Temporal data (DeepST)

DeepST is composed of three components: 1) temporal dependent instances: describing temporal closeness, period and seasonal trend; 2) convolutional neural networks: capturing near and far spatial dependencies; 3) early and late fusions: fusing similar and different domains’ data.

References
Parameters
  • closeness_len (int) – The length of closeness data history. The former consecutive closeness_len time slots of data will be used as closeness history.

  • period_len (int) – The length of period data history. The data of exact same time slots in former consecutive period_len days will be used as period history.

  • trend_len (int) – The length of trend data history. The data of exact same time slots in former consecutive trend_len weeks (every seven days) will be used as trend history.

  • width (int) – The width of grid data.

  • height (int) – The height of grid data.

  • external_dim (int) – Number of dimensions of external data.

  • kernel_size (int) – Kernel size in Convolutional neural networks. Default: 3

  • num_conv_filters (int) – The number of filters in the convolution. Default: 64

  • lr (float) – Learning rate. Default: 1e-5

  • code_version (str) – Current version of this model code.

  • model_dir (str) – The directory to store model files. Default: ’model_dir’

  • gpu_device (str) – To specify the GPU to use. Default: ‘0’

build()

Parameters
  • init_vars (bool) – Automatically initialize the parameters if set to True; otherwise no parameters will be initialized.

  • max_to_keep (int) – Maximum number of model files to keep, which equals max_to_keep in tf.train.Saver.
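
A minimal construction sketch (the grid size, history lengths and external dimension are illustrative assumptions):

    from UCTB.model.DeepST import DeepST

    # Hypothetical 20x20 grid with 6/7/4 closeness/period/trend slots and
    # a 5-dimensional external feature vector.
    model = DeepST(closeness_len=6,
                   period_len=7,
                   trend_len=4,
                   width=20,
                   height=20,
                   external_dim=5,
                   kernel_size=3,
                   num_conv_filters=64,
                   lr=1e-5,
                   code_version='QuickStart-DeepST',
                   model_dir='model_dir',
                   gpu_device='0')

    # Build the computation graph (this build() takes no arguments).
    model.build()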

5.4.4. UCTB.model.GeoMAN module

class UCTB.model.GeoMAN.GeoMAN(total_sensers, input_dim, external_dim, output_dim, input_steps, output_steps, n_stacked_layers=2, n_encoder_hidden_units=128, n_decoder_hidden_units=128, dropout_rate=0.3, lr=0.001, gc_rate=2.5, code_version='GeoMAN-QuickStart', model_dir='model_dir', gpu_device='0', **kwargs)

Bases: UCTB.model_unit.BaseModel.BaseModel

Multi-level Attention Networks for Geo-sensory Time Series Prediction (GeoMAN)

GeoMAN consists of two major parts: 1) A multi-level attention mechanism (including both local and global spatial attentions in encoder and temporal attention in decoder) to model the dynamic spatio-temporal dependencies; 2) A general fusion module to incorporate the external factors from different domains (e.g., meteorology, time of day and land use).

References

Parameters
  • total_sensers (int) – The total number of sensors used in the global attention mechanism.

  • input_dim (int) – The number of dimensions of the target sensor’s input.

  • external_dim (int) – The number of dimensions of the external features.

  • output_dim (int) – The number of dimensions of the target sensor’s output.

  • input_steps (int) – The length of historical input data, a.k.a. input timesteps.

  • output_steps (int) – The number of steps to predict from one piece of history data, a.k.a. output timesteps. Currently this must be 1.

  • n_stacked_layers (int) – The number of LSTM layers stacked in both encoder and decoder (These two are the same). Default: 2

  • n_encoder_hidden_units (int) – The number of hidden units in each layer of encoder. Default: 128

  • n_decoder_hidden_units (int) – The number of hidden units in each layer of decoder. Default: 128

  • dropout_rate (float) – Dropout rate of LSTM layers in both encoder and decoder. Default: 0.3

  • lr (float) – Learning rate. Default: 0.001

  • gc_rate (float) – A clipping ratio for all the gradients. This operation normalizes all gradients so that their L2-norms are less than or equal to gc_rate. Default: 2.5

  • code_version (str) – Current version of this model code. Default: ‘GeoMAN-QuickStart’

  • model_dir (str) – The directory to store model files. Default: ’model_dir’

  • gpu_device (str) – To specify the GPU to use. Default: ‘0’

  • **kwargs (dict) – Reserved for future use. May be used to pass parameters to class BaseModel.

build(init_vars=True, max_to_keep=5)

Parameters
  • init_vars (bool) – Automatically initialize the parameters if set to True; otherwise no parameters will be initialized.

  • max_to_keep (int) – Maximum number of model files to keep, which equals max_to_keep in tf.train.Saver.
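
A minimal construction sketch (the sensor count, timestep lengths and external dimension are illustrative assumptions; note that output_steps has to be 1):

    from UCTB.model.GeoMAN import GeoMAN

    # Hypothetical configuration: 30 sensors, 12 input timesteps, 1 output step.
    model = GeoMAN(total_sensers=30,
                   input_dim=1,
                   external_dim=5,
                   output_dim=1,
                   input_steps=12,
                   output_steps=1,
                   n_stacked_layers=2,
                   n_encoder_hidden_units=128,
                   n_decoder_hidden_units=128,
                   dropout_rate=0.3,
                   lr=1e-3,
                   gc_rate=2.5,
                   code_version='GeoMAN-QuickStart',
                   model_dir='model_dir',
                   gpu_device='0')

    model.build(init_vars=True, max_to_keep=5)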

UCTB.model.GeoMAN.input_transform(local_features, global_features, external_features, targets)

Split the model’s inputs from matrices to lists along the timesteps axis.

UCTB.model.GeoMAN.split_timesteps(inputs)

Split the input matrix from (batch, timesteps, input_dim) to a step list ([[batch, input_dim], …, ]).
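
The shape convention can be illustrated with a small NumPy sketch of the equivalent logic (this does not call the library function itself, whose exact input type is not documented here):

    import numpy as np

    # A batch of 32 samples, 12 timesteps, 4 input dimensions.
    inputs = np.random.rand(32, 12, 4)

    # Splitting along the timesteps axis yields a list of 12 matrices,
    # each with shape (batch, input_dim) = (32, 4).
    steps = [inputs[:, t, :] for t in range(inputs.shape[1])]
    print(len(steps), steps[0].shape)  # 12 (32, 4)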

5.4.5. UCTB.model.HM module

class UCTB.model.HM.HM(c, p, t)

Bases: object

Historical Mean. A naive method that simply returns the average of the history data of each time slot.

Parameters
  • c (int) – The number of time slots of closeness history.

  • p (int) – The number of time slots of period history, which represents the daily feature.

  • t (int) – The number of time slots of trend history, which represents the weekly feature.

Note: c, p and t determine which history features should be considered in the average.

predict(closeness_feature, period_feature, trend_feature)

Given the closeness, period and trend history values, use their average as the prediction.
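
A minimal usage sketch (the feature array shapes are assumptions; the arrays only need to hold the closeness, period and trend history values to be averaged):

    import numpy as np
    from UCTB.model.HM import HM

    # Hypothetical history features for 100 time slots and 50 nodes:
    # 6 closeness slots, 7 period slots and 4 trend slots.
    closeness = np.random.rand(100, 50, 6, 1)
    period = np.random.rand(100, 50, 7, 1)
    trend = np.random.rand(100, 50, 4, 1)

    model = HM(c=6, p=7, t=4)
    prediction = model.predict(closeness_feature=closeness,
                               period_feature=period,
                               trend_feature=trend)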

5.4.6. UCTB.model.STMeta module

class UCTB.model.STMeta.STMeta(num_node, external_dim, closeness_len, period_len, trend_len, num_graph=1, gcn_k=1, gcn_layers=1, gclstm_layers=1, num_hidden_units=64, num_dense_units=32, graph_merge_gal_units=32, graph_merge_gal_num_heads=2, temporal_merge_gal_units=64, temporal_merge_gal_num_heads=2, st_method='GCLSTM', temporal_merge='gal', graph_merge='gal', output_activation=<function sigmoid>, lr=0.0001, code_version='STMeta-QuickStart', model_dir='model_dir', gpu_device='0', **kwargs)

Bases: UCTB.model_unit.BaseModel.BaseModel

Parameters
  • num_node (int) – Number of nodes in the graph, e.g. number of stations in NYC-Bike dataset.

  • external_dim (int) – Dimension of the external features, e.g., temperature and wind make two dimensions.

  • closeness_len (int) – The length of closeness data history. The former consecutive closeness_len time slots of data will be used as closeness history.

  • period_len (int) – The length of period data history. The data of exact same time slots in former consecutive period_len days will be used as period history.

  • trend_len (int) – The length of trend data history. The data of exact same time slots in former consecutive trend_len weeks (every seven days) will be used as trend history.

  • num_graph (int) – Number of graphs used in STMeta.

  • gcn_k (int) – The highest order of Chebyshev Polynomial approximation in GCN.

  • gcn_layers (int) – Number of GCN layers.

  • gclstm_layers (int) – Number of STRNN layers; it applies to all st_method modes of STMeta, such as GCLSTM and DCRNN.

  • num_hidden_units (int) – Number of hidden units of RNN.

  • num_dense_units (int) – Number of dense units.

  • graph_merge_gal_units (int) – Number of units in GAL for merging different graph features. Only works when graph_merge=’gal’

  • graph_merge_gal_num_heads (int) – Number of heads in GAL for merging different graph features. Only works when graph_merge=’gal’

  • temporal_merge_gal_units (int) – Number of units in GAL for merging different temporal features. Only works when temporal_merge=’gal’

  • temporal_merge_gal_num_heads (int) – Number of heads in GAL for merging different temporal features. Only works when temporal_merge=’gal’

  • st_method (str) – Must be one of [‘GCLSTM’, ‘DCRNN’, ‘GRU’, ‘LSTM’], which refers to different spatial-temporal modeling methods. ‘GCLSTM’: GCN for modeling the spatial feature, LSTM for modeling the temporal feature. ‘DCRNN’: diffusion convolution for modeling the spatial feature, GRU for modeling the temporal feature. ‘GRU’: ignore the spatial feature and model the temporal feature using GRU. ‘LSTM’: ignore the spatial feature and model the temporal feature using LSTM.

  • temporal_merge (str) – Must be one of [‘gal’, ‘concat’], referring to different temporal merging methods: ‘gal’ merges using GAL, ‘concat’ merges by concatenation followed by a dense layer.

  • graph_merge (str) – Must be one of [‘gal’, ‘concat’], referring to different graph merging methods: ‘gal’ merges using GAL, ‘concat’ merges by concatenation followed by a dense layer.

  • output_activation (function) – activation function, e.g. tf.nn.tanh

  • lr (float) – Learning rate. Default: 1e-4

  • code_version (str) – Current version of this model code, which will be used as filename for saving the model

  • model_dir (str) – The directory to store model files. Default: ’model_dir’.

  • gpu_device (str) – To specify the GPU to use. Default: ‘0’.

build(init_vars=True, max_to_keep=5)

Parameters
  • init_vars (bool) – Automatically initialize the parameters if set to True; otherwise no parameters will be initialized.

  • max_to_keep (int) – Maximum number of model files to keep, which equals max_to_keep in tf.train.Saver.
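
A minimal construction sketch (the node count, history lengths and graph number are illustrative assumptions; training and prediction methods inherited from BaseModel are not shown here):

    from UCTB.model.STMeta import STMeta

    # Hypothetical setup: 200 stations, 5-dimensional external features,
    # two graphs merged with GAL, GCLSTM as the spatial-temporal method.
    model = STMeta(num_node=200,
                   external_dim=5,
                   closeness_len=6,
                   period_len=7,
                   trend_len=4,
                   num_graph=2,
                   gcn_k=1,
                   gcn_layers=1,
                   gclstm_layers=1,
                   st_method='GCLSTM',
                   temporal_merge='gal',
                   graph_merge='gal',
                   lr=1e-4,
                   code_version='STMeta-QuickStart',
                   model_dir='model_dir',
                   gpu_device='0')

    model.build(init_vars=True, max_to_keep=5)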

5.4.7. UCTB.model.ST_MGCN module

class UCTB.model.ST_MGCN.ST_MGCN(T, input_dim, num_graph, gcl_k, gcl_l, lstm_units, lstm_layers, lr, external_dim, code_version, model_dir, gpu_device)

Bases: UCTB.model_unit.BaseModel.BaseModel

References

Parameters
  • T (int) – Input sequence length

  • input_dim (int) – Input feature dimension

  • num_graph (int) – Number of graphs used in the model.

  • gcl_k (int) – The highest order of Chebyshev Polynomial approximation in GCN.

  • gcl_l (int) – Number of GCN layers.

  • lstm_units (int) – Number of hidden units of RNN.

  • lstm_layers (int) – Number of LSTM layers.

  • lr (float) – Learning rate

  • external_dim (int) – Dimension of the external features, e.g., temperature and wind make two dimensions.

  • code_version (str) – Current version of this model code, which will be used as filename for saving the model

  • model_dir (str) – The directory to store model files. Default: ’model_dir’.

  • gpu_device (str) – To specify the GPU to use. Default: ‘0’.

build(init_vars=True, max_to_keep=5)

Parameters
  • init_vars (bool) – Automatically initialize the parameters if set to True; otherwise no parameters will be initialized.

  • max_to_keep (int) – Maximum number of model files to keep, which equals max_to_keep in tf.train.Saver.
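
A minimal construction sketch (all arguments are required by the signature; the values below are illustrative assumptions):

    from UCTB.model.ST_MGCN import ST_MGCN

    # Hypothetical setup: 6 input steps, 3 graphs, first-order Chebyshev GCN.
    model = ST_MGCN(T=6,
                    input_dim=1,
                    num_graph=3,
                    gcl_k=1,
                    gcl_l=1,
                    lstm_units=64,
                    lstm_layers=1,
                    lr=1e-4,
                    external_dim=5,
                    code_version='ST_MGCN-QuickStart',
                    model_dir='model_dir',
                    gpu_device='0')

    model.build(init_vars=True, max_to_keep=5)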

5.4.8. UCTB.model.ST_ResNet module

class UCTB.model.ST_ResNet.ST_ResNet(width, height, external_dim, closeness_len, period_len, trend_len, num_residual_unit=4, kernel_size=3, lr=5e-05, model_dir='model_dir', code_version='QuickStart', conv_filters=64, gpu_device='0')

Bases: UCTB.model_unit.BaseModel.BaseModel

ST-ResNet is an end-to-end deep-learning model that exploits the unique properties of spatio-temporal data using convolution and residual units.

References

Parameters
  • width (int) – The width of grid data.

  • height (int) – The height of grid data.

  • external_dim (int) – Number of dimensions of external data.

  • closeness_len (int) – The length of closeness data history. The former consecutive closeness_len time slots of data will be used as closeness history.

  • period_len (int) – The length of period data history. The data of exact same time slots in former consecutive period_len days will be used as period history.

  • trend_len (int) – The length of trend data history. The data of exact same time slots in former consecutive trend_len weeks (every seven days) will be used as trend history.

  • num_residual_unit (int) – Number of residual units. Default: 4

  • kernel_size (int) – Kernel size in Convolutional neural networks. Default: 3

  • lr (float) – Learning rate. Default: 5e-5

  • code_version (str) – Current version of this model code.

  • model_dir (str) – The directory to store model files. Default: ’model_dir’

  • conv_filters (int) – The number of filters in the convolution. Default: 64

  • gpu_device (str) – To specify the GPU to use. Default: ‘0’

build()

Parameters
  • init_vars (bool) – Automatically initialize the parameters if set to True; otherwise no parameters will be initialized.

  • max_to_keep (int) – Maximum number of model files to keep, which equals max_to_keep in tf.train.Saver.
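
A minimal construction sketch (the grid size, history lengths and external dimension are illustrative assumptions):

    from UCTB.model.ST_ResNet import ST_ResNet

    # Hypothetical 20x20 grid with 6/7/4 closeness/period/trend slots.
    model = ST_ResNet(width=20,
                      height=20,
                      external_dim=5,
                      closeness_len=6,
                      period_len=7,
                      trend_len=4,
                      num_residual_unit=4,
                      kernel_size=3,
                      conv_filters=64,
                      lr=5e-5,
                      code_version='QuickStart',
                      model_dir='model_dir',
                      gpu_device='0')

    # Build the computation graph (this build() takes no arguments).
    model.build()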

5.4.9. UCTB.model.XGBoost module

class UCTB.model.XGBoost.XGBoost(n_estimators=10, max_depth=5, verbosity=0, objective='reg:squarederror', eval_metric='rmse')

Bases: object

XGBoost is an optimized distributed gradient boosting machine learning algorithm.

Parameters
  • n_estimators (int) – Number of boosting iterations. Default: 10

  • max_depth (int) – Maximum tree depth for base learners. Default: 5

  • verbosity (int) – The degree of verbosity. Valid values are 0 (silent) to 3 (debug). Default: 0

  • objective (string or callable) – Specify the learning task and the corresponding learning objective, or a custom objective function to be used. Default: 'reg:squarederror'

  • eval_metric (str, list of str, or callable, optional) – If a str, should be a built-in evaluation metric to use. See more in the API Reference of the XGBoost Library. Default: 'rmse'

fit(X, y)

Training method.

Parameters
  • X (np.ndarray/scipy.sparse/pd.DataFrame/dt.Frame) – The training input samples.

  • y (np.ndarray, optional) – The target values of training samples.

predict(X)

Prediction method.

Returns

Predicted values with shape as [time_slot_num, node_num, 1].

Return type

np.ndarray
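
A minimal usage sketch on synthetic data (the flat 2-D feature layout is an assumption about the expected input; the documented prediction shape is [time_slot_num, node_num, 1]):

    import numpy as np
    from UCTB.model.XGBoost import XGBoost

    # Hypothetical training data: 1000 samples with 17 features each and
    # one regression target per sample.
    X_train = np.random.rand(1000, 17)
    y_train = np.random.rand(1000)
    X_test = np.random.rand(200, 17)

    model = XGBoost(n_estimators=10, max_depth=5, objective='reg:squarederror')
    model.fit(X_train, y_train)
    prediction = model.predict(X_test)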