5.4. UCTB.model package¶
5.4.1. UCTB.model.ARIMA module¶
-
class
UCTB.model.ARIMA.
ARIMA
(time_sequence, order=None, seasonal_order=(0, 0, 0, 0), max_ar=6, max_ma=4, max_d=2)¶ Bases:
object
ARIMA is a generalization of an ARMA (Autoregressive Moving Average) model, used in predicting future points in time series analysis.
The history of a series can be split into three kinds: closeness, period and trend. For each node, this class trains one ARIMA model per kind of history and, at prediction time, returns the average of the values predicted by the three models.
- Parameters
time_sequence (array_like) – The observed values of the time series.
order (iterable) – The (p, d, q) order of the model: the number of AR parameters, differences and MA parameters. If set to None, the ARIMA class calculates the order for each series based on max_ar, max_ma and max_d. Default: None
seasonal_order (iterable) – The (P, D, Q, s) order of the seasonal ARIMA model: the AR parameters, differences, MA parameters and periodicity. s is an integer giving the periodicity (number of periods in a season). Default: (0, 0, 0, 0)
max_ar (int) – Maximum number of AR lags to use. Default: 6
max_ma (int) – Maximum number of MA lags to use. Default: 4
max_d (int) – Maximum number of degrees of differencing. Default: 2
- Attributes
order (iterable) – (p, d, q) order of the ARIMA model.
seasonal_order (iterable) – (P, D, Q, s) order of the seasonal ARIMA model.
model_res() – Fit method for likelihood-based models.
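The averaging described for this class can be illustrated with a minimal sketch. The function name `average_forecasts` and the sample values are hypothetical and not part of UCTB; the sketch only shows an element-wise mean over the forecasts of three separately fitted models.

```python
from statistics import mean


def average_forecasts(closeness_pred, period_pred, trend_pred):
    """Return the element-wise mean of three equally long forecast sequences."""
    if not (len(closeness_pred) == len(period_pred) == len(trend_pred)):
        raise ValueError("all forecasts must cover the same number of steps")
    # zip groups the three forecasts for each step; mean averages them.
    return [mean(step) for step in zip(closeness_pred, period_pred, trend_pred)]


# Hypothetical two-step forecasts from the closeness, period and trend models.
combined = average_forecasts([10.0, 12.0], [11.0, 13.0], [12.0, 14.0])
```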
-
static
adf_test
(time_series, max_lags=None, verbose=True)¶ Augmented Dickey–Fuller test. The Augmented Dickey-Fuller test can be used to test for a unit root in a univariate process in the presence of serial correlation.
-
get_order
(series, order=None, max_ar=6, max_ma=2, max_d=2)¶ If order is not None, it simply returns order; otherwise, it calculates the (p, d, q) order for the series data based on max_ar, max_ma and max_d.
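The bounds max_ar, max_ma and max_d define a finite search space of (p, d, q) candidates. The sketch below only enumerates that space; the helper name `candidate_orders` is hypothetical, and how UCTB scores or selects among the candidates is not stated in this section.

```python
from itertools import product


def candidate_orders(max_ar=6, max_ma=2, max_d=2):
    """Enumerate every (p, d, q) with 0 <= p <= max_ar, 0 <= d <= max_d, 0 <= q <= max_ma."""
    return list(product(range(max_ar + 1), range(max_d + 1), range(max_ma + 1)))


orders = candidate_orders()
```

With the defaults shown in the get_order signature, the space contains 7 x 3 x 3 = 63 candidate orders.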
-
predict
(time_sequences, forecast_step=1)¶ - Parameters
time_sequences – The input time series features.
forecast_step (int) – The number of future steps to predict. Default: 1
- Returns
Prediction results with shape (len(time_sequence)/forecast_step, forecast_step, 1).
- Return type
np.ndarray
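To make the returned shape concrete, the following sketch groups a flat list of values into windows of forecast_step entries, each wrapped in a trailing dimension of size 1. The helper `to_forecast_windows` is hypothetical and uses nested lists rather than np.ndarray; it illustrates the layout only, not how predict computes its values.

```python
def to_forecast_windows(flat_predictions, forecast_step=1):
    """Group values into a nested list shaped (len/forecast_step, forecast_step, 1)."""
    if len(flat_predictions) % forecast_step != 0:
        raise ValueError("length must be a multiple of forecast_step")
    return [
        # Each value becomes a one-element list, giving the trailing dimension.
        [[value] for value in flat_predictions[start:start + forecast_step]]
        for start in range(0, len(flat_predictions), forecast_step)
    ]


windows = to_forecast_windows([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], forecast_step=2)
```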
5.4.2. UCTB.model.DCRNN module¶
-
class
UCTB.model.DCRNN.
DCRNN
(num_nodes, num_diffusion_matrix, num_rnn_units=64, num_rnn_layers=1, max_diffusion_step=2, seq_len=6, use_curriculum_learning=False, input_dim=1, output_dim=1, cl_decay_steps=1000, target_len=1, lr=0.0001, epsilon=0.001, optimizer_name='Adam', code_version='DCRNN-QuickStart', model_dir='model_dir', gpu_device='0', **kwargs)¶ Bases:
UCTB.model_unit.BaseModel.BaseModel
References
- Parameters
num_nodes (int) – Number of nodes in the graph, e.g. number of stations in NYC-Bike dataset.
num_diffusion_matrix – Number of diffusion matrices used in the model.
num_rnn_units – Number of RNN units.
num_rnn_layers – Number of RNN layers
max_diffusion_step – Number of diffusion steps
seq_len – Input sequence length
use_curriculum_learning (bool) – Whether, during training, the next input is the model’s own prediction (True) or the previous ground truth (False).
input_dim – Dimension of the input feature
output_dim – Dimension of the output feature
cl_decay_steps – When use_curriculum_learning=True, cl_decay_steps adjusts the ratio of ground-truth labels used during training; the ratio drops as the number of training steps grows.
target_len (int) – Output sequence length.
lr (float) – Learning rate
epsilon – epsilon in Adam
optimizer_name (str) – ‘sgd’ or ‘Adam’ optimizer
code_version (str) – Current version of this model code, which will be used as filename for saving the model
model_dir (str) – The directory to store model files. Default:’model_dir’.
gpu_device (str) – To specify the GPU to use. Default: ‘0’.
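The role of cl_decay_steps can be sketched with an inverse sigmoid decay, a schedule commonly used for this kind of curriculum learning (scheduled sampling). Whether UCTB uses exactly this formula is an assumption, and the function name `ground_truth_ratio` is hypothetical.

```python
import math


def ground_truth_ratio(step, cl_decay_steps=1000):
    """Inverse sigmoid decay: close to 1 at step 0, approaching 0 as step grows."""
    k = float(cl_decay_steps)
    exponent = min(step / k, 700.0)  # cap to keep math.exp from overflowing
    return k / (k + math.exp(exponent))


# Ratio of ground-truth inputs at a few hypothetical training steps.
ratios = [ground_truth_ratio(step) for step in (0, 1000, 5000, 10000)]
```

Under this schedule a larger cl_decay_steps keeps the model on ground-truth inputs for longer before it has to rely on its own predictions.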
-
build
(init_vars=True, max_to_keep=5)¶ - Parameters
init_vars (bool) – Automatically initialize the parameters if set to True; otherwise no parameters are initialized.
max_to_keep – Maximum number of files to keep, equivalent to max_to_keep in tf.train.Saver.
5.4.3. UCTB.model.DeepST module¶
-
class
UCTB.model.DeepST.
DeepST
(closeness_len, period_len, trend_len, width, height, external_dim, kernel_size=3, num_conv_filters=64, lr=1e-05, code_version='QuickStart-DeepST', model_dir='model_dir', gpu_device='0')¶ Bases:
UCTB.model_unit.BaseModel.BaseModel
Deep learning-based prediction model for Spatial-Temporal data (DeepST)
DeepST is composed of three components: 1) temporal dependent instances: describing temporal closeness, period and seasonal trend; 2) convolutional neural networks: capturing near and far spatial dependencies; 3) early and late fusions: fusing similar and different domains’ data.
- Parameters
closeness_len (int) – The length of closeness data history. The previous closeness_len consecutive time slots of data are used as closeness history.
period_len (int) – The length of period data history. The data of the exact same time slots in the previous period_len consecutive days are used as period history.
trend_len (int) – The length of trend data history. The data of the exact same time slots in the previous trend_len consecutive weeks (every seven days) are used as trend history.
width (int) – The width of grid data.
height (int) – The height of grid data.
external_dim (int) – Number of dimensions of external data.
kernel_size (int) – Kernel size in convolutional neural networks. Default: 3
num_conv_filters (int) – Number of filters in the convolution. Default: 64
lr (float) – Learning rate. Default: 1e-5
code_version (str) – Current version of this model code.
model_dir (str) – The directory to store model files. Default:’model_dir’
gpu_device (str) – To specify the GPU to use. Default: ‘0’
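The three history lengths select different past time slots relative to the slot being predicted. The sketch below computes those slot indices; the function name `history_indices` and the `slots_per_day` argument are hypothetical, since this section does not say how UCTB's data loader indexes time slots.

```python
def history_indices(t, closeness_len, period_len, trend_len, slots_per_day):
    """Return the slot indices used as closeness, period and trend history for slot t."""
    # Closeness: the slots immediately before t.
    closeness = [t - i for i in range(closeness_len, 0, -1)]
    # Period: the same slot of day on each of the previous period_len days.
    period = [t - i * slots_per_day for i in range(period_len, 0, -1)]
    # Trend: the same slot of day on each of the previous trend_len weeks.
    trend = [t - i * 7 * slots_per_day for i in range(trend_len, 0, -1)]
    return closeness, period, trend


# Hourly slots (24 per day), predicting slot 400.
closeness, period, trend = history_indices(
    t=400, closeness_len=3, period_len=2, trend_len=1, slots_per_day=24
)
```

The earliest index grows with trend_len, so the target slot needs at least trend_len full weeks of data before it.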
-
build
()¶ - Parameters
init_vars (bool) – Automatically initialize the parameters if set to True; otherwise no parameters are initialized.
max_to_keep – Maximum number of files to keep, equivalent to max_to_keep in tf.train.Saver.
5.4.4. UCTB.model.GeoMAN module¶
-
class
UCTB.model.GeoMAN.
GeoMAN
(total_sensers, input_dim, external_dim, output_dim, input_steps, output_steps, n_stacked_layers=2, n_encoder_hidden_units=128, n_decoder_hidden_units=128, dropout_rate=0.3, lr=0.001, gc_rate=2.5, code_version='GeoMAN-QuickStart', model_dir='model_dir', gpu_device='0', **kwargs)¶ Bases:
UCTB.model_unit.BaseModel.BaseModel
Multi-level Attention Networks for Geo-sensory Time Series Prediction (GeoMAN)
GeoMAN consists of two major parts: 1) A multi-level attention mechanism (including both local and global spatial attentions in encoder and temporal attention in decoder) to model the dynamic spatio-temporal dependencies; 2) A general fusion module to incorporate the external factors from different domains (e.g., meteorology, time of day and land use).
References
- Parameters
total_sensers (int) – The number of total sensors used in global attention mechanism.
input_dim (int) – The number of dimensions of the target sensor’s input.
external_dim (int) – The number of dimensions of the external features.
output_dim (int) – The number of dimensions of the target sensor’s output.
input_steps (int) – The length of historical input data, a.k.a, input timesteps.
output_steps (int) – The number of steps to predict from one piece of history data, a.k.a. output timesteps. Currently must be 1.
n_stacked_layers (int) – The number of LSTM layers stacked in both encoder and decoder (These two are the same). Default: 2
n_encoder_hidden_units (int) – The number of hidden units in each layer of encoder. Default: 128
n_decoder_hidden_units (int) – The number of hidden units in each layer of decoder. Default: 128
dropout_rate (float) – Dropout rate of LSTM layers in both encoder and decoder. Default: 0.3
lr (float) – Learning rate. Default: 0.001
gc_rate (float) – A clipping ratio for all the gradients. This operation normalizes all gradients so that their L2-norms are less than or equal to gc_rate. Default: 2.5
code_version (str) – Current version of this model code. Default: ‘GeoMAN-QuickStart’
model_dir (str) – The directory to store model files. Default: ‘model_dir’
gpu_device (str) – To specify the GPU to use. Default: ‘0’
**kwargs (dict) – Reserved for future use. May be used to pass parameters to class BaseModel.
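The effect of gc_rate can be sketched as clipping by L2-norm. The helper `clip_by_norm` below is hypothetical and works on a single gradient given as a plain list; whether GeoMAN clips each gradient separately or all gradients jointly is not stated here, so per-gradient clipping is an assumption.

```python
import math


def clip_by_norm(gradient, gc_rate=2.5):
    """Rescale gradient so that its L2-norm is at most gc_rate."""
    norm = math.sqrt(sum(g * g for g in gradient))
    if norm <= gc_rate:
        return list(gradient)  # already within the bound, leave unchanged
    scale = gc_rate / norm
    return [g * scale for g in gradient]


# A gradient with L2-norm 5.0 is scaled by 0.5 to reach norm 2.5.
clipped = clip_by_norm([3.0, 4.0], gc_rate=2.5)
```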
-
build
(init_vars=True, max_to_keep=5)¶ - Parameters
init_vars (bool) – Automatically initialize the parameters if set to True; otherwise no parameters are initialized.
max_to_keep – Maximum number of files to keep, equivalent to max_to_keep in tf.train.Saver.
-
UCTB.model.GeoMAN.
input_transform
(local_features, global_features, external_features, targets)¶ Split the model’s inputs from matrices to lists on timesteps axis.
-
UCTB.model.GeoMAN.
split_timesteps
(inputs)¶ Split the input matrix from (batch, timesteps, input_dim) to a step list ([[batch, input_dim], …, ]).
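The reshaping performed by split_timesteps can be pictured with nested lists. The function `split_timesteps_sketch` below is a hypothetical stand-in operating on plain Python lists, not the library function itself.

```python
def split_timesteps_sketch(inputs):
    """Turn a (batch, timesteps, input_dim) nested list into a list of
    timesteps entries, each shaped (batch, input_dim)."""
    timesteps = len(inputs[0])
    return [[sample[t] for sample in inputs] for t in range(timesteps)]


batch = [
    [[1, 2], [3, 4], [5, 6]],     # sample 0: 3 timesteps, input_dim 2
    [[7, 8], [9, 10], [11, 12]],  # sample 1
]
steps = split_timesteps_sketch(batch)
```

Each entry of `steps` holds the features of every sample in the batch at one timestep.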
5.4.5. UCTB.model.HM module¶
5.4.6. UCTB.model.STMeta module¶
-
class
UCTB.model.STMeta.
STMeta
(num_node, external_dim, closeness_len, period_len, trend_len, num_graph=1, gcn_k=1, gcn_layers=1, gclstm_layers=1, num_hidden_units=64, num_dense_units=32, graph_merge_gal_units=32, graph_merge_gal_num_heads=2, temporal_merge_gal_units=64, temporal_merge_gal_num_heads=2, st_method='GCLSTM', temporal_merge='gal', graph_merge='gal', output_activation=<function sigmoid>, lr=0.0001, code_version='STMeta-QuickStart', model_dir='model_dir', gpu_device='0', embedding_flag=True, embedding_dim=[10, 1, 5], classified_embedding=[], decay_param=None, **kwargs)¶ Bases:
UCTB.model_unit.BaseModel.BaseModel
- Parameters
num_node (int) – Number of nodes in the graph, e.g. number of stations in NYC-Bike dataset.
external_dim (int) – Dimension of the external feature, e.g. temperature and wind are two dimensions.
closeness_len (int) – The length of closeness data history. The previous closeness_len consecutive time slots of data are used as closeness history.
period_len (int) – The length of period data history. The data of the exact same time slots in the previous period_len consecutive days are used as period history.
trend_len (int) – The length of trend data history. The data of the exact same time slots in the previous trend_len consecutive weeks (every seven days) are used as trend history.
num_graph (int) – Number of graphs used in STMeta.
gcn_k (int) – The highest order of Chebyshev Polynomial approximation in GCN.
gcn_layers (int) – Number of GCN layers.
gclstm_layers (int) – Number of STRNN layers. It applies to all modes of STMeta, such as GCLSTM and DCRNN.
num_hidden_units (int) – Number of hidden units of RNN.
num_dense_units (int) – Number of dense units.
graph_merge_gal_units (int) – Number of units in GAL for merging different graph features. Only works when graph_merge=’gal’
graph_merge_gal_num_heads (int) – Number of heads in GAL for merging different graph features. Only works when graph_merge=’gal’
temporal_merge_gal_units (int) – Number of units in GAL for merging different temporal features. Only works when temporal_merge=’gal’
temporal_merge_gal_num_heads (int) – Number of heads in GAL for merging different temporal features. Only works when temporal_merge=’gal’
st_method (str) – Must be one of [‘GCLSTM’, ‘DCRNN’, ‘GRU’, ‘LSTM’], which refer to different spatial-temporal modeling methods. ‘GCLSTM’: GCN for modeling the spatial feature, LSTM for modeling the temporal feature. ‘DCRNN’: diffusion convolution for modeling the spatial feature, GRU for modeling the temporal feature. ‘GRU’: ignore the spatial feature and model the temporal feature using GRU. ‘LSTM’: ignore the spatial feature and model the temporal feature using LSTM.
temporal_merge (str) – Must be one of [‘gal’, ‘concat’], which refer to different temporal merging methods. ‘gal’: merge using GAL. ‘concat’: merge by concat and dense.
graph_merge (str) – Must be one of [‘gal’, ‘concat’], which refer to different graph merging methods. ‘gal’: merge using GAL. ‘concat’: merge by concat and dense.
output_activation (function) – activation function, e.g. tf.nn.tanh
lr (float) – Learning rate. Default: 1e-4
code_version (str) – Current version of this model code, which will be used as filename for saving the model
model_dir (str) – The directory to store model files. Default:’model_dir’.
gpu_device (str) – To specify the GPU to use. Default: ‘0’.
decay_param (str) – The file path of the decay function parameters. If set to None, a fixed lr is used. Default: None
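The gcn_k parameter above is the highest order of the Chebyshev polynomials used in the graph convolution. As a minimal sketch, the polynomials follow the recurrence T_0 = I, T_1 = L, T_k = 2 L T_(k-1) - T_(k-2), where L stands for a scaled graph Laplacian. The helpers `matmul` and `chebyshev_basis` and the toy 2 x 2 matrix are hypothetical; how UCTB builds its Laplacian is not shown in this section.

```python
def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def chebyshev_basis(laplacian, k):
    """Return [T_0, ..., T_k] evaluated at the given matrix."""
    n = len(laplacian)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    basis = [identity]
    if k >= 1:
        basis.append([row[:] for row in laplacian])
    for _ in range(2, k + 1):
        prod = matmul(laplacian, basis[-1])
        # T_k = 2 * L * T_(k-1) - T_(k-2)
        basis.append(
            [[2.0 * prod[i][j] - basis[-2][i][j] for j in range(n)] for i in range(n)]
        )
    return basis


scaled_laplacian = [[0.0, 1.0], [1.0, 0.0]]  # toy stand-in matrix
basis = chebyshev_basis(scaled_laplacian, k=2)
```

With gcn_k=1 only T_0 and T_1 are used, so each node mixes information from itself and its direct neighbours.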
-
build
(init_vars=True, max_to_keep=5)¶ - Parameters
init_vars (bool) – Automatically initialize the parameters if set to True; otherwise no parameters are initialized.
max_to_keep – Maximum number of files to keep, equivalent to max_to_keep in tf.train.Saver.
5.4.7. UCTB.model.ST_MGCN module¶
-
class
UCTB.model.ST_MGCN.
ST_MGCN
(T, input_dim, num_graph, gcl_k, gcl_l, lstm_units, lstm_layers, lr, external_dim, code_version, model_dir, gpu_device)¶ Bases:
UCTB.model_unit.BaseModel.BaseModel
References
- Parameters
T (int) – Input sequence length
input_dim (int) – Input feature dimension
num_graph (int) – Number of graphs used in the model.
gcl_k (int) – The highest order of Chebyshev Polynomial approximation in GCN.
gcl_l (int) – Number of GCN layers.
lstm_units (int) – Number of hidden units of RNN.
lstm_layers (int) – Number of LSTM layers.
lr (float) – Learning rate
external_dim (int) – Dimension of the external feature, e.g. temperature and wind are two dimensions.
code_version (str) – Current version of this model code, which will be used as filename for saving the model
model_dir (str) – The directory to store model files. Default:’model_dir’.
gpu_device (str) – To specify the GPU to use. Default: ‘0’.
-
build
(init_vars=True, max_to_keep=5)¶ - Parameters
init_vars (bool) – Automatically initialize the parameters if set to True; otherwise no parameters are initialized.
max_to_keep – Maximum number of files to keep, equivalent to max_to_keep in tf.train.Saver.
5.4.8. UCTB.model.ST_ResNet module¶
-
class
UCTB.model.ST_ResNet.
ST_ResNet
(width, height, external_dim, closeness_len, period_len, trend_len, num_residual_unit=4, kernel_size=3, lr=5e-05, model_dir='model_dir', code_version='QuickStart', conv_filters=64, gpu_device='0')¶ Bases:
UCTB.model_unit.BaseModel.BaseModel
ST-ResNet is a deep-learning model with an end-to-end structure based on unique properties of spatio-temporal data making use of convolution and residual units.
References
- Parameters
width (int) – The width of grid data.
height (int) – The height of grid data.
external_dim (int) – Number of dimensions of external data.
closeness_len (int) – The length of closeness data history. The previous closeness_len consecutive time slots of data are used as closeness history.
period_len (int) – The length of period data history. The data of the exact same time slots in the previous period_len consecutive days are used as period history.
trend_len (int) – The length of trend data history. The data of the exact same time slots in the previous trend_len consecutive weeks (every seven days) are used as trend history.
num_residual_unit (int) – Number of residual units. Default: 4
kernel_size (int) – Kernel size in convolutional neural networks. Default: 3
lr (float) – Learning rate. Default: 5e-5
code_version (str) – Current version of this model code.
model_dir (str) – The directory to store model files. Default: ‘model_dir’
conv_filters (int) – Number of filters in the convolution. Default: 64
gpu_device (str) – To specify the GPU to use. Default: ‘0’
-
build
()¶ - Parameters
init_vars (bool) – Automatically initialize the parameters if set to True; otherwise no parameters are initialized.
max_to_keep – Maximum number of files to keep, equivalent to max_to_keep in tf.train.Saver.
5.4.9. UCTB.model.XGBoost module¶
-
class
UCTB.model.XGBoost.
XGBoost
(n_estimators=10, max_depth=5, verbosity=0, objective='reg:squarederror', eval_metric='rmse')¶ Bases:
object
XGBoost is an optimized distributed gradient boosting machine learning algorithm.
- Parameters
n_estimators (int) – Number of boosting iterations. Default: 10
max_depth (int) – Maximum tree depth for base learners. Default: 5
verbosity (int) – The degree of verbosity. Valid values range from 0 (silent) to 3 (debug). Default: 0
objective (string or callable) – Specify the learning task and the corresponding learning objective or a custom objective function to be used. Default: 'reg:squarederror'
eval_metric (str, list of str, or callable, optional) – If a str, should be a built-in evaluation metric to use. See more in API Reference of XGBoost Library. Default: 'rmse'
-
fit
(X, y)¶ Training method.
- Parameters
X (np.ndarray/scipy.sparse/pd.DataFrame/dt.Frame) – The training input samples.
y (np.ndarray, optional) – The target values of training samples.
-
predict
(X)¶ Prediction method.
- Returns
Predicted values with shape as [time_slot_num, node_num, 1].
- Return type
np.ndarray
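The fit method expects input samples X and target values y. One plausible way to derive them from a single node's series is a sliding window, sketched below. The helper `make_supervised` is hypothetical, uses plain lists rather than np.ndarray, and is not UCTB's data loader; how the library itself prepares features is not covered in this section.

```python
def make_supervised(series, window):
    """Build (X, y) pairs: each row of X holds `window` past values,
    and the matching entry of y is the value that follows them."""
    count = len(series) - window
    X = [series[i:i + window] for i in range(count)]
    y = [series[i + window] for i in range(count)]
    return X, y


X, y = make_supervised([1, 2, 3, 4, 5, 6], window=3)
```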