5.2. UCTB.preprocess package

5.2.1. UCTB.preprocess.preprocessor module

class UCTB.preprocess.preprocessor.Normalizer(X)

Bases: object

This class normalizes and denormalizes data through its min_max_normal and min_max_denormal methods.

min_max_denormal(X)

Takes X and returns the denormalized result.

Type

numpy.ndarray

min_max_normal(X)

Takes X and returns the normalized result.

Type

numpy.ndarray
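The round trip between the two methods can be illustrated with a minimal sketch. MinMaxSketch is a hypothetical stand-in written for this example, not the UCTB source. It assumes that the minimum and maximum are taken from the array passed to the constructor and that values are scaled to the range [0, 1].

```python
import numpy as np


class MinMaxSketch:
    """Hypothetical stand-in for Normalizer; not the UCTB implementation."""

    def __init__(self, X):
        # Assumption: the scaling range comes from the data given at construction.
        self._min = np.min(X)
        self._max = np.max(X)

    def min_max_normal(self, X):
        # Map the interval [min, max] onto [0, 1].
        return (np.asarray(X) - self._min) / (self._max - self._min)

    def min_max_denormal(self, X):
        # Invert the mapping above to recover the original scale.
        return np.asarray(X) * (self._max - self._min) + self._min


raw = np.array([10.0, 20.0, 30.0, 50.0])
scaler = MinMaxSketch(raw)
scaled = scaler.min_max_normal(raw)
restored = scaler.min_max_denormal(scaled)
```

Under these assumptions, denormalizing a model's predictions with the same instance returns them to the scale of the original data.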

class UCTB.preprocess.preprocessor.ST_MoveSample(closeness_len, period_len, trend_len, target_length=1, daily_slots=24)

Bases: object

This class converts raw data into temporal features: closeness, period, and trend.

Parameters
  • closeness_len (int) – The length of closeness data history. The closeness_len consecutive time slots immediately preceding the target are used as closeness history.

  • period_len (int) – The length of period data history. Data from the same time slot on each of the preceding period_len consecutive days is used as period history.

  • trend_len (int) – The length of trend data history. Data from the same time slot in each of the preceding trend_len consecutive weeks (every seven days) is used as trend history.

  • target_length (int) – The number of steps to predict from one piece of history data. Must be 1 for now. Default: 1.

  • daily_slots (int) – The number of records in one day, calculated as 24 * 60 / time_fitness. Default: 24.

move_sample(data)

Input data to generate closeness, period, trend features and target vector y.

Parameters

data (ndarray) – Original temporal data.

Returns

The closeness, period, trend, and y matrices.

Type

numpy.ndarray
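The windowing described by the three length parameters can be sketched as follows. move_sample_sketch is a hypothetical function written for this example, not the UCTB implementation. It assumes a single-step target, most-recent-first ordering inside each window, and that a sample is only produced once enough history exists for all three feature types.

```python
import numpy as np


def move_sample_sketch(data, closeness_len, period_len, trend_len, daily_slots=24):
    """Hypothetical illustration of closeness/period/trend windowing."""
    data = np.asarray(data)
    weekly_slots = 7 * daily_slots
    # First target index with enough history for every feature type.
    start = max(closeness_len, period_len * daily_slots, trend_len * weekly_slots)
    closeness, period, trend, y = [], [], [], []
    for t in range(start, len(data)):
        # The closeness_len slots immediately before t.
        closeness.append([data[t - i] for i in range(1, closeness_len + 1)])
        # The same slot on each of the previous period_len days.
        period.append([data[t - i * daily_slots] for i in range(1, period_len + 1)])
        # The same slot in each of the previous trend_len weeks.
        trend.append([data[t - i * weekly_slots] for i in range(1, trend_len + 1)])
        y.append(data[t])
    return np.array(closeness), np.array(period), np.array(trend), np.array(y)


# Toy series with 4 slots per day so the windows are easy to read.
series = np.arange(40)
c, p, tr, target = move_sample_sketch(series, closeness_len=3, period_len=2,
                                      trend_len=1, daily_slots=4)
```

In this toy run the first target is index 28, because one week of history (7 * 4 slots) is needed before a trend feature exists.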

class UCTB.preprocess.preprocessor.SplitData

Bases: object

This class splits data through its split_data and split_feed_dict methods.

static split_data(data, ratio_list)

Divide the data based on the given parameter ratio_list.

Parameters
  • data (ndarray) – Data to be split.

  • ratio_list (list) – Split ratios; the data is split according to these ratios.

Returns

A list whose elements are the divided data; the list has the same length as ratio_list.

Type

list
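A ratio-based split of this kind can be sketched as below. split_data_sketch is a hypothetical function, not the UCTB source. It assumes that the ratios are normalized to sum to 1 and that the pieces are contiguous slices in the original order, with no shuffling.

```python
import numpy as np


def split_data_sketch(data, ratio_list):
    """Hypothetical illustration of a contiguous ratio-based split."""
    ratios = np.asarray(ratio_list, dtype=float)
    # Assumption: ratios that do not sum to 1 are rescaled.
    ratios = ratios / ratios.sum()
    # Cumulative end index of each piece, rounded to whole samples.
    bounds = np.round(np.cumsum(ratios) * len(data)).astype(int)
    starts = [0] + list(bounds[:-1])
    return [data[s:e] for s, e in zip(starts, bounds)]


data = np.arange(10)
train, val, test = split_data_sketch(data, [0.6, 0.2, 0.2])
```

Keeping the slices in order matters for time series, where the test portion should come after the training portion.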

static split_feed_dict(feed_dict, sequence_length, ratio_list)

Divide the value data in feed_dict based on the given parameter ratio_list.

Parameters
  • feed_dict (dict) – It is a dictionary composed of key-value pairs.

  • sequence_length (int) – If the length of a value in feed_dict equals sequence_length, this method divides that value according to the ratio without changing its key.

  • ratio_list (list) – Split ratios; the data is split according to these ratios.

Returns

A list whose elements are the divided dictionaries; the list has the same length as ratio_list.

Type

list
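The dictionary variant can be sketched in the same way. split_feed_dict_sketch and the key names used here are hypothetical, not taken from UCTB. The sketch assumes that values whose length differs from sequence_length are copied unchanged into every resulting dictionary, which the description above does not state explicitly.

```python
import numpy as np


def split_feed_dict_sketch(feed_dict, sequence_length, ratio_list):
    """Hypothetical illustration of splitting the sequence-like values of a dict."""
    ratios = np.asarray(ratio_list, dtype=float)
    ratios = ratios / ratios.sum()
    bounds = np.round(np.cumsum(ratios) * sequence_length).astype(int)
    starts = [0] + list(bounds[:-1])
    return [
        # Split values of matching length; pass everything else through (assumption).
        {key: value[s:e] if len(value) == sequence_length else value
         for key, value in feed_dict.items()}
        for s, e in zip(starts, bounds)
    ]


feed = {
    "closeness": np.zeros((10, 3)),   # one row per sample, so it is split
    "target": np.arange(10),          # one entry per sample, so it is split
    "laplace_matrix": np.eye(4),      # length 4, not per-sample, so it is kept whole
}
train_dict, test_dict = split_feed_dict_sketch(feed, sequence_length=10,
                                               ratio_list=[0.8, 0.2])
```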

5.2.2. UCTB.preprocess.time_utils module

UCTB.preprocess.time_utils.is_valid_date(date_str)
Parameters

date_str (string) – e.g. 2019-01-01

Returns

True if date_str is a valid date, otherwise False.
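A check of this kind can be sketched with the standard library. is_valid_date_sketch is a hypothetical function, not the UCTB source, and the "%Y-%m-%d" format is an assumption inferred from the example value 2019-01-01.

```python
from datetime import datetime


def is_valid_date_sketch(date_str, fmt="%Y-%m-%d"):
    """Hypothetical illustration: a string is valid if it parses as a calendar date."""
    try:
        datetime.strptime(date_str, fmt)
        return True
    except ValueError:
        # Raised for malformed strings and for impossible dates such as February 30.
        return False


valid = is_valid_date_sketch("2019-01-01")
invalid = is_valid_date_sketch("2019-02-30")
```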

UCTB.preprocess.time_utils.is_work_day_america(date)
Parameters

date (string or datetime) – e.g. 2019-01-01

Returns

True if date is not a holiday in America, otherwise False.

UCTB.preprocess.time_utils.is_work_day_china(date)
Parameters

date (string or datetime) – e.g. 2019-01-01

Returns

True if date is not a holiday in China, otherwise False.
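How a work-day check might accept both a string and a datetime is sketched below. is_work_day_sketch and EXAMPLE_HOLIDAYS are hypothetical names for this illustration; they do not reproduce the calendars UCTB uses for America or China. Treating weekends as non-working days is also an assumption, since the descriptions above mention only holidays.

```python
from datetime import date, datetime

# Hypothetical holiday set for illustration only; not an official calendar.
EXAMPLE_HOLIDAYS = {date(2019, 1, 1)}


def is_work_day_sketch(day, holidays=EXAMPLE_HOLIDAYS):
    """Hypothetical illustration of a work-day check for a str or datetime input."""
    # Accept either a "YYYY-MM-DD" string or a datetime object.
    if isinstance(day, str):
        day = datetime.strptime(day, "%Y-%m-%d")
    d = day.date()
    # Assumption: weekends count as non-working days alongside listed holidays.
    return d.weekday() < 5 and d not in holidays


new_year = is_work_day_sketch("2019-01-01")            # listed holiday
midweek = is_work_day_sketch(datetime(2019, 1, 2))     # ordinary Wednesday
saturday = is_work_day_sketch("2019-01-05")            # weekend
```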