5.6. UCTB.train package

5.6.1. UCTB.train.EarlyStopping module

class UCTB.train.EarlyStopping.EarlyStopping(patience)

Bases: object

Stop training early when none of the most recent records is better than the current best record.

Parameters

patience (int) – The number of most recent records to check.

__record_list

List of records.

Type

list

__best

The current best record.

Type

float

__patience

The number of most recent records to check.

Type

int

__p

The number of newest records that are worse than the current best record.

Type

int

stop(new_value)

Append the new record to the record list and check whether the number of new records that are worse than the best record exceeds the limit.

Parameters

new_value (float) – The new record generated by the newest model.

Returns

True if the number of new records that are worse than the best record exceeds the limit, which triggers early stop; otherwise False.

Return type

bool
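The patience rule described above can be sketched in a few lines. This is a minimal illustration, not the UCTB implementation: the class name PatienceStopper and its attribute names are hypothetical, lower values are assumed to be better (as with a validation loss), and "exceeds the limit" is read as strictly more than patience non-improving records in a row.

```python
class PatienceStopper:
    """Hypothetical stand-in for illustration; not the UCTB class."""

    def __init__(self, patience):
        self.patience = patience
        self.records = []      # every value passed to stop()
        self.best = None       # best (lowest) value seen so far
        self.worse_count = 0   # records since the last improvement

    def stop(self, new_value):
        self.records.append(new_value)
        # Assumption: a lower value is a better record.
        if self.best is None or new_value < self.best:
            self.best = new_value
            self.worse_count = 0
            return False
        self.worse_count += 1
        # Assumption: stop once more than `patience` records failed to improve.
        return self.worse_count > self.patience


stopper = PatienceStopper(patience=2)
losses = [0.9, 0.7, 0.8, 0.75, 0.71, 0.6]
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.stop(loss):
        stopped_at = epoch
        break
```

With these values the loop breaks at the third record in a row that fails to beat 0.7, so the final 0.6 is never seen. That is the usual trade-off of a small patience value.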

class UCTB.train.EarlyStopping.EarlyStoppingTTest(length, p_value_threshold)

Bases: object

Early Stop by t-test.

The t-test is a two-sided test of the null hypothesis that two independent samples have identical average (expected) values. This method takes two intervals of size length from the record list and tests whether they have identical average values. If so, it stops early.

Parameters
  • length (int) – The length of each checked interval.

  • p_value_threshold (float) – The p-value threshold used to decide whether to stop early.

__record_list

List of records.

Type

list

__best

The current best record.

Type

float

__test_length

The length of each checked interval.

Type

int

__p_value_threshold

The p-value threshold used to decide whether to stop early.

Type

float

stop(new_value)

Take two intervals from the record list and run a t-test on them.

Parameters

new_value (float) – The new record generated by the newest model.

Returns

True if the p-value of the t-test is smaller than the threshold, which triggers early stop; otherwise False.

Return type

bool
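The comparison of two intervals can be sketched with scipy.stats.ttest_ind, whose two-sided null hypothesis matches the description above. This is a hedged illustration only: the helper name window_p_value is hypothetical, and the choice of intervals (the last length records against the length records before them) is an assumption that the text does not confirm. How the resulting p-value is compared with p_value_threshold is left to the class itself.

```python
from scipy.stats import ttest_ind


def window_p_value(records, length):
    """Two-sided p-value for the two most recent windows, or None.

    Hypothetical helper for illustration; not part of UCTB.
    """
    if len(records) < 2 * length:
        return None  # not enough records to form two windows yet
    recent = records[-length:]
    previous = records[-2 * length:-length]
    return ttest_ind(recent, previous).pvalue


flat = [1.0, 1.1, 0.9, 1.0, 1.1, 0.9]       # both windows look the same
dropping = [5.0, 5.1, 4.9, 1.0, 1.1, 0.9]   # the recent window is much lower
p_flat = window_p_value(flat, 3)
p_drop = window_p_value(dropping, 3)
```

A large p-value (as for flat) means the test finds no evidence that the two windows differ in mean, while a small one (as for dropping) means the records are still changing.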

5.6.2. UCTB.train.MiniBatchTrain module

class UCTB.train.MiniBatchTrain.MiniBatchFeedDict(feed_dict, sequence_length, batch_size, shuffle=True)

Bases: object

Get small batches of data from a dict for training, one batch at a time.

Parameters
  • feed_dict (dict) – Data dictionary consisting of key-value pairs.

  • sequence_length (int) – Only values in feed_dict whose length equals sequence_length are divided into batches.

  • batch_size (int) – The number of samples in each batch.

  • shuffle (bool) – If True, the input dict is shuffled. Default: True.

get_batch()

For each value in feed_dict whose length is equal to sequence_length, divide the value into several batches and return one batch at a time, in order. Values whose length is not equal to sequence_length are returned unchanged. An internal counter records how many batches have been generated so far. When the remaining data is not enough to fill a batch, a batch taken from the tail of the data is returned.

restart()

Reset the batch counter to 0 so that get_batch generates training batches from the beginning again.

static shuffle(data)
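The rule that only values of length sequence_length are split can be sketched as follows. This is an illustrative stand-in, not the UCTB implementation: the generator dict_batches is a hypothetical helper that replaces the get_batch counter with a Python generator, uses plain lists instead of arrays, and leaves out shuffling.

```python
def dict_batches(feed_dict, sequence_length, batch_size):
    """Yield one batched copy of feed_dict at a time (hypothetical helper)."""
    start = 0
    while start < sequence_length:
        end = start + batch_size
        if end > sequence_length:
            # Not enough data left: take the last batch_size items instead.
            start, end = max(sequence_length - batch_size, 0), sequence_length
        batch = {}
        for key, value in feed_dict.items():
            if hasattr(value, "__len__") and len(value) == sequence_length:
                batch[key] = value[start:end]   # split along the sample axis
            else:
                batch[key] = value              # passed through unchanged
        yield batch
        start = end


feed = {
    "x": [0, 1, 2, 3, 4],
    "y": [10, 11, 12, 13, 14],
    "lr": 0.01,                  # scalar, never split
    "adj": [[0, 1], [1, 0]],     # length 2, not equal to sequence_length
}
batches = list(dict_batches(feed, sequence_length=5, batch_size=2))
```

Here the third batch overlaps the second (items 3 and 4 after items 2 and 3), which is what returning a batch from the tail implies when the data size is not a multiple of batch_size.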

class UCTB.train.MiniBatchTrain.MiniBatchTrain(X, Y, batch_size)

Bases: object

Get small batches of data for training, one batch at a time.

Parameters
  • X (ndarray) – Input features. The first dimension of X should be sample size.

  • Y (ndarray) – Target values. The first dimension of Y should be sample size.

  • batch_size (int) – The number of samples in each batch.

get_batch()

Return one batch of (X, Y) pairs per call. An internal counter records how many batches have been generated so far. When the remaining data is not enough to fill a batch, a batch taken from the tail of the data is returned.

restart()

Reset the batch counter to 0 so that get_batch generates training batches from the beginning again.

static shuffle(X, Y)

Shuffle the input (X, Y) pairs and return them.
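The interplay of get_batch, restart, and paired shuffling can be sketched with a small stand-in class. Everything here is an assumption made for illustration: the name PairBatcher, the seed argument, the use of plain lists rather than ndarrays, and the choice to shuffle once in the constructor are not taken from UCTB.

```python
import random


class PairBatcher:
    """Hypothetical stand-in for illustration; not the UCTB implementation."""

    def __init__(self, X, Y, batch_size, seed=0):
        assert len(X) == len(Y) and len(X) > 0
        self.X, self.Y = self.shuffle(X, Y, seed)
        self.batch_size = min(batch_size, len(X))
        self.position = 0  # index of the next unread sample

    @staticmethod
    def shuffle(X, Y, seed=0):
        # Shuffle the pairs together so that each x keeps its y.
        pairs = list(zip(X, Y))
        random.Random(seed).shuffle(pairs)
        xs, ys = zip(*pairs)
        return list(xs), list(ys)

    def get_batch(self):
        start, end = self.position, self.position + self.batch_size
        if end > len(self.X):
            # Tail case: return the last batch_size samples.
            start, end = len(self.X) - self.batch_size, len(self.X)
        self.position = end
        return self.X[start:end], self.Y[start:end]

    def restart(self):
        self.position = 0  # the next get_batch starts from the first sample


X = [0, 1, 2, 3, 4]
Y = [x * 10 for x in X]
batcher = PairBatcher(X, Y, batch_size=2)
epoch = [batcher.get_batch() for _ in range(3)]  # ceil(5 / 2) batches
batcher.restart()
first_again = batcher.get_batch()
```

Calling restart at the end of each epoch replays the same order in this sketch. Reshuffling per epoch would need an extra call to shuffle before restarting.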

class UCTB.train.MiniBatchTrain.MiniBatchTrainMultiData(data, batch_size, shuffle=True)

Bases: object

Get small batches of data for training, one batch at a time.

Parameters
  • data (ndarray) – Input data. Its first dimension should be sample size.

  • batch_size (int) – The number of samples in each batch.

  • shuffle (bool) – If True, the input data is shuffled. Default: True.

get_batch()

Return one batch of data per call. An internal counter records how many batches have been generated so far. When the remaining data is not enough to fill a batch, a batch taken from the tail of the data is returned.

restart()

Reset the batch counter to 0 so that get_batch generates training batches from the beginning again.

static shuffle(data)