theanets.layers.recurrent.Clockwork

class theanets.layers.recurrent.Clockwork(periods, **kwargs)

A Clockwork RNN layer updates “modules” of neurons at specific rates.

In a vanilla RNN layer, all neurons in the hidden pool are updated at every time step by mixing an affine transformation of the input with an affine transformation of the state of the hidden pool neurons at the previous time step:

\[h_t = g(x_tW_{xh} + h_{t-1}W_{hh} + b_h)\]

In a Clockwork RNN layer, neurons in the hidden pool are split into \(M\) “modules” of equal size (\(h^i\) for \(i = 1, \dots, M\)), each of which has an associated clock period (a positive integer \(T_i\) for \(i = 1, \dots, M\)). The neurons in module \(i\) are updated only when the time index \(t\) of the input \(x_t\) is a multiple of \(T_i\). Thus some of the modules (those with large \(T\)) respond only to “slow” features in the input, while others (those with small \(T\)) respond to “fast” features.
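The update schedule is easy to see with a tiny helper. This is a hypothetical illustration (not part of theanets): it lists which modules fire at each time step for a given set of periods.

```python
# Hypothetical helper (not part of theanets): list which clockwork
# modules are updated at a given time step for a given set of periods.
def active_modules(periods, t):
    """Return indices of modules whose period divides time step t."""
    return [i for i, T in enumerate(periods) if t % T == 0]

# With periods (1, 2, 4, 8): module 0 updates at every step, while
# module 3 updates only on every eighth step.
for t in range(1, 9):
    print(t, active_modules((1, 2, 4, 8), t))
```

At \(t = 8\) every module updates, while at \(t = 6\) only the period-1 and period-2 modules do.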

Furthermore, “fast” modules with small periods receive inputs from “slow” modules with large periods, but not vice-versa: this allows the “slow” features to influence the “fast” features, but not the other way around.

The state \(h_t^i\) of module \(i\) at time step \(t\) is thus governed by the following mathematical relation:

\[h_t^i = \begin{cases} g\left( x_t W_{xh}^i + b_h^i + \sum_{j=1}^i h_{t-1}^j W_{hh}^j \right) & \text{if } t \bmod T_i = 0 \\ h_{t-1}^i & \text{otherwise.} \end{cases}\]

Here, the modules have been ordered such that \(T_j > T_i\) for \(j < i\).

In theanets, this update relation is implemented using a nested loop. The outer loop calls Theano’s scan() operator to iterate over the input data, one time step at a time. The inner loop iterates over the modules, updating each module whose period divides the current time step, and copying over the previous value of each module whose period does not.
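A single step of this update rule can be sketched in plain NumPy. This is an illustration of the math above under stated assumptions, not the actual Theano implementation: it assumes a single input vector per step, modules ordered from slowest (largest period) to fastest as in the text, and the weights packed as ordinary dense matrices.

```python
import numpy as np

def clockwork_step(x, h_prev, t, periods, W_xh, W_hh, b, g=np.tanh):
    """One clockwork update (NumPy sketch, not the theanets internals).

    Modules are assumed ordered slowest-first, so module i reads the
    previous state of modules 1..i (itself and all slower modules).
    """
    n = h_prev.size // len(periods)  # units per module
    h = h_prev.copy()
    for i, T in enumerate(periods):
        if t % T != 0:
            continue                 # inactive module keeps its old state
        lo, hi = i * n, (i + 1) * n
        pre = (x @ W_xh[:, lo:hi] + b[lo:hi]
               + h_prev[:hi] @ W_hh[:hi, lo:hi])
        h[lo:hi] = g(pre)
    return h
```

With zero weights, an active module settles to \(g(0) = 0\) while an inactive module simply carries its previous value forward, which makes the copy-versus-update behavior easy to check.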

Parameters:

periods : sequence of int

The periods for the modules in this clockwork layer. The number of values in this sequence specifies the number of modules in the layer. The layer size must be an integer multiple of the number of modules given in this sequence.
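The size constraint described above can be sketched as a small validation helper. This is a hypothetical check mirroring the documented requirement, not code from theanets itself.

```python
# Hypothetical sanity check (not theanets code): the layer size must be
# an integer multiple of the number of periods, since the hidden pool
# is split into equal-size modules.
def check_clockwork_shape(size, periods):
    if size % len(periods) != 0:
        raise ValueError(
            'layer size %d is not a multiple of %d modules'
            % (size, len(periods)))
    return size // len(periods)  # units per module

print(check_clockwork_shape(12, (1, 2, 4, 8)))  # 3 units per module
```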

References

[R1] J. Koutník, K. Greff, F. Gomez, & J. Schmidhuber. (2014) “A Clockwork RNN.” http://arxiv.org/abs/1402.3511

Methods

__init__(periods, **kwargs)
add_weights(name, nin, nout[, mean, std, ...]) Helper method to create a new weight matrix.
initial_state(name, batch_size) Return an array suitable for representing the initial state of this layer.
log() Log some information about this layer.
setup()
to_spec() Create a specification dictionary for this layer.
transform(inputs) Transform inputs to this layer into outputs for the layer.

Attributes

input_size For networks with one input, get the input size.
num_params Total number of learnable parameters in this layer.
params A list of all parameters in this layer.
log()

Log some information about this layer.

to_spec()

Create a specification dictionary for this layer.

Returns:

spec : dict

A dictionary specifying the configuration of this layer.

transform(inputs)

Transform inputs to this layer into outputs for the layer.

Parameters:

inputs : dict of theano expressions

Symbolic inputs to this layer, given as a dictionary mapping string names to Theano expressions. See base.Layer.connect().

Returns:

outputs : dict of theano expressions

A map from string output names to Theano expressions for the outputs from this layer. This layer type generates a “pre” output that gives the unit activity before applying the layer’s activation function, and a “hid” output that gives the post-activation values.

updates : sequence of update pairs

A sequence of updates to apply to this layer’s state inside a theano function.
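The relationship between the “pre” and “hid” outputs described above can be illustrated numerically. This is a NumPy sketch assuming a tanh activation; the real outputs are symbolic Theano expressions.

```python
import numpy as np

# 'hid' is just the layer's activation function applied to 'pre'.
# (NumPy illustration, assuming tanh as the activation.)
pre = np.array([-1.0, 0.0, 2.0])  # pre-activation ("pre" output)
hid = np.tanh(pre)                # post-activation ("hid" output)
```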