theanets.layers.recurrent.ARRNN

class theanets.layers.recurrent.ARRNN(**kwargs)

An adaptive-rate RNN defines per-hidden-unit accumulation rates.

In a normal RNN, a hidden unit is updated completely at each time step, \(h_t = f(x_t, h_{t-1})\). With an explicit update rate, the state of a hidden unit is computed as a mixture of the new and old values, \(h_t = \alpha_t h_{t-1} + (1 - \alpha_t) f(x_t, h_{t-1})\).
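For concreteness, a minimal numpy sketch of this mixture update follows; the weight names (W_xh, W_hh, b_h) and the tanh activation are illustrative assumptions, not the layer's exact internals:

    import numpy as np

    def mixed_update(x_t, h_prev, alpha, W_xh, W_hh, b_h):
        # f(x_t, h_{t-1}): the ordinary recurrent update, here with tanh
        new = np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)
        # Blend old and new state; alpha = 0 everywhere recovers the vanilla RNN
        return alpha * h_prev + (1 - alpha) * new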

Rates might be defined in a number of ways, spanning a continuum from vanilla RNNs (i.e., all rate parameters fixed at 0, so the old state is discarded completely at each update) all the way to parametric rates that are computed as a function of the inputs and the hidden state at each time step (i.e., something more like the gated recurrent unit).

In the ARRNN model, the rate values are computed at each time step as a logistic sigmoid applied to an affine transform of the input: \(\alpha_t = 1 / (1 + e^{-x_t W_{xr} - b_r})\). This representation uses more parameters than the LRRNN but is able to adapt the rates to the input at each time step. However, in this model, the rates cannot adapt to the state of the hidden units at each time step.
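Continuing the sketch above, the full input-dependent recurrence can be illustrated in numpy as follows (again, the parameter names and the tanh activation are assumptions for illustration only):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def arrnn_forward(xs, h0, W_xh, W_hh, b_h, W_xr, b_r):
        # xs: (time, features) input sequence; h0: initial hidden state.
        # Note that the rate alpha_t depends only on x_t, never on h_{t-1}.
        h, states = h0, []
        for x_t in xs:
            alpha = sigmoid(x_t @ W_xr + b_r)           # per-unit rates in (0, 1)
            new = np.tanh(x_t @ W_xh + h @ W_hh + b_h)  # candidate hidden state
            h = alpha * h + (1 - alpha) * new           # mixture of old and new
            states.append(h)
        return np.array(states)

    # Example: 5 time steps, 3 inputs, 4 hidden units.
    rng = np.random.default_rng(0)
    states = arrnn_forward(rng.normal(size=(5, 3)), np.zeros(4),
                           rng.normal(size=(3, 4)), rng.normal(size=(4, 4)),
                           np.zeros(4), rng.normal(size=(3, 4)), np.zeros(4))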

__init__(**kwargs)

Methods

__init__(**kwargs)
add_weights(name, nin, nout[, mean, std, ...]) Helper method to create a new weight matrix.
initial_state(name, batch_size) Return an array suitable for representing initial state.
setup() Set up the parameters and initial values for this layer.
transform(inputs) Transform the inputs for this layer into an output for the layer.

Attributes

input_size For networks with one input, get the input size.
num_params Total number of learnable parameters in this layer.
params A list of all parameters in this layer.

setup()

Set up the parameters and initial values for this layer.

transform(inputs)

Transform the inputs for this layer into an output for the layer.

Parameters:

inputs : dict of theano expressions

Symbolic inputs to this layer, given as a dictionary mapping string names to Theano expressions. See base.Layer.connect().

Returns:

outputs : dict of theano expressions

A map from string output names to Theano expressions for the outputs from this layer. This layer type generates a “pre” output that gives the unit activity before applying the layer’s activation function, a “hid” output that gives the rate-independent, post-activation hidden state, a “rate” output that gives the rate value for each hidden unit, and an “out” output that gives the hidden output; see the sketch after this list for a concrete analogue of these four quantities.

updates : list of update pairs

A sequence of updates to apply inside a theano function.
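
As referenced in the Returns list above, a rough numpy analogue of the four per-step quantities follows; the actual implementation builds symbolic Theano expressions over whole sequences, and the parameter names below are illustrative assumptions:

    import numpy as np

    def arrnn_step(x_t, h_prev, W_xh, W_hh, b_h, W_xr, b_r):
        sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
        pre = x_t @ W_xh + h_prev @ W_hh + b_h   # "pre": activity before activation
        hid = np.tanh(pre)                       # "hid": rate-independent hidden state
        rate = sigmoid(x_t @ W_xr + b_r)         # "rate": per-unit alpha_t
        out = rate * h_prev + (1 - rate) * hid   # "out": the layer's hidden output
        return {'pre': pre, 'hid': hid, 'rate': rate, 'out': out}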