theanets.layers.recurrent.GRU

class theanets.layers.recurrent.GRU(h_0=None, **kwargs)[source]

Gated Recurrent Unit layer.
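
For illustration, here is a minimal usage sketch (the sizes and the layer name are arbitrary; this assumes the usual theanets convention of specifying a layer as a dict with a "form" key):

    import theanets

    # A recurrent regressor: 8 inputs, one 16-unit GRU layer, 2 outputs.
    # form='gru' selects this layer class; name= is optional.
    net = theanets.recurrent.Regressor(
        layers=[8, dict(form='gru', size=16, name='gru'), 2])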

Notes

The Gated Recurrent Unit lies somewhere between the LSTM and the RRNN in complexity. Like the RRNN, its hidden state is updated at each time step to be a linear interpolation between the previous hidden state, \(h_{t-1}\), and the “target” hidden state, \(\hat{h}_t\). The interpolation is modulated by an “update gate” that serves the same purpose as the rate gates in the RRNN. Like the LSTM, the target hidden state can also be reset using a dedicated gate. All gates in this layer are activated based on both the current input and the previous hidden state.

The update equations in this layer are largely those given by [Chu14], page 4, except for the addition of a hidden bias term. They are:

\[\begin{split}\begin{eqnarray}
r_t &=& \sigma(x_t W_{xr} + h_{t-1} W_{hr} + b_r) \\
z_t &=& \sigma(x_t W_{xz} + h_{t-1} W_{hz} + b_z) \\
\hat{h}_t &=& g\left(x_t W_{xh} + (r_t \odot h_{t-1}) W_{hh} + b_h\right) \\
h_t &=& (1 - z_t) \odot h_{t-1} + z_t \odot \hat{h}_t.
\end{eqnarray}\end{split}\]

Here, \(g(\cdot)\) is the activation function for the layer, and \(\sigma(\cdot)\) is the logistic sigmoid, which ensures that the two gates in the layer are limited to the open interval (0, 1). The symbol \(\odot\) indicates elementwise multiplication.
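
To make the update concrete, here is a single time step in NumPy (a minimal sketch, not the library's implementation; the parameter names mirror the equations above, and tanh stands in for the activation \(g\)):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def gru_step(x_t, h_prev, W_xr, W_hr, b_r, W_xz, W_hz, b_z,
                 W_xh, W_hh, b_h, g=np.tanh):
        r_t = sigmoid(x_t @ W_xr + h_prev @ W_hr + b_r)      # reset gate
        z_t = sigmoid(x_t @ W_xz + h_prev @ W_hz + b_z)      # update ("rate") gate
        h_hat = g(x_t @ W_xh + (r_t * h_prev) @ W_hh + b_h)  # target state
        return (1 - z_t) * h_prev + z_t * h_hat              # interpolate

    # One step with 8 inputs and 16 hidden units:
    n_in, n = 8, 16
    rng = np.random.default_rng(0)
    shapes = [(n_in, n), (n, n), n] * 3
    params = [0.1 * rng.standard_normal(s) for s in shapes]
    h = gru_step(rng.standard_normal(n_in), np.zeros(n), *params)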

Parameters

  • hh — matrix connecting hiddens to hiddens
  • hr — matrix connecting hiddens to reset gates
  • hz — matrix connecting hiddens to rate gates
  • w — matrix connecting inputs to [hidden, reset, rate] units
  • b — vector of bias values for [hidden, reset, rate] units
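
Assuming n_in inputs and n hidden units, and assuming the blocks are packed in the [hidden, reset, rate] order listed above, the parameter shapes work out as in this NumPy sketch:

    import numpy as np

    n_in, n = 8, 16
    w = np.zeros((n_in, 3 * n))  # inputs -> [hidden, reset, rate] blocks
    b = np.zeros(3 * n)          # biases for [hidden, reset, rate] units
    hh = np.zeros((n, n))        # hiddens -> hiddens
    hr = np.zeros((n, n))        # hiddens -> reset gates
    hz = np.zeros((n, n))        # hiddens -> rate gates
    # Slicing recovers the per-block input weights (the packing order is
    # an assumption based on the listing above):
    w_xh, w_xr, w_xz = w[:, :n], w[:, n:2 * n], w[:, 2 * n:]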

Outputs

  • out — the post-activation state of the layer
  • pre — the pre-activation state of the layer
  • hid — the pre-rate-mixing hidden state
  • rate — the rate values
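
These output names are scoped by the layer's name (see full_name() below). For example, assuming a layer instance named 'gru':

    # Hypothetical layer named 'gru'; full_name() scopes an output
    # name as 'layer-name:output-name'.
    layer.full_name('rate')  # -> 'gru:rate'
    layer.full_name('out')   # -> 'gru:out' (the default output)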

References

[Chu14] J. Chung, C. Gulcehre, K. H. Cho, & Y. Bengio (2014), “Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling” http://arxiv.org/abs/1412.3555v1
__init__(h_0=None, **kwargs)

x.__init__(…) initializes x; see help(type(x)) for signature

Methods

__init__([h_0]) x.__init__(…) initializes x; see help(type(x)) for signature
add_bias(name, size[, mean, std]) Helper method to create a new bias vector.
add_weights(name, nin, nout[, mean, std, …]) Helper method to create a new weight matrix.
bind(graph[, reset, initialize]) Bind this layer into a computation graph.
connect(inputs) Create Theano variables representing the outputs of this layer.
find(key) Get a shared variable for a parameter by name.
full_name(name) Return a fully-scoped name for the given layer output.
log() Log some information about this layer.
log_params() Log information about this layer’s parameters.
resolve_inputs(layers) Resolve the names of inputs for this layer into shape tuples.
resolve_outputs() Resolve the names of outputs for this layer into shape tuples.
setup() Set up the parameters and initial values for this layer.
to_spec() Create a specification dictionary for this layer.
transform(inputs) Transform the inputs for this layer into an output for the layer.

Attributes

input_name Name of layer input (for layers with one input).
input_shape Shape of layer input (for layers with one input).
input_size Size of layer input (for layers with one input).
output_name Full name of the default output for this layer.
output_shape Shape of default output from this layer.
output_size Number of “neurons” in this layer’s default output.
params A list of all parameters in this layer.
setup()[source]

Set up the parameters and initial values for this layer.

transform(inputs)[source]

Transform the inputs for this layer into an output for the layer.

Parameters:
inputs : dict of Theano expressions

Symbolic inputs to this layer, given as a dictionary mapping string names to Theano expressions. See Layer.connect().

Returns:
outputs : dict of Theano expressions

A map from output names to Theano expressions giving the outputs of this layer; see the Outputs listed above (out, pre, hid, and rate).

updates : list

Any updates produced while computing the outputs (e.g., by the underlying scan operation); empty if none are needed.
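
As a sketch of how a caller might consume these return values (assuming the layer is already bound into a graph and that the outputs are keyed by the names listed under Outputs above):

    # 'inputs' maps input names to Theano expressions; see Layer.connect().
    outputs, updates = layer.transform(inputs)
    h = outputs['out']    # post-activation hidden state
    r = outputs['rate']   # "rate" (update gate) values
    # Any updates should be threaded through to the compiled function,
    # e.g. theano.function(..., updates=updates).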