theanets.layers.recurrent.RRNN

class theanets.layers.recurrent.RRNN(rate='matrix', **kwargs)

An RNN with an update rate for each unit.
Parameters:

rate : str, optional
    This parameter controls how rates are represented in the layer. If this is None, the default, then rates are computed as a function of the input at each time step. If this parameter is 'vector', then rates are represented as a single vector of learnable rates. If this parameter is 'uniform', then rates are chosen uniformly at random from the open interval (0, 1). If this parameter is 'log', then rates are chosen randomly from a log-uniform distribution, such that few rates are near 0 and many rates are near 1.

Notes
In a normal RNN, a hidden unit is updated completely at each time step, \(h_t = f(x_t, h_{t-1})\). With an explicit update rate, the state of a hidden unit is instead computed as a mixture of its new and old values,

\[h_t = (1 - z_t) \odot h_{t-1} + z_t \odot f(x_t, h_{t-1})\]

where \(\odot\) indicates element-wise multiplication.
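The mixture update above can be sketched in NumPy. This is a hypothetical illustration of the arithmetic only (the real layer builds symbolic Theano expressions); the function and variable names here are invented for the sketch, and a tanh nonlinearity is assumed for \(f\):

```python
import numpy as np

def leaky_update(h_prev, x, W_xh, W_hh, b, z):
    """One rate-mixed step: h_t = (1 - z) * h_prev + z * f(x, h_prev)."""
    # candidate activation f(x_t, h_{t-1}), assuming a tanh nonlinearity
    candidate = np.tanh(x @ W_xh + h_prev @ W_hh + b)
    # mix old state and candidate element-wise using the per-unit rates z
    return (1 - z) * h_prev + z * candidate

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
h = np.zeros(n_hid)
x = rng.normal(size=n_in)
W_xh = rng.normal(size=(n_in, n_hid))
W_hh = rng.normal(size=(n_hid, n_hid))
b = np.zeros(n_hid)

# all rates fixed at 1: the update reduces to a vanilla RNN step
h_vanilla = leaky_update(h, x, W_xh, W_hh, b, np.ones(n_hid))

# all rates 0: the hidden state never changes
h_frozen = leaky_update(h, x, W_xh, W_hh, b, np.zeros(n_hid))
```

The two extreme settings show the continuum this layer spans: with \(z_t = 1\) everywhere the layer behaves like a plain RNN, while smaller rates make units integrate their history more slowly.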
Rates might be defined in a number of ways, spanning a continuum between vanilla RNNs (i.e., all rate parameters effectively fixed at 1), fixed but non-uniform rates for each hidden unit [Ben12], parametric rates that depend only on the input, all the way to parametric rates that are computed as a function of the inputs and the hidden state at each time step (i.e., something more like the gated recurrent unit). This class represents rates in different ways depending on the value of the rate parameter at initialization.

Parameters
b — vector of bias values for each hidden unit
xh — matrix connecting inputs to hidden units
hh — matrix connecting hiddens to hiddens

If rate is initialized to the string 'vector', we define:

r — vector of rates for each hidden unit

If rate is initialized to None, we define:

r — vector of rate bias values for each hidden unit
xr — matrix connecting inputs to rate values for each hidden unit
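The rate representations can be sketched numerically as follows. This is a hypothetical NumPy illustration, not theanets' actual initialization code; the parameter names r and xr follow the list above, and the particular log-uniform construction is one assumed way to get the stated shape:

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hid = 3, 4

# rate='uniform': fixed rates drawn uniformly from (0, 1)
r_uniform = rng.uniform(size=n_hid)

# rate='log': fixed rates with few near 0 and many near 1
# (one possible construction: 1 minus a log-uniform sample)
r_log = 1 - np.exp(rng.uniform(np.log(1e-4), 0.0, size=n_hid))

# rate=None: rates computed from the input at each time step
r = np.zeros(n_hid)                    # rate bias, parameter `r`
xr = rng.normal(size=(n_in, n_hid))    # input-to-rate weights, parameter `xr`
x = rng.normal(size=n_in)
z = 1 / (1 + np.exp(-(x @ xr + r)))    # sigmoid keeps each rate in (0, 1)
```

In the 'uniform' and 'log' cases the rates are sampled once and fixed; in the input-dependent case a squashing function such as the sigmoid keeps each rate inside (0, 1) so the update remains a convex mixture of old and new state.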
Outputs

out — the post-activation state of the layer
pre — the pre-activation state of the layer
hid — the pre-rate-mixing hidden state
rate — the rate values
References

[Ben12] Y. Bengio, N. Boulanger-Lewandowski, & R. Pascanu. (2012) “Advances in Optimizing Recurrent Networks.” http://arxiv.org/abs/1212.0901
[Jag07] H. Jaeger, M. Lukoševičius, D. Popovici, & U. Siewert. (2007) “Optimization and applications of echo state networks with leaky-integrator neurons.” Neural Networks, 20(3):335–352.
__init__(rate='matrix', **kwargs)
Methods

__init__([rate])
add_bias(name, size[, mean, std])
    Helper method to create a new bias vector.
add_weights(name, nin, nout[, mean, std, ...])
    Helper method to create a new weight matrix.
connect(inputs)
    Create Theano variables representing the outputs of this layer.
find(key)
    Get a shared variable for a parameter by name.
initial_state(name, batch_size)
    Return an array suitable for representing initial state.
log()
    Log some information about this layer.
output_name([name])
    Return a fully-scoped name for the given layer output.
setup()
    Set up the parameters and initial values for this layer.
to_spec()
    Create a specification dictionary for this layer.
transform(inputs)
    Transform the inputs for this layer into an output for the layer.

Attributes

input_size
    For networks with one input, get the input size.
num_params
    Total number of learnable parameters in this layer.
params
    A list of all parameters in this layer.
setup()

Set up the parameters and initial values for this layer.

transform(inputs)

Transform the inputs for this layer into an output for the layer.

Parameters:

inputs : dict of Theano expressions
    Symbolic inputs to this layer, given as a dictionary mapping string names to Theano expressions. See base.Layer.connect().

Returns:

outputs : dict of Theano expressions
    A map from string output names to Theano expressions for the outputs from this layer. This layer type generates a “pre” output that gives the unit activity before applying the layer’s activation function, a “hid” output that gives the rate-independent, post-activation hidden state, a “rate” output that gives the rate value for each hidden unit, and an “out” output that gives the hidden output.

updates : list of update pairs
    A sequence of updates to apply inside a Theano function.
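A single step of this transform can be sketched in NumPy to show how the four named outputs relate. This is a hypothetical sketch of the input-dependent rate case (rate=None), assuming a tanh activation and a sigmoid for the rates; the real method returns symbolic Theano expressions, not arrays:

```python
import numpy as np

def rrnn_step(x, h_prev, params, activation=np.tanh):
    """One time step, returning the same named outputs the layer exposes."""
    sigmoid = lambda a: 1 / (1 + np.exp(-a))
    # "rate": per-unit rate computed from the input (rate=None case)
    rate = sigmoid(x @ params['xr'] + params['r'])
    # "pre": pre-activation state of the layer
    pre = x @ params['xh'] + h_prev @ params['hh'] + params['b']
    # "hid": rate-independent, post-activation hidden state
    hid = activation(pre)
    # "out": mixture of the old state and the new hidden state
    out = (1 - rate) * h_prev + rate * hid
    return {'pre': pre, 'hid': hid, 'rate': rate, 'out': out}

rng = np.random.default_rng(2)
n_in, n_hid = 3, 4
params = dict(
    b=np.zeros(n_hid),
    r=np.zeros(n_hid),
    xh=rng.normal(size=(n_in, n_hid)),
    hh=rng.normal(size=(n_hid, n_hid)),
    xr=rng.normal(size=(n_in, n_hid)),
)
outs = rrnn_step(rng.normal(size=n_in), np.zeros(n_hid), params)
```

Only “out” feeds the next layer by default; the other named outputs are useful for inspecting the rates or the unmixed hidden state during debugging.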