theanets.regularizers.RecurrentNorm

class theanets.regularizers.RecurrentNorm(pattern='*', weight=0.0)

Penalize changes in the activation norms of recurrent layers across successive time steps.

Notes

This regularizer implements the loss() method to add the following term to a recurrent network’s loss function:

\[\frac{1}{T|\Omega|} \sum_{i \in \Omega} \sum_{t=1}^T \left( \|Z_i^t\|_2^2 - \|Z_i^{t-1}\|_2^2 \right)^2\]

where \(\Omega\) is a set of “matching” graph output indices, and the squared L2 norm \(\|\cdot\|_2^2\) is the sum of the squares of the elements in the corresponding array.

This regularizer encourages the norms of the hidden state activations in a recurrent layer to remain constant over time.
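
As a concrete illustration, here is a minimal NumPy sketch of the term above for a single matched output (so \(|\Omega| = 1\)); the (time, batch, units) array layout and the helper name are illustrative only, not part of the library:

>>> import numpy as np
>>> def norm_stabilizer(Z):
...     T = Z.shape[0]
...     # squared L2 norm of all activations at each time step
...     sq_norms = (Z.reshape(T, -1) ** 2).sum(axis=1)
...     # squared change between successive squared norms, averaged over time
...     return ((sq_norms[1:] - sq_norms[:-1]) ** 2).sum() / T
>>> norm_stabilizer(np.random.randn(5, 8, 20))  # axes: (time, batch, units)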

References

[Kru15] D. Krueger & R. Memisevic. (ICLR 2016) “Regularizing RNNs by Stabilizing Activations.” http://arxiv.org/abs/1511.08400

Examples

This regularizer can be specified at training or test time by providing the recurrent_norm keyword argument:

>>> net = theanets.Regression(...)

To use this regularizer at training time:

>>> net.train(..., recurrent_norm=dict(weight=0.1, pattern='hid3:out'))

The pattern is matched against every output in the computation graph (the default '*' matches all of them), so take care to restrict it to the specific layer outputs you want to regularize, as in the sketch below.
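
For context, here is an end-to-end sketch; the model layout, the layer name 'hid1', and the toy data below are hypothetical, and the (examples, time, variables) data layout is an assumption to check against your theanets version:

>>> import numpy as np, theanets
>>> # 10 inputs, one RNN hidden layer (auto-named 'hid1'), 2 outputs.
>>> net = theanets.recurrent.Regression([10, (20, 'rnn'), 2])
>>> inputs = np.random.randn(8, 5, 10).astype('f')
>>> targets = np.random.randn(8, 5, 2).astype('f')
>>> # Penalize norm drift only in the hidden layer's main output.
>>> net.train([inputs, targets],
...           recurrent_norm=dict(weight=0.1, pattern='hid1:out'))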

To use this regularizer when running the model forward to generate a prediction:

>>> net.predict(..., recurrent_norm=0.1)

The value associated with the keyword argument can be a scalar, in which case it gives the weight for the regularizer, or a dictionary, in which case its entries are passed as keyword arguments directly to the constructor.
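
Side by side, the two forms look like this (continuing the sketch above, with a hypothetical test array in the same layout):

>>> test_inputs = np.random.randn(8, 5, 10).astype('f')
>>> # Scalar form: the value sets the regularizer's weight.
>>> net.predict(test_inputs, recurrent_norm=0.1)
>>> # Dict form: entries are passed to the RecurrentNorm constructor.
>>> net.predict(test_inputs, recurrent_norm=dict(weight=0.1, pattern='hid1:out'))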

__init__(pattern='*', weight=0.0)

x.__init__(…) initializes x; see help(type(x)) for signature

Methods

__init__([pattern, weight]) x.__init__(…) initializes x; see help(type(x)) for signature
log() Log some diagnostic info about this regularizer.
loss(layers, outputs) Compute a scalar term to add to the loss function for a model.
modify_graph(outputs) Modify the outputs of a particular layer in the computation graph.
loss(layers, outputs)

Compute a scalar term to add to the loss function for a model.

Parameters:
layers : list of theanets.layers.Layer

A list of the layers in the model being regularized.

outputs : dict of Theano expressions

A dictionary mapping string expression names to their corresponding Theano expressions in the computation graph. This dictionary contains the fully-scoped name of every layer output in the graph.
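
As an illustration of how this method is driven, the sketch below constructs the regularizer by hand; the outputs mapping is a hypothetical stand-in for the real graph dictionary, with keys following the 'layer:output' scoping used in the pattern examples above:

>>> import theano.tensor as TT
>>> reg = theanets.regularizers.RecurrentNorm(pattern='hid1:out', weight=0.1)
>>> # Hypothetical stand-in for the graph's output dictionary.
>>> outputs = {'hid1:out': TT.tensor3('hid1:out')}
>>> term = reg.loss(net.layers, outputs)  # a scalar Theano expression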