theanets.regularizers.WeightL2

class theanets.regularizers.WeightL2(pattern=None, weight=0.0)

Decay the weights in a model using an L2 norm penalty.

Notes

This regularizer implements the loss() method to add the following term to the network’s loss function:

\[\frac{1}{|\Omega|} \sum_{i \in \Omega} \|W_i\|_F^2\]

where \(\Omega\) is the set of “matching” weight parameters and \(\|\cdot\|_F\) is the Frobenius norm, so each term \(\|W_i\|_F^2\) is simply the sum of the squared elements of \(W_i\).
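
As a rough numerical sketch of this term (using NumPy, with random matrices standing in for the matching weight parameters \(W_i\); this is not a theanets call), the penalty is the average, over the matching parameters, of the sums of their squared elements:

>>> import numpy as np
>>> weights = [np.random.randn(4, 3), np.random.randn(3, 2)]  # stand-ins for the W_i
>>> penalty = np.mean([np.sum(w * w) for w in weights])  # mean squared Frobenius norm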

This regularizer tends to keep the weights in a model from growing “too large.” Because large weights are often associated with overfitting, this penalty can help prevent a model from overfitting.

References

[Moo95] J. Moody, S. Hanson, A. Krogh, & J. A. Hertz. (1995). “A simple weight decay can improve generalization.” NIPS 4, 950-957.

Examples

This regularizer can be specified at training or test time by providing the weight_l2 or weight_decay keyword arguments:

>>> net = theanets.Regression(...)

To use this regularizer at training time:

>>> net.train(..., weight_decay=0.1)

By default, all (2-dimensional) weights in the model are penalized. To penalize only some of the weights, provide a pattern:

>>> net.train(..., weight_decay=dict(weight=0.1, pattern='hid[23].w'))
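
The pattern is matched against parameter names. Assuming glob-style matching (as with Python’s fnmatch module) and hypothetical layers named hid2 and hid3, the pattern above would select just those layers’ weights; this is only an illustration of the matching rule, not a call into theanets itself:

>>> import fnmatch
>>> names = ['hid1.w', 'hid2.w', 'hid3.w', 'out.w']  # hypothetical parameter names
>>> fnmatch.filter(names, 'hid[23].w')
['hid2.w', 'hid3.w']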

To use this regularizer when running the model forward to generate a prediction:

>>> net.predict(..., weight_decay=0.1)

The value associated with the keyword argument can be a scalar, in which case it gives the weight for the regularizer, or a dictionary, in which case its contents are passed as keyword arguments directly to the constructor.
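
Put differently (a sketch of the equivalence only; you do not normally need to construct regularizers by hand), passing weight_decay=dict(weight=0.1, pattern='hid[23].w') amounts to building the regularizer directly with those constructor arguments:

>>> reg = theanets.regularizers.WeightL2(weight=0.1, pattern='hid[23].w')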

__init__(pattern=None, weight=0.0)

Methods

__init__([pattern, weight])
log() Log some diagnostic info about this regularizer.
loss(layers, outputs)
modify_graph(outputs) Modify the outputs of a particular layer in the computation graph.