theanets.regularizers.Contractive
class theanets.regularizers.Contractive(pattern=None, weight=0.0, wrt='*')
Penalize the derivative of hidden layers with respect to their inputs.
Parameters:
wrt : str, optional
A glob-style pattern that specifies the inputs with respect to which the derivative should be computed. Defaults to '*', which matches all inputs.
Notes
This regularizer implements the loss() method to add the following term to the network’s loss function:
\[\frac{1}{|\Omega|} \sum_{i \in \Omega} \left\|\frac{\partial Z_i}{\partial x}\right\|_F^2\]
where \(\Omega\) is a set of “matching” graph output indices, \(Z_i\) is the output of network graph \(i\), \(x\) is the input to the network graph, and \(\|\cdot\|_F\) is the Frobenius norm, so that \(\|\cdot\|_F^2\) is the sum of the squared elements in the array.
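The penalty term can be illustrated for a single matching output. The following is a minimal pure-Python sketch, not the theanets implementation: it assumes one sigmoid hidden layer \(Z = \sigma(Wx + b)\) and a single input vector, and the function names, weights, and inputs are all made up for illustration. For that layer the Jacobian has entries \(\partial Z_j / \partial x_k = Z_j (1 - Z_j) W_{jk}\), so the squared Frobenius norm can be written in closed form and checked against finite differences.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def hidden(W, b, x):
    """Z = sigmoid(W x + b) for one input vector x (plain lists of floats)."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bj)
            for row, bj in zip(W, b)]


def contractive_penalty(W, b, x):
    """Squared Frobenius norm of dZ/dx, using dZ_j/dx_k = Z_j (1 - Z_j) W_jk."""
    z = hidden(W, b, x)
    return sum((zj * (1.0 - zj)) ** 2 * sum(w * w for w in row)
               for zj, row in zip(z, W))


def numeric_penalty(W, b, x, eps=1e-6):
    """Same quantity from a central finite-difference Jacobian, as a check."""
    total = 0.0
    for k in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[k] += eps
        lo[k] -= eps
        z_hi, z_lo = hidden(W, b, hi), hidden(W, b, lo)
        total += sum(((a - c) / (2 * eps)) ** 2 for a, c in zip(z_hi, z_lo))
    return total


# Illustrative values only (2 hidden units, 3 inputs).
W = [[0.5, -1.0, 0.25], [1.5, 0.0, -0.75]]
b = [0.1, -0.2]
x = [0.3, -0.6, 0.9]

analytic = contractive_penalty(W, b, x)
numeric = numeric_penalty(W, b, x)
print(analytic, numeric)
```

With several matching outputs, the documented term would average this quantity over the set \(\Omega\).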
This regularizer attempts to make the derivative of the hidden representation flat with respect to the input. In theory, this encourages the network to learn features that are insensitive to small changes in the input (that is, they are mostly perpendicular to the input manifold).
Like the HiddenL1 regularizer, this acts indirectly to force a model to cover the space of its input dataset using as few features as possible; this pressure often causes features to be duplicated with slight variations to “tile” the input space in a very different way than a non-regularized model.
References
[Rif11] S. Rifai, P. Vincent, X. Muller, X. Glorot, & Y. Bengio. (ICML 2011). “Contractive auto-encoders: Explicit invariance during feature extraction.”
http://machinelearning.wustl.edu/mlpapers/paper_files/ICML2011Rifai_455.pdf
Examples
This regularizer can be specified at training or test time by providing the contractive keyword argument:
>>> net = theanets.Regression(...)
To use this regularizer at training time:
>>> net.train(..., contractive=0.1)
By default all hidden layer outputs are included. To include only some graph outputs:
>>> net.train(..., contractive=dict(weight=0.1, pattern='hid3:out', wrt='in'))
To use this regularizer when running the model forward to generate a prediction:
>>> net.predict(..., contractive=0.1)
The value associated with the keyword argument can be a scalar, in which case it provides the weight for the regularizer, or a dictionary, in which case it will be passed as keyword arguments directly to the constructor.
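The scalar-or-dictionary convention described above can be pictured with a small sketch. The helper below is hypothetical and is not part of theanets; it only assumes the documented constructor defaults (pattern=None, weight=0.0, wrt='*') and shows how either form of the keyword value could map onto constructor arguments.

```python
def contractive_kwargs(value):
    """Hypothetical helper: turn a `contractive=` value into constructor kwargs."""
    # Defaults copied from the documented Contractive signature.
    kwargs = dict(pattern=None, weight=0.0, wrt='*')
    if isinstance(value, dict):
        # Dictionary form: entries override the defaults key by key.
        kwargs.update(value)
    else:
        # Scalar form: the number is taken as the regularizer weight.
        kwargs['weight'] = float(value)
    return kwargs


scalar_form = contractive_kwargs(0.1)
dict_form = contractive_kwargs(dict(weight=0.1, pattern='hid3:out', wrt='in'))
print(scalar_form)
print(dict_form)
```

Under this reading, contractive=0.1 is shorthand for contractive=dict(weight=0.1) with the remaining arguments left at their defaults.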
__init__(pattern=None, weight=0.0, wrt='*')
Methods
__init__([pattern, weight, wrt])
log()  Log some diagnostic info about this regularizer.
loss(layer_list, outputs)
modify_graph(outputs)  Modify the outputs of a particular layer in the computation graph.
log()
Log some diagnostic info about this regularizer.