theanets.regularizers.BernoulliDropout

class theanets.regularizers.BernoulliDropout(pattern='*:out', weight=0.0, rng=13)

Randomly set activations of a layer output to zero.
Parameters:

rng : Theano random number generator, optional
    A Theano random number generator to use for creating noise and dropout values. If not provided, a new generator will be produced for this layer.
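For instance, a generator can be constructed up front for reproducible noise and handed to the regularizer. A minimal sketch, assuming direct construction of the class documented here and that weight holds the dropout level (as in the dictionary form described under Examples):

>>> from theano.tensor.shared_randomstreams import RandomStreams
>>> import theanets
>>> # The default rng=13 is just an integer seed; passing a seeded
>>> # RandomStreams instance gives the same reproducibility with
>>> # explicit control over the generator object.
>>> rng = RandomStreams(seed=13)
>>> reg = theanets.regularizers.BernoulliDropout(
...     pattern='hid1:out',  # glob-style output name pattern
...     weight=0.5,          # dropout level (an assumption; see Examples)
...     rng=rng)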
Notes
This regularizer implements the modify_graph() method to “inject” multiplicative Bernoulli noise into the loss function of a network.

Suppose we were optimizing a linear regression model with one hidden layer under a mean squared error. The loss for an input/output pair \((x, y)\) would be:

\[\mathcal{L} = \| V(Wx + b) + c - y \|_2^2\]

where \(W\) (\(V\)) and \(b\) (\(c\)) are the weight and bias parameters of the first (second) layer in the model.

If we regularized this model’s input with multiplicative Bernoulli “noise,” the loss for this pair would be:

\[\mathcal{L} = \| V(W(x\cdot\rho) + b) + c - y \|_2^2\]

where \(\rho \sim \mathcal{B}(p)\) is a vector of independent Bernoulli samples with probability \(p\).
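To make the effect of \(\rho\) concrete, here is a minimal NumPy sketch of the masked forward pass and squared-error loss; all shapes and values are illustrative, and it stands outside theanets entirely:

>>> import numpy as np
>>> rng = np.random.RandomState(13)
>>> # Illustrative shapes: 3 inputs, 5 hidden units, 2 outputs.
>>> W, b = rng.randn(5, 3), rng.randn(5)   # first-layer weights and bias
>>> V, c = rng.randn(2, 5), rng.randn(2)   # second-layer weights and bias
>>> x, y = rng.randn(3), rng.randn(2)      # one input/output pair
>>> p = 0.9                                # the Bernoulli probability p above
>>> rho = rng.binomial(1, p, size=x.shape) # 0/1 mask over the input
>>> # || V(W(x * rho) + b) + c - y ||_2^2, the regularized loss above
>>> loss = np.sum((V.dot(W.dot(x * rho) + b) + c - y) ** 2)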
This regularizer encourages the model to develop parameter settings such that internal features are independent of one another, preventing the co-adaptation of feature detectors. Dropout is widely used as a powerful regularizer in many types of neural network models [Hin12].
References

[Hin12] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, & R. R. Salakhutdinov. (2012). “Improving neural networks by preventing co-adaptation of feature detectors.” http://arxiv.org/pdf/1207.0580.pdf

Examples
This regularizer can be specified at training or test time by providing the dropout, input_dropout, or hidden_dropout keyword arguments:

>>> net = theanets.Regression(...)
To apply this regularizer at training time to network inputs:
>>> net.train(..., input_dropout=0.1)
And to apply the regularizer to hidden states of the network:
>>> net.train(..., hidden_dropout=0.1)
To target specific network outputs, a pattern can be given manually:
>>> net.train(..., dropout={'hid[23]:out': 0.1, 'in:out': 0.01})
To use this regularizer when running the model forward to generate a prediction:
>>> net.predict(..., input_dropout=0.1)
The value associated with the input_dropout or hidden_dropout keyword arguments should be a scalar giving the dropout probability to apply. The value of the dropout keyword argument should be a dictionary whose keys are glob-style output name patterns and whose values are the corresponding dropout levels.
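Putting these pieces together, a hedged end-to-end sketch; the layer sizes, toy data, and the list-of-arrays dataset convention are illustrative rather than taken from this page:

>>> import numpy as np
>>> import theanets
>>> # A small regression model: 4 inputs, 8 hidden units, 2 outputs.
>>> net = theanets.Regression([4, 8, 2])
>>> X = np.random.randn(100, 4).astype('f')
>>> Y = np.random.randn(100, 2).astype('f')
>>> # Scalar form: one probability for inputs and for hidden outputs.
>>> net.train([X, Y], input_dropout=0.1, hidden_dropout=0.1)
>>> # Dictionary form: glob-style output-name patterns mapped to levels.
>>> net.train([X, Y], dropout={'hid1:out': 0.1, 'in:out': 0.01})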
__init__(pattern='*:out', weight=0.0, rng=13)

x.__init__(…) initializes x; see help(type(x)) for signature
Methods

__init__([pattern, weight, rng])    x.__init__(…) initializes x; see help(type(x)) for signature
log()                               Log some diagnostic info about this regularizer.
loss(layers, outputs)               Compute a scalar term to add to the loss function for a model.
modify_graph(outputs)               Modify the outputs of a particular layer in the computation graph.
modify_graph(outputs)

Modify the outputs of a particular layer in the computation graph.
Parameters:

outputs : dict of Theano expressions
    A map from string output names to the corresponding Theano expression. This dictionary contains the fully-scoped names of all outputs from a single layer in the computation graph.
    This map is mutable, so any changes that the regularizer makes will be retained when the caller regains control.
Notes

This method is applied during graph-construction time to change the behavior of one or more layer outputs. For example, the BernoulliDropout class replaces matching outputs with an expression containing “masked” outputs, where some elements are randomly set to zero each time the expression is evaluated.

Any regularizer that needs to modify the structure of the computation graph should implement this method.
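As an illustration, here is a hedged sketch of a custom regularizer that injects additive Gaussian noise by mutating the outputs map in place. It assumes the Regularizer base class stores the pattern and weight arguments from __init__; the suffix check is a stand-in for the real glob-pattern matching against fully-scoped output names:

>>> import theanets
>>> from theano.tensor.shared_randomstreams import RandomStreams
>>> class AdditiveGaussianNoise(theanets.regularizers.Regularizer):
...     '''Add zero-mean Gaussian noise to matching layer outputs (a sketch).'''
...     def modify_graph(self, outputs):
...         rng = RandomStreams(seed=13)
...         for name, expr in list(outputs.items()):
...             # Stand-in for glob matching on fully-scoped names.
...             if name.endswith(':out'):
...                 # Mutating the map in place is how the change is
...                 # retained when the caller regains control.
...                 outputs[name] = expr + rng.normal(
...                     size=expr.shape, std=self.weight)  # assumes self.weight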