Layers¶
In a standard feedforward neural network layer, each node \(i\) in layer \(k\) receives inputs from all nodes in layer \(k-1\), then transforms the weighted sum of these inputs:

\[ z_i^k = \sigma\left( b_i^k + \sum_j w_{ij}^k z_j^{k-1} \right) \]

where \(\sigma: \mathbb{R} \to \mathbb{R}\) is an activation function, \(z_j^{k-1}\) is the output of node \(j\) in layer \(k-1\), and \(w^k\) and \(b^k\) are the weights and bias of layer \(k\).
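As a quick illustration, here is a minimal NumPy sketch of this transform (the weights, shapes, and the choice of ReLU as \(\sigma\) are invented for the example):

```python
import numpy as np

def feedforward(z_prev, w, b):
    """One feedforward layer: weighted sum of the previous layer's
    outputs plus a bias, passed through an activation function."""
    s = z_prev @ w + b          # weighted sum over layer k-1 outputs
    return np.maximum(s, 0.0)   # sigma applied elementwise (ReLU here)

# Example: 2 inputs feeding 3 hidden units.
w = np.array([[1.0, -1.0,  0.5],
              [0.5,  1.0, -2.0]])
b = np.zeros(3)
z = feedforward(np.array([1.0, 2.0]), w, b)
# z is [2.0, 1.0, 0.0]: the third unit's sum is negative, so ReLU zeros it.
```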
In addition to standard feedforward layers, other types of layers are also commonly used:
- For recurrent models, recurrent layers permit a cycle in the computation graph that depends on a previous time step.
- For models that process images, convolution layers are common.
- For some types of autoencoder models, it is common to tie layer weights to another layer.
Available Layers¶
This module contains classes for different types of network layers.
Layer (size, inputs[, name, activation])
: Base class for network layers.

Input (size[, name, ndim, sparse])
: A layer that receives external input data.

Concatenate (size, inputs[, name, activation])
: Concatenate multiple inputs along the last axis.

Flatten (size, inputs[, name, activation])
: Flatten all but the batch index of the input.

Product (size, inputs[, name, activation])
: Multiply several inputs together elementwise.

Reshape (shape, **kwargs)
: Reshape an input to have different numbers of dimensions.
Feedforward¶
Feedforward layers for neural network computation graphs.
Classifier (**kwargs)
: A classifier layer performs a softmax over a linear input transform.

Feedforward (size, inputs[, name, activation])
: A feedforward neural network layer performs a transform of its input.

Tied (partner, **kwargs)
: A tied-weights feedforward layer shadows weights from another layer.
Convolution¶
Convolutional layers “scan” over input data.
Conv1 (filter_size[, stride, border_mode])
: 1-dimensional convolutions run over one data axis.
Recurrent¶
Recurrent layers allow time dependencies in the computation graph.
RNN (size, inputs[, name, activation])
: Standard recurrent network layer.

RRNN ([rate])
: An RNN with an update rate for each unit.

MUT1 (size, inputs[, name, activation])
: “MUT1” evolved recurrent layer.

GRU (size, inputs[, name, activation])
: Gated Recurrent Unit layer.

LSTM (size, inputs[, name, activation])
: Long Short-Term Memory (LSTM) layer.

MRNN ([factors])
: A recurrent network layer with multiplicative dynamics.

SCRN ([rate])
: Simple Contextual Recurrent Network layer.

Clockwork (periods, **kwargs)
: A Clockwork RNN layer updates “modules” of neurons at specific rates.

Bidirectional ([worker])
: A bidirectional recurrent layer runs worker models forward and backward.
Layer Attributes¶
Now that we’ve seen how to specify the layers in a model, we’ll look at the
attributes of each layer that can be customized. For many of these settings,
you’ll want to use a dictionary (or create a theanets.Layer instance yourself)
to specify non-default values.
size
: The number of “neurons” in the layer. This value must be specified by the modeler when creating the layer, either as an integer or as a tuple containing an integer.

form
: A string specifying the type of layer to use (see above). This defaults to “feedforward” but can be the name of any existing theanets.Layer subclass (including Custom Layers that you have defined).

name
: A string name for the layer. If this isn’t provided when creating a layer, the layer will be assigned a default name. The default names for the first and last layers in a network are 'in' and 'out' respectively, and the layers in between are assigned the name “hidN” where N is the number of existing layers. If you create a layer instance manually, the default name is 'layerN' where N is the number of existing layers.

activation
: A string describing the activation function to use for the layer. This defaults to 'relu'.

inputs
: An integer or dictionary describing the sizes of the inputs that this layer expects. This is normally optional and defaults to the size of the preceding layer in a chain-like model. However, providing a dictionary here permits arbitrary layer interconnections. See Computation Graphs for more details.

mean
: A float specifying the mean of the initial parameter values to use in the layer. Defaults to 0. This value applies to all parameters in the layer that don’t have mean values specified for them directly.

mean_ABC
: A float specifying the mean of the initial parameter values to use in the layer’s 'ABC' parameter. Defaults to 0. This can be used to specify the mean of the initial values used for a specific parameter in the layer.

std
: A float specifying the standard deviation of the initial parameter values to use in the layer. Defaults to 1. This value applies to all parameters in the layer that don’t have standard deviations specified directly.

std_ABC
: A float specifying the standard deviation of the initial parameter values to use in the layer’s 'ABC' parameter. Defaults to 1. This can be used to specify the standard deviation of the initial values used for a specific parameter in the layer.

sparsity
: A float giving the proportion of parameter values in the layer that should be initialized to zero. Nonzero values in the parameters will be drawn from a Gaussian with the specified mean and standard deviation as above, and then an appropriate number of these parameter values will randomly be reset to zero to make the parameter “sparse.”

sparsity_ABC
: A float giving the proportion of values in the layer’s 'ABC' parameter that should be initialized to zero. This can be used to set the initial sparsity level for a particular parameter in the layer.

diagonal
: A float or vector of floats used to initialize the parameters in the layer. If this is provided, weight matrices in the layer will be initialized to all zeros, with this value or values placed along the diagonal.

diagonal_ABC
: A float or vector of floats used to initialize the parameters in the layer’s 'ABC' parameter. If this is provided, the relevant weight matrix in the layer will be initialized to all zeros, with this value or values placed along the diagonal.

rng
: An integer or numpy random number generator. If specified, the given random number generator will be used to create the initial values for the parameters in the layer. This can be useful for repeatable runs of a model.
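One way to collect non-default values for these attributes is in a plain dictionary, which can then be passed in a model’s layers specification. The values below are invented for illustration, and the model-building call is shown only as a comment since it requires the library to be installed:

```python
# Hypothetical hidden-layer specification using the attributes above.
# 'std_w' follows the std_ABC pattern, targeting a parameter named 'w'.
hidden = dict(
    size=100,            # number of "neurons" in the layer (required)
    form='feedforward',  # layer type (this is the default)
    activation='tanh',   # overrides the default 'relu'
    std_w=0.1,           # initial std for the 'w' parameter only
    sparsity=0.9,        # 90% of initial parameter values reset to zero
    rng=13,              # seed for repeatable parameter initialization
)

# This dictionary would then appear in the layers tuple, e.g.:
# net = theanets.Regressor(layers=(10, hidden, 1))
```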
In addition to these configuration values, each layer can also be provided with
keyword arguments specific to that layer. For example, the MRNN
recurrent layer type requires a factors
argument, and the Conv1
1D
convolutional layer requires a filter_size
argument.
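Such layer-specific keyword arguments can go in the same dictionary as the general attributes. A brief sketch, with sizes and values invented for the example:

```python
# Hypothetical layer specs mixing general attributes with
# keyword arguments specific to each layer type.
mrnn_layer = dict(size=64, form='mrnn', factors=4)       # MRNN requires 'factors'
conv_layer = dict(size=32, form='conv1', filter_size=5)  # Conv1 requires 'filter_size'
```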
Custom Layers¶
Layers are the real workhorse in theanets
; custom layers can be created to
do all sorts of fun stuff. To create a custom layer, just create a subclass of
theanets.Layer
and give it the
functionality you want.
As a very simple example, let’s suppose you wanted to create a normal feedforward layer but did not want to include a bias term:
import theanets
import theano.tensor as TT

class NoBias(theanets.Layer):
    def transform(self, inputs):
        # Apply the weight matrix only -- no bias term is added.
        return TT.dot(inputs, self.find('w'))

    def setup(self):
        # Create just one parameter: the weight matrix 'w'.
        self.add_weights('w', nin=self.input_size, nout=self.size)
Once you’ve set up your new layer class, it will automatically be registered and
available in theanets.Layer.build
using the name of your class:
layer = theanets.Layer.build('nobias', size=4)
or, while creating a model:
net = theanets.Autoencoder(
    layers=(4, (3, 'nobias', 'linear'), (4, 'tied', 'linear')),
)
This example shows how easy it is to create a PCA-like model that will learn the subspace of your dataset that spans the most variance, that is, the same subspace spanned by the principal components.