theanets.layers.base.Layer

class theanets.layers.base.Layer(name=None, **kwargs)

Base class for network layers.

In theanets, a layer refers to a logically grouped set of parameters and computations. Typically this encompasses a set of weight matrix and bias vector parameters, plus the “output” units that produce a signal for further layers to consume.

Subclasses of this class can be created to implement many different forms of layer-specific computation. For example, a vanilla Feedforward layer accepts input from the “preceding” layer in a network, computes an affine transformation of that input and applies a pointwise transfer function. On the other hand, a Recurrent layer computes an affine transformation of the current input, and combines that with information about the state of the layer at previous time steps.

Most subclasses will need to implement the setup() method, which creates the parameters needed by the layer, and the transform() method, which converts the Theano input expressions coming into the layer into one or more output expressions.
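As a rough illustration of the setup()/transform() split described above, the following sketch uses a hypothetical ToyLayer stand-in written in plain Python. It is not the theanets class: it works on lists of floats rather than Theano expressions, and the names ToyLayer and ToyFeedforward and the choice of tanh as the transfer function are assumptions made for the example.

```python
import math


class ToyLayer:
    """Hypothetical stand-in for a layer base class (not theanets itself)."""

    def __init__(self, name, nin, nout):
        self.name = name
        self.nin = nin
        self.nout = nout
        self.params = {}
        self.setup()

    def setup(self):
        """Create the parameters this layer needs; the base has none."""

    def transform(self, inputs):
        """Map inputs to (output, updates); the base passes input through."""
        return inputs, []


class ToyFeedforward(ToyLayer):
    def setup(self):
        # One weight matrix (nin x nout) and one bias vector (nout).
        self.params["w"] = [[0.0] * self.nout for _ in range(self.nin)]
        self.params["b"] = [0.0] * self.nout

    def transform(self, inputs):
        w, b = self.params["w"], self.params["b"]
        # Affine transformation followed by a pointwise transfer function.
        pre = [
            sum(x * w[i][j] for i, x in enumerate(inputs)) + b[j]
            for j in range(self.nout)
        ]
        return [math.tanh(z) for z in pre], []


layer = ToyFeedforward("hid", nin=3, nout=2)
output, updates = layer.transform([1.0, 2.0, 3.0])
```

The point of the split is that parameter creation happens once, in setup(), while transform() only describes the computation and can be called whenever the graph is built.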

Parameters:
size : int, optional

If provided, gives the size of the outputs from this layer. Providing both size and shape raises an exception. In general, use shape whenever the architecture uses convolutions, and size otherwise.

shape : tuple of int, optional

If provided, define the shape of the values generated by this layer. The shape is given for a single data element (not a batch of such elements), so for a vanilla feedforward layer, this would be (size, ), while a convolutional layer might have shape (width, height, channels).

inputs : str or tuple of str, optional

Name(s) of input(s) to this layer. These names must be resolved to layers by binding the layer inside a network graph. Defaults to an empty tuple; in practice this needs to be provided for most layers.

name : str, optional

The name of this layer. If not given, layers will be numbered sequentially based on the order in which they are created.

activation : str, optional

The name of an activation function to use for units in this layer. See build_activation().

rng : numpy.random.RandomState or int, optional

A numpy random number generator, or an integer seed for a random number generator. If not provided, the random number generator will be created with an automatically chosen seed.

mean, mean_XYZ : float, optional

Initialize parameters for this layer to have the given mean value. If mean_XYZ is specified, it will apply only to the parameter named XYZ. Defaults to 0.

std, std_XYZ : float, optional

Initialize parameters for this layer to have the given standard deviation. If std_XYZ is specified, only the parameter named XYZ will be so initialized. Defaults to 0.

sparsity, sparsity_XYZ : float in (0, 1), optional

If given, create sparse connections in the layer’s weight matrix, such that this fraction of the weights is set to zero. If sparsity_XYZ is given, it will apply only to the parameter named XYZ. By default, this parameter is 0, meaning all weights are nonzero.

diagonal, diagonal_XYZ : float, optional

If given, create initial parameter matrices for this layer that are initialized to diagonal matrices with this value along the diagonal. Defaults to None, which initializes all weights using random values.
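The size/shape rule and the per-parameter overrides (mean_XYZ, std_XYZ, and so on) described in the list above can be sketched as two small helpers. Both function names are hypothetical, and ValueError is only a placeholder, since the text does not say which exception type is raised when both size and shape are given.

```python
def resolve_shape(size=None, shape=None):
    """Return a per-element output shape from either `size` or `shape`."""
    if size is not None and shape is not None:
        # Placeholder exception type; the actual type is not documented here.
        raise ValueError("provide either size or shape, not both")
    if shape is not None:
        return tuple(shape)
    if size is not None:
        return (size,)
    return None


def init_option(kwargs, option, param, default):
    """Look up e.g. std_w first, then std, then fall back to `default`."""
    specific = "{}_{}".format(option, param)
    if specific in kwargs:
        return kwargs[specific]
    return kwargs.get(option, default)


# Example: a layer-wide std, overridden for the parameter named "b".
kwargs = {"std": 0.1, "std_b": 0.0, "sparsity_w": 0.5}
std_w = init_option(kwargs, "std", "w", 0)            # falls back to "std"
std_b = init_option(kwargs, "std", "b", 0)            # uses "std_b"
sparsity_b = init_option(kwargs, "sparsity", "b", 0)  # uses the default
```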

Attributes:
name : str

Name of this layer.

activate : callable

The activation function to use on this layer’s outputs.

kwargs : dict

Additional keyword arguments used when constructing this layer.

__init__(name=None, **kwargs)

Initialize the layer. Keyword arguments are the class parameters listed above.

Methods

__init__([name]) Initialize the layer.
add_bias(name, size[, mean, std]) Helper method to create a new bias vector.
add_weights(name, nin, nout[, mean, std, …]) Helper method to create a new weight matrix.
bind(graph[, reset, initialize]) Bind this layer into a computation graph.
connect(inputs) Create Theano variables representing the outputs of this layer.
find(key) Get a shared variable for a parameter by name.
full_name(name) Return a fully-scoped name for the given layer output.
log() Log some information about this layer.
log_params() Log information about this layer’s parameters.
resolve_inputs(layers) Resolve the names of inputs for this layer into shape tuples.
resolve_outputs() Resolve the names of outputs for this layer into shape tuples.
setup() Set up the parameters and initial values for this layer.
to_spec() Create a specification dictionary for this layer.
transform(inputs) Transform the inputs for this layer into an output for the layer.

Attributes

input_name Name of layer input (for layers with one input).
input_shape Shape of layer input (for layers with one input).
input_size Size of layer input (for layers with one input).
output_name Full name of the default output for this layer.
output_shape Shape of default output from this layer.
output_size Number of “neurons” in this layer’s default output.
params A list of all parameters in this layer.

add_bias(name, size, mean=0, std=1)

Helper method to create a new bias vector.

Parameters:
name : str

Name of the parameter to add.

size : int

Size of the bias vector.

mean : float, optional

Mean value for randomly-initialized biases. Defaults to 0.

std : float, optional

Standard deviation for randomly-initialized biases. Defaults to 1.
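A minimal sketch of what a bias initializer with these parameters might do, assuming the values are drawn from a normal distribution. The function name make_bias and the seed argument are hypothetical, and a plain list stands in for the shared variable that the real method would create.

```python
import random


def make_bias(size, mean=0.0, std=1.0, seed=None):
    """Return `size` values drawn from a normal distribution (assumed)."""
    rng = random.Random(seed)
    return [rng.gauss(mean, std) for _ in range(size)]


bias = make_bias(5, seed=0)
constant = make_bias(4, mean=2.0, std=0.0)  # std of 0 gives a constant vector
```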

add_weights(name, nin, nout, mean=0, std=0, sparsity=0, diagonal=0)

Helper method to create a new weight matrix.

Parameters:
name : str

Name of the parameter to add.

nin : int

Size of “input” for this weight matrix.

nout : int

Size of “output” for this weight matrix.

mean : float, optional

Mean value for randomly-initialized weights. Defaults to 0.

std : float, optional

Standard deviation of initial matrix values. If not given, defaults to \(1 / \sqrt{n_i + n_o}\), where \(n_i\) is nin and \(n_o\) is nout.

sparsity : float, optional

Fraction of weights to be set to zero. Defaults to 0.

diagonal : float, optional

Initialize weights to a matrix of zeros with this value along the diagonal. If not given, all weights are initialized randomly.
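The interaction between std, sparsity, and diagonal can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's implementation: make_weights is a hypothetical name, None is used here as the "not given" marker for std and diagonal, values are assumed to be normally distributed, and the way zeroed entries are chosen is a guess.

```python
import math
import random


def make_weights(nin, nout, mean=0.0, std=None, sparsity=0.0,
                 diagonal=None, seed=None):
    """Build an nin x nout matrix as a list of rows (illustrative only)."""
    rng = random.Random(seed)
    if diagonal is not None:
        # Zeros everywhere except `diagonal` on the main diagonal.
        return [[diagonal if i == j else 0.0 for j in range(nout)]
                for i in range(nin)]
    if std is None:
        # Default scale from the text: 1 / sqrt(n_i + n_o).
        std = 1.0 / math.sqrt(nin + nout)
    w = [[rng.gauss(mean, std) for _ in range(nout)] for _ in range(nin)]
    # Zero out a random subset covering the requested fraction of entries.
    cells = [(i, j) for i in range(nin) for j in range(nout)]
    for i, j in rng.sample(cells, int(sparsity * len(cells))):
        w[i][j] = 0.0
    return w


sparse = make_weights(4, 6, sparsity=0.5, seed=0)
identity_like = make_weights(3, 3, diagonal=2.0)
```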

bind(graph, reset=True, initialize=True)

Bind this layer into a computation graph.

This method is a wrapper for performing common initialization tasks. It resolves the layer’s inputs and outputs (see resolve_inputs() and resolve_outputs()), then calls setup() and log().

Parameters:
graph : Network

A computation network in which this layer is to be bound.

reset : bool, optional

If True (the default), reset the resolved layers for this layer.

initialize : bool, optional

If True (the default), initialize the parameters for this layer by calling setup().

Raises:
theanets.util.ConfigurationError :

If an input cannot be resolved.
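The order of operations in bind() can be sketched with a stand-in class that simply records which steps run. This assumes that the resolution step corresponds to resolve_inputs() followed by resolve_outputs(); the class name BindDemo and the resolved attribute are hypothetical. The text does not say whether log() depends on initialize, so the sketch always calls it.

```python
class BindDemo:
    """Hypothetical stand-in that records the steps performed by bind()."""

    def __init__(self):
        self.calls = []
        self.resolved = {}

    def resolve_inputs(self, layers):
        self.calls.append("resolve_inputs")
        self.resolved["inputs"] = [layer for layer in layers]

    def resolve_outputs(self):
        self.calls.append("resolve_outputs")

    def setup(self):
        self.calls.append("setup")

    def log(self):
        self.calls.append("log")

    def bind(self, layers, reset=True, initialize=True):
        if reset:
            self.resolved = {}  # forget any earlier resolution
        self.resolve_inputs(layers)
        self.resolve_outputs()
        if initialize:
            self.setup()  # creates the parameters
        self.log()


demo = BindDemo()
demo.bind([])
```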

connect(inputs)

Create Theano variables representing the outputs of this layer.

Parameters:
inputs : dict of Theano expressions

Symbolic inputs to this layer, given as a dictionary mapping string names to Theano expressions. Each string key should be of the form “{layer_name}:{output_name}” and refers to a specific output from a specific layer in the graph.

Returns:
outputs : dict

A dictionary mapping names to Theano expressions for the outputs from this layer.

updates : sequence of (parameter, expression) tuples

Updates that should be performed by a Theano function that computes something using this layer.
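To make the input and output dictionaries concrete, here is a small sketch in which plain strings stand in for Theano expressions. The class ConnectDemo is hypothetical, and the default output name "out" is an assumption; only the “{layer_name}:{output_name}” key format comes from the description above.

```python
class ConnectDemo:
    """Hypothetical stand-in showing the shape of connect()'s result."""

    def __init__(self, name, input_key):
        self.name = name
        self.input_key = input_key

    def full_name(self, name):
        # Scoped output names follow the "{layer_name}:{output_name}" form.
        return "{}:{}".format(self.name, name)

    def transform(self, inputs):
        # Identity: pass the single input through, with no updates.
        return inputs[self.input_key], []

    def connect(self, inputs):
        outputs, updates = self.transform(inputs)
        if not isinstance(outputs, dict):
            outputs = {"out": outputs}  # assumed default output name
        scoped = {self.full_name(k): v for k, v in outputs.items()}
        return scoped, updates


layer = ConnectDemo("hid1", input_key="in:out")
outputs, updates = layer.connect({"in:out": "x"})
```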

find(key)

Get a shared variable for a parameter by name.

Parameters:
key : str or int

The name of the parameter to look up, or the index of the parameter in this layer’s parameter list. Both depend on the implementation of the layer.

Returns:
param : shared variable

A shared variable containing values for the given parameter.

Raises:
KeyError

If a param with the given name does not exist.
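Lookup by name or by index might look roughly like this. ParamStore is a hypothetical container holding (name, value) pairs in place of Theano shared variables, and raising KeyError for an out-of-range index is an assumption; the text only documents KeyError for a missing name.

```python
class ParamStore:
    """Hypothetical container of (name, value) pairs for one layer."""

    def __init__(self, params):
        self.params = list(params)

    def find(self, key):
        for index, (name, value) in enumerate(self.params):
            # Match either the position in the list or the parameter name.
            if key == index or key == name:
                return value
        raise KeyError(key)


store = ParamStore([("w", [[0.1, 0.2]]), ("b", [0.0, 0.0])])
weights = store.find("w")
bias = store.find(1)
```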

full_name(name)

Return a fully-scoped name for the given layer output.

Parameters:
name : str

Name of an output for this layer.

Returns:
scoped : str

A fully-scoped name for the given output from this layer.

input_name

Name of layer input (for layers with one input).

input_shape

Shape of layer input (for layers with one input).

input_size

Size of layer input (for layers with one input).

log()

Log some information about this layer.

log_params()

Log information about this layer’s parameters.

output_name

Full name of the default output for this layer.

output_shape

Shape of default output from this layer.

output_size

Number of “neurons” in this layer’s default output.

params

A list of all parameters in this layer.

resolve_inputs(layers)

Resolve the names of inputs for this layer into shape tuples.

Parameters:
layers : list of Layer

A list of the layers that are available for resolving inputs.

Raises:
theanets.util.ConfigurationError :

If an input cannot be resolved.
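Resolving input names into shape tuples can be sketched as a lookup over the available layers. Everything here is a stand-in: ConfigurationError is a local placeholder for theanets.util.ConfigurationError, StubLayer and its output_shapes attribute are hypothetical, and treating a bare layer name as shorthand for its "out" output is an assumption.

```python
class ConfigurationError(Exception):
    """Local placeholder for theanets.util.ConfigurationError."""


class StubLayer:
    """Hypothetical layer exposing named output shapes."""

    def __init__(self, name, output_shapes):
        self.name = name
        self.output_shapes = output_shapes  # e.g. {"out": (10,)}


def resolve_inputs(input_names, layers):
    """Map each input name to the shape of the output it refers to."""
    by_name = {layer.name: layer for layer in layers}
    resolved = {}
    for full in input_names:
        layer_name, _, output = full.partition(":")
        output = output or "out"  # assumed default output name
        layer = by_name.get(layer_name)
        if layer is None or output not in layer.output_shapes:
            raise ConfigurationError("cannot resolve input {!r}".format(full))
        resolved[full] = layer.output_shapes[output]
    return resolved


layers = [StubLayer("in", {"out": (784,)}),
          StubLayer("hid1", {"out": (100,), "pre": (100,)})]
shapes = resolve_inputs(["in", "hid1:pre"], layers)
```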

resolve_outputs()

Resolve the names of outputs for this layer into shape tuples.

setup()

Set up the parameters and initial values for this layer.

to_spec()

Create a specification dictionary for this layer.

Returns:
spec : dict

A dictionary specifying the configuration of this layer.

transform(inputs)

Transform the inputs for this layer into an output for the layer.

Parameters:
inputs : dict of Theano expressions

Symbolic inputs to this layer, given as a dictionary mapping string names to Theano expressions. See Layer.connect().

Returns:
output : Theano expression

In this base class, the output is the same as the input. Subclasses override this method to compute something else.

updates : list

An empty updates list.