theanets.layers.base.Layer

class theanets.layers.base.Layer(size, inputs, name=None, activation='relu', **kwargs)

Base class for network layers.

In theanets, a layer refers to a logically grouped set of parameters and computations. Typically this encompasses weight matrix and bias vector parameters, plus the “output” units that produce a signal for subsequent layers to consume.

Subclasses of this class can be created to implement many different forms of layer-specific computation. For example, a vanilla Feedforward layer accepts input from the “preceding” layer in a network, computes an affine transformation of that input and applies a pointwise transfer function. On the other hand, a Recurrent layer computes an affine transformation of the current input, and combines that with information about the state of the layer at previous time steps.

Most subclasses will need to provide an implementation of the setup() method, which creates the parameters needed by the layer, and the transform() method, which converts the Theano input expressions coming into the layer into some output expression(s).
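As a rough sketch of this division of labor, here is a plain-NumPy stand-in for a feedforward layer. This is illustrative only, not the theanets implementation: setup() creates the parameters, and transform() maps inputs to outputs via an affine transformation and a pointwise ReLU.

```python
import numpy as np

class DenseSketch:
    """Illustrative stand-in for a feedforward layer: setup() creates
    parameters, transform() maps inputs to outputs (not theanets code)."""

    def __init__(self, size, input_size, rng=None):
        self.size = size
        self.input_size = input_size
        self.rng = rng or np.random.RandomState(0)
        self.setup()

    def setup(self):
        # Create the weight matrix and bias vector for this layer.
        std = 1 / np.sqrt(self.input_size + self.size)
        self.w = self.rng.randn(self.input_size, self.size) * std
        self.b = np.zeros(self.size)

    def transform(self, x):
        # Affine transformation followed by a pointwise ReLU.
        return np.maximum(0, x.dot(self.w) + self.b)

layer = DenseSketch(size=4, input_size=3)
out = layer.transform(np.ones((2, 3)))  # batch of 2 inputs -> shape (2, 4)
```

In the real class, setup() registers Theano shared variables rather than NumPy arrays, and transform() builds symbolic expressions.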

Parameters:

size : int

Size of this layer.

inputs : dict or int

Size of input(s) to this layer. If one integer is provided, a single input of the given size is expected. If a dictionary is provided, it maps from output names to corresponding sizes.

name : str, optional

The name of this layer. If not given, layers will be numbered sequentially based on the order in which they are created.

activation : str, optional

The name of an activation function to use for units in this layer. See build_activation().

rng : numpy.random.RandomState or int, optional

A numpy random number generator, or an integer seed for a random number generator. If not provided, the random number generator will be created with an automatically chosen seed.

mean, mean_XYZ : float, optional

Initialize parameters for this layer to have the given mean value. If mean_XYZ is specified, it will apply only to the parameter named XYZ. Defaults to 0.

std, std_XYZ : float, optional

Initialize parameters for this layer to have the given standard deviation. If std_XYZ is specified, only the parameter named XYZ will be so initialized. Defaults to 0.

sparsity, sparsity_XYZ : float in (0, 1), optional

If given, create sparse connections in the layer’s weight matrix, such that this fraction of the weights is set to zero. If sparsity_XYZ is given, it will apply only to the parameter named XYZ. By default, this parameter is 0, meaning all weights are nonzero.

diagonal, diagonal_XYZ : float, optional

If given, create initial parameter matrices for this layer that are initialized to diagonal matrices with this value along the diagonal. Defaults to None, which initializes all weights using random values.
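The XYZ-suffixed forms of these keyword arguments override the generic value for a single parameter. A hypothetical resolution helper (the function name and behavior here are assumptions for illustration, not the theanets internals) might look like:

```python
def init_value(kwargs, prefix, param_name, default):
    """Look up an initialization setting for a named parameter,
    e.g. 'std_w' for parameter 'w', falling back to the generic
    'std', then to the given default. Illustrative sketch of the
    mean/std/sparsity/diagonal naming convention described above."""
    return kwargs.get(prefix + '_' + param_name, kwargs.get(prefix, default))

kwargs = {'mean': 0.0, 'std': 0.1, 'std_w': 0.01}
init_value(kwargs, 'std', 'w', 0)   # parameter-specific override: 0.01
init_value(kwargs, 'std', 'b', 0)   # generic value: 0.1
```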

Attributes

name (str) Name of this layer.
size (int) Size of this layer.
inputs (dict) Dictionary mapping input names to their corresponding sizes.
activation (str) String representing the activation function for this layer.
activate (callable) The activation function to use on this layer’s outputs.
kwargs (dict) Additional keyword arguments used when constructing this layer.

Methods

__init__(size, inputs[, name, activation])
add_bias(name, size[, mean, std]) Helper method to create a new bias vector.
add_weights(name, nin, nout[, mean, std, ...]) Helper method to create a new weight matrix.
connect(inputs) Create Theano variables representing the outputs of this layer.
find(key) Get a shared variable for a parameter by name.
log() Log some information about this layer.
output_name([name]) Return a fully-scoped name for the given layer output.
setup() Set up the parameters and initial values for this layer.
to_spec() Create a specification dictionary for this layer.
transform(inputs) Transform the inputs for this layer into an output for the layer.

Attributes

input_size For networks with one input, get the input size.
num_params Total number of learnable parameters in this layer.
params A list of all parameters in this layer.
add_bias(name, size, mean=0, std=1)

Helper method to create a new bias vector.

Parameters:

name : str

Name of the parameter to add.

size : int

Size of the bias vector.

mean : float, optional

Mean value for randomly-initialized biases. Defaults to 0.

std : float, optional

Standard deviation for randomly-initialized biases. Defaults to 1.

add_weights(name, nin, nout, mean=0, std=0, sparsity=0, diagonal=0)

Helper method to create a new weight matrix.

Parameters:

name : str

Name of the parameter to add.

nin : int

Size of “input” for this weight matrix.

nout : int

Size of “output” for this weight matrix.

mean : float, optional

Mean value for randomly-initialized weights. Defaults to 0.

std : float, optional

Standard deviation of initial matrix values. Defaults to \(1 / \sqrt{n_{in} + n_{out}}\).

sparsity : float, optional

Fraction of weights to be set to zero. Defaults to 0.

diagonal : float, optional

Initialize weights to a matrix of zeros with this value along the diagonal. Defaults to 0, which initializes all weights randomly.
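The initialization semantics described above can be sketched in plain NumPy as follows. This is an illustrative reconstruction under stated assumptions, not the actual add_weights() implementation: Gaussian values with the standard deviation defaulting to \(1 / \sqrt{n_{in} + n_{out}}\), an optional sparse mask, or an optional diagonal initialization.

```python
import numpy as np

def init_weights(nin, nout, mean=0.0, std=None, sparsity=0.0,
                 diagonal=None, rng=None):
    """Illustrative initializer following the add_weights() semantics
    documented above (not the theanets source)."""
    rng = rng or np.random.RandomState(0)
    if diagonal is not None:
        # Zeros everywhere except `diagonal` along the main diagonal.
        w = np.zeros((nin, nout))
        np.fill_diagonal(w, diagonal)
        return w
    if std is None:
        # Default scale shrinks with the fan-in plus fan-out.
        std = 1 / np.sqrt(nin + nout)
    w = rng.randn(nin, nout) * std + mean
    if sparsity > 0:
        # Zero out roughly the requested fraction of weights at random.
        mask = rng.rand(nin, nout) >= sparsity
        w = w * mask
    return w
```

The real method additionally wraps the result in a Theano shared variable and registers it with the layer.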

connect(inputs)

Create Theano variables representing the outputs of this layer.

Parameters:

inputs : dict of Theano expressions

Symbolic inputs to this layer, given as a dictionary mapping string names to Theano expressions. Each string key should be of the form “{layer_name}:{output_name}” and refers to a specific output from a specific layer in the graph.

Returns:

outputs : dict

A dictionary mapping names to Theano expressions for the outputs from this layer.

updates : sequence of (parameter, expression) tuples

Updates that should be performed by a Theano function that computes something using this layer.

find(key)

Get a shared variable for a parameter by name.

Parameters:

key : str or int

The name of the parameter to look up, or the index of the parameter in our parameter list. These are both dependent on the implementation of the layer.

Returns:

param : shared variable

A shared variable containing values for the given parameter.

Raises:

KeyError

If a param with the given name does not exist.
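The dual lookup behavior can be illustrated with a minimal sketch (the class and data here are hypothetical, not the theanets parameter store): a string key is matched against parameter names, while an integer key indexes the parameter list directly.

```python
class ParamStore:
    """Illustrative lookup mirroring find(): accept either a parameter
    name or an index into the parameter list (not the theanets code)."""

    def __init__(self, params):
        self.params = params  # list of (name, value) pairs

    def find(self, key):
        if isinstance(key, int):
            # Integer keys index the parameter list positionally.
            return self.params[key][1]
        for name, value in self.params:
            if name == key:
                return value
        raise KeyError(key)

store = ParamStore([('w', [[1.0]]), ('b', [0.0])])
store.find('b')   # look up the bias by name
store.find(0)     # look up the first parameter by index
```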

input_size

For networks with one input, get the input size.

log()

Log some information about this layer.

num_params

Total number of learnable parameters in this layer.

output_name(name='out')

Return a fully-scoped name for the given layer output.

Parameters:

name : str

Name of an output for this layer.

Returns:

scoped : str

A fully-scoped name for the given output from this layer.
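Scoped names follow the “{layer_name}:{output_name}” convention used by connect(). A minimal sketch of that convention (a standalone function, not the bound method itself):

```python
def output_name(layer_name, name='out'):
    """Illustrative version of the scoping convention: fully-scoped
    output names take the form '{layer_name}:{output_name}'."""
    return '{}:{}'.format(layer_name, name)

output_name('hid1')          # default output of layer 'hid1'
output_name('hid1', 'pre')   # a named output of the same layer
```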

params

A list of all parameters in this layer.

setup()

Set up the parameters and initial values for this layer.

to_spec()

Create a specification dictionary for this layer.

Returns:

spec : dict

A dictionary specifying the configuration of this layer.

transform(inputs)

Transform the inputs for this layer into an output for the layer.

Parameters:

inputs : dict of Theano expressions

Symbolic inputs to this layer, given as a dictionary mapping string names to Theano expressions. See Layer.connect().

Returns:

output : Theano expression

The output expression for this layer; the base class returns its input unchanged.

updates : list

An empty updates list.