Activation Functions

An activation function (sometimes also called a transfer function) specifies how the final output of a layer is computed from the weighted sums of the inputs.

By default, hidden layers in theanets use a rectified linear activation function: \(g(z) = \max(0, z)\).

Output layers in theanets.Regressor and theanets.Autoencoder models use linear activations, so the output is just the weighted sum of the inputs from the previous layer: \(g(z) = z\). The output layer in theanets.Classifier models uses a softmax activation: \(g(z) = \exp(z) / \sum\exp(z)\).
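The three default activations can be sketched in plain Python to make the formulas concrete. This is only an illustration of the math, not how theanets computes them; the function names are chosen here for readability, and the max-subtraction in the softmax is a common numerical-stability trick assumed for this sketch.

```python
import math

def relu(z):
    """Rectified linear: g(z) = max(0, z), applied to each element."""
    return [max(0.0, v) for v in z]

def linear(z):
    """Linear: g(z) = z, so the weighted sums pass through unchanged."""
    return list(z)

def softmax(z):
    """Softmax: g(z) = exp(z) / sum(exp(z)) over the whole vector."""
    # Subtracting the maximum does not change the result but avoids overflow.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

z = [-1.0, 0.0, 2.0]
print(relu(z))     # [0.0, 0.0, 2.0]
print(linear(z))   # [-1.0, 0.0, 2.0]
print(softmax(z))  # three positive values that sum to 1
```

The softmax output is always positive and sums to one, which is why it suits the categorical output of a classifier.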

To specify a different activation function for a layer, include an activation key chosen from the table below, or create a custom activation. As described in Specifying Layers, the activation key can be included in your model specification either using the activation keyword argument in a layer dictionary, or by including the key in a tuple with the layer size:

net = theanets.Regressor([10, (10, 'tanh'), 10])
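The two spellings described above carry the same information. The sketch below builds both as plain Python data and compares them with a hypothetical helper, `describe`, which is not part of theanets; the `size` key in the dictionary form is assumed here to be the name of the layer-size entry.

```python
# Two ways of writing the same three-layer specification. These are plain
# Python data structures; nothing here imports or calls theanets.
tuple_spec = [10, (10, 'tanh'), 10]
dict_spec = [10, dict(size=10, activation='tanh'), 10]

def describe(layer):
    """Hypothetical helper: reduce one layer entry to (size, activation).

    An activation of None stands for "use the model's default".
    """
    if isinstance(layer, int):
        return layer, None
    if isinstance(layer, dict):
        return layer['size'], layer.get('activation')
    size, activation = layer  # assumes a (size, key) tuple
    return size, activation

print([describe(layer) for layer in tuple_spec])
print([describe(layer) for layer in dict_spec])
```

Either list could then be passed as the layer specification in place of the one shown above.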

The activations that theanets provides are:

Key	Description	\(g(z) =\)
linear	linear	\(z\)
sigmoid	logistic sigmoid	\((1 + \exp(-z))^{-1}\)
logistic	logistic sigmoid	\((1 + \exp(-z))^{-1}\)
tanh	hyperbolic tangent	\(\tanh(z)\)
softplus	smooth relu approximation	\(\log(1 + \exp(z))\)
softmax	categorical distribution	\(\exp(z) / \sum\exp(z)\)
relu	rectified linear	\(\max(0, z)\)
trel	truncated rectified linear	\(\max(0, \min(1, z))\)
trec	thresholded rectified linear	\(z \mbox{ if } z > 1 \mbox{ else } 0\)
tlin	thresholded linear	\(z \mbox{ if } |z| > 1 \mbox{ else } 0\)
rect:min	truncation	\(\min(1, z)\)
rect:max	rectification	\(\max(0, z)\)
norm:mean	mean-normalization	\(z - \bar{z}\)
norm:max	max-normalization	\(z / \max |z|\)
norm:std	variance-normalization	\(z / \sqrt{\mathbb{E}[(z-\bar{z})^2]}\)
norm:z	z-score normalization	\((z-\bar{z}) / \sqrt{\mathbb{E}[(z-\bar{z})^2]}\)
prelu	relu with parametric leak	\(\max(0, z) - \max(0, -rz)\)
lgrelu	relu with leak and gain	\(\max(0, gz) - \max(0, -rz)\)
maxout	piecewise linear	\(\max_i m_i z\)
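Several of the less familiar entries in the table are easier to read as code. The following is a plain-Python sketch of the formulas only; the function names mirror the table keys, the bars around \(z\) in the tlin and norm:max rows are read as an absolute value, and the values of the leak `r` and gain `g` are picked arbitrarily for illustration.

```python
def trel(z):
    """Truncated rectified linear: clamp z to the range [0, 1]."""
    return max(0.0, min(1.0, z))

def trec(z):
    """Thresholded rectified linear: pass z only when z > 1."""
    return z if z > 1 else 0.0

def tlin(z):
    """Thresholded linear: pass z only when its magnitude exceeds 1."""
    return z if abs(z) > 1 else 0.0

def prelu(z, r):
    """Relu with leak r: positive inputs pass, negative inputs are scaled by r."""
    return max(0.0, z) - max(0.0, -r * z)

def lgrelu(z, r, g):
    """Relu with leak r and gain g: positive inputs are scaled by g."""
    return max(0.0, g * z) - max(0.0, -r * z)

def norm_mean(z):
    """Mean-normalization of a vector: subtract the mean from each element."""
    mean = sum(z) / len(z)
    return [v - mean for v in z]

def norm_max(z):
    """Max-normalization of a vector: divide by the largest magnitude."""
    biggest = max(abs(v) for v in z)
    return [v / biggest for v in z]

print(trel(1.5), trec(0.5), tlin(-2.0))  # 1.0 0.0 -2.0
print(prelu(3.0, r=0.1), lgrelu(3.0, r=0.1, g=2.0))  # 3.0 6.0
print(norm_mean([1.0, 2.0, 3.0]))  # [-1.0, 0.0, 1.0]
print(norm_max([-4.0, 2.0]))  # [-1.0, 0.5]
```

Note the difference between trec and tlin: a large negative input such as -2.0 is zeroed by trec but passed through by tlin.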

Composition

Activation functions can also be composed by joining multiple function names with a +. For example, to create a layer that uses a batch-normalized hyperbolic tangent activation:

net = theanets.Regressor([10, (10, 'tanh+norm:z'), 10])

As with any function composition, the order of the components matters. Unlike the usual notation for mathematical function composition, however, the functions are applied from left to right: 'tanh+norm:z' applies tanh first and then norm:z to the result.
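The effect of ordering can be seen in a small plain-Python sketch. The `compose` helper and the `TABLE` lookup are hypothetical names used only for this illustration and do not reflect how theanets parses the key internally; relu and norm:mean are used because their results are easy to check by hand.

```python
from functools import reduce

def relu(z):
    """Rectified linear, applied to each element."""
    return [max(0.0, v) for v in z]

def norm_mean(z):
    """Mean-normalization: subtract the mean from each element."""
    mean = sum(z) / len(z)
    return [v - mean for v in z]

# Hypothetical lookup from activation key to function.
TABLE = {'relu': relu, 'norm:mean': norm_mean}

def compose(key):
    """Build one function from a '+'-joined key, applying parts left to right."""
    parts = [TABLE[name] for name in key.split('+')]
    return lambda z: reduce(lambda acc, f: f(acc), parts, z)

z = [-1.0, 3.0]
print(compose('relu+norm:mean')(z))  # [-1.5, 1.5]
print(compose('norm:mean+relu')(z))  # [0.0, 2.0]
```

In the first case the rectification runs before the mean is removed, so the output can be negative; in the second the rectification runs last, so it cannot.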

Custom Activations

To define a custom activation, create a subclass of theanets.Activation, and implement the __call__ method to make the class instance callable. The callable will be given one argument, the array of layer outputs to activate.
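The callable-class pattern described above might look like the following sketch. To keep it runnable without theanets installed, it defines a stand-in `Activation` base class; in real use the subclass would derive from theanets.Activation instead. The `Clip` class and its behavior are hypothetical, and the input here is a plain list, whereas theanets would presumably pass a symbolic Theano expression that needs Theano operations rather than Python loops.

```python
class Activation:
    """Stand-in for theanets.Activation so that this sketch runs on its own."""

class Clip(Activation):
    """Hypothetical custom activation: clip each value to the range [-1, 1]."""

    def __call__(self, x):
        # The single argument is the array of layer outputs to activate.
        return [max(-1.0, min(1.0, v)) for v in x]

clip = Clip()
print(clip([-3.0, 0.5, 2.0]))  # [-1.0, 0.5, 1.0]
```

Because the instance itself is callable, it can be used anywhere a function of one argument is expected.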

Theanets 0.7.3 documentation
