# Activation Functions

An activation function (sometimes also called a transfer function) specifies how the final output of a layer is computed from the weighted sums of the inputs.

By default, hidden layers in theanets use a rectified linear activation function: $$g(z) = \max(0, z)$$.

Output layers in theanets.Regressor and theanets.Autoencoder models use linear activations (i.e., the output is just the weighted sum of the inputs from the previous layer: $$g(z) = z$$), and the output layer in theanets.Classifier models uses a softmax activation: $$g(z) = \exp(z) / \sum\exp(z)$$.
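For instance, the softmax activation maps any vector of weighted sums onto a categorical distribution: every output is positive and the outputs sum to one. A minimal numpy sketch of the math (illustrative only — theanets builds this symbolically with Theano):

```python
import numpy as np

def softmax(z):
    # Subtract the max before exponentiating for numerical stability;
    # the result is unchanged because
    # exp(z - c) / sum(exp(z - c)) == exp(z) / sum(exp(z)).
    e = np.exp(z - z.max())
    return e / e.sum()

z = np.array([1.0, 2.0, 3.0])
p = softmax(z)  # positive values that sum to 1; largest input gets largest probability
```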

To specify a different activation function for a layer, include an activation key chosen from the table below, or create a custom activation. As described in Specifying Layers, the activation key can be included in your model specification either using the activation keyword argument in a layer dictionary, or by including the key in a tuple with the layer size:

net = theanets.Regressor([10, (10, 'tanh'), 10])


The activations that theanets provides are:

| Key | Description | $$g(z) =$$ |
| --- | --- | --- |
| linear | linear | $$z$$ |
| sigmoid | logistic sigmoid | $$(1 + \exp(-z))^{-1}$$ |
| logistic | logistic sigmoid | $$(1 + \exp(-z))^{-1}$$ |
| tanh | hyperbolic tangent | $$\tanh(z)$$ |
| softplus | smooth relu approximation | $$\log(1 + \exp(z))$$ |
| softmax | categorical distribution | $$\exp(z) / \sum\exp(z)$$ |
| relu | rectified linear | $$\max(0, z)$$ |
| trel | truncated rectified linear | $$\max(0, \min(1, z))$$ |
| trec | thresholded rectified linear | $$z \mbox{ if } z > 1 \mbox{ else } 0$$ |
| tlin | thresholded linear | $$z \mbox{ if } \lvert z \rvert > 1 \mbox{ else } 0$$ |
| rect:min | truncation | $$\min(1, z)$$ |
| rect:max | rectification | $$\max(0, z)$$ |
| norm:mean | mean-normalization | $$z - \bar{z}$$ |
| norm:max | max-normalization | $$z / \max \lvert z \rvert$$ |
| norm:std | standard-deviation normalization | $$z / \sqrt{\mathbb{E}[(z-\bar{z})^2]}$$ |
| norm:z | z-score normalization | $$(z-\bar{z}) / \sqrt{\mathbb{E}[(z-\bar{z})^2]}$$ |
| prelu | relu with parametric leak | $$\max(0, z) - \max(0, -rz)$$ |
| lgrelu | relu with leak and gain | $$\max(0, gz) - \max(0, -rz)$$ |
| maxout | piecewise linear | $$\max_i m_i z$$ |
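Several of these are easy to check numerically. The sketch below (plain numpy, for illustration only) evaluates relu, softplus, and the logistic sigmoid at a few sample points and confirms that softplus behaves as a smooth upper approximation of relu:

```python
import numpy as np

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])

relu = np.maximum(0, z)             # max(0, z)
softplus = np.log1p(np.exp(z))      # log(1 + exp(z)), a smooth relu approximation
sigmoid = 1.0 / (1.0 + np.exp(-z))  # logistic sigmoid, (1 + exp(-z))^-1

# softplus lies everywhere above relu and approaches it away from zero,
# while sigmoid passes through 0.5 at z = 0.
```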

## Composition

Activation functions can also be composed by concatenating multiple function names together using a +. For example, to create a layer that uses a batch-normalized hyperbolic tangent activation:

net = theanets.Regressor([10, (10, 'tanh+norm:z'), 10])


As with mathematical function composition, the order of the components matters! Unlike mathematical composition notation, however, the functions are applied from left to right: 'tanh+norm:z' first applies tanh and then z-score-normalizes the result.
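The effect of ordering is easy to see numerically. In this numpy sketch (illustrative only — theanets builds these as Theano expressions), z-scoring the tanh outputs generally gives different values than applying tanh to z-scored inputs:

```python
import numpy as np

def zscore(z):
    # z-score normalization: (z - mean) / standard deviation
    return (z - z.mean()) / z.std()

z = np.array([-3.0, 0.0, 1.0, 5.0])

left_to_right = zscore(np.tanh(z))  # 'tanh+norm:z': tanh first, then norm:z
right_to_left = np.tanh(zscore(z))  # the reverse order produces different values
```

Note that the left-to-right result has mean zero (normalization happened last), while the reverse does not in general.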

## Custom Activations

To define a custom activation, create a subclass of theanets.Activation and implement the __call__ method to make class instances callable. The callable is given one argument: the array of layer outputs to activate.
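The sketch below illustrates the described interface with a hypothetical "swish" activation, $$g(z) = z \cdot \mathrm{sigmoid}(z)$$. It is a plain numpy stand-in so it can run on its own; in theanets the class would subclass theanets.Activation, and __call__ would operate on Theano expressions rather than numpy arrays (check the theanets source for the exact constructor signature):

```python
import numpy as np

class Swish:
    """Hypothetical custom activation: g(z) = z * sigmoid(z).

    In theanets this would be a subclass of theanets.Activation;
    only the __call__ method shown here is required by the
    documented interface.
    """

    def __call__(self, x):
        # x is the array of layer outputs to activate
        return x * (1.0 / (1.0 + np.exp(-x)))

act = Swish()
out = act(np.array([-1.0, 0.0, 1.0]))
```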