# Activation Functions

An activation function (sometimes also called a transfer function) specifies how the final output of a layer is computed from the weighted sums of the inputs.

By default, hidden layers in `theanets` use a rectified linear activation function: \(g(z) = \max(0, z)\).

Output layers in `theanets.Regressor` and `theanets.Autoencoder` models use linear activations (i.e., the output is just the weighted sum of the inputs from the previous layer: \(g(z) = z\)), and the output layer in `theanets.Classifier` models uses a softmax activation: \(g(z) = \exp(z) / \sum\exp(z)\).

To specify a different activation function for a layer, include an activation key chosen from the table below, or create a custom activation. As described in Specifying Layers, the activation key can be included in your model specification either by using the `activation` keyword argument in a layer dictionary, or by including the key in a tuple with the layer size:

```
net = theanets.Regressor([10, (10, 'tanh'), 10])
```
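
The same layer can also be given as a layer dictionary with an explicit `activation` keyword. The sketch below assumes the layer width is supplied via the `size` key, as in the Specifying Layers documentation:

```
import theanets

# Hidden layer given as a dictionary; the 'activation' keyword replaces
# the (size, activation) tuple used above.
net = theanets.Regressor([10, dict(size=10, activation='tanh'), 10])
```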

The activations that `theanets` provides are:

Key | Description | \(g(z) =\) |
---|---|---|
linear | linear | \(z\) |
sigmoid | logistic sigmoid | \((1 + \exp(-z))^{-1}\) |
logistic | logistic sigmoid | \((1 + \exp(-z))^{-1}\) |
tanh | hyperbolic tangent | \(\tanh(z)\) |
softplus | smooth relu approximation | \(\log(1 + \exp(z))\) |
softmax | categorical distribution | \(\exp(z) / \sum\exp(z)\) |
relu | rectified linear | \(\max(0, z)\) |
trel | truncated rectified linear | \(\max(0, \min(1, z))\) |
trec | thresholded rectified linear | \(z \mbox{ if } z > 1 \mbox{ else } 0\) |
tlin | thresholded linear | \(z \mbox{ if } \vert z \vert > 1 \mbox{ else } 0\) |
rect:min | truncation | \(\min(1, z)\) |
rect:max | rectification | \(\max(0, z)\) |
norm:mean | mean-normalization | \(z - \bar{z}\) |
norm:max | max-normalization | \(z / \max \vert z \vert\) |
norm:std | variance-normalization | \(z / \mathbb{E}[(z-\bar{z})^2]\) |
norm:z | z-score normalization | \((z-\bar{z}) / \mathbb{E}[(z-\bar{z})^2]\) |
prelu | relu with parametric leak | \(\max(0, z) - \max(0, -rz)\) |
lgrelu | relu with leak and gain | \(\max(0, gz) - \max(0, -rz)\) |
maxout | piecewise linear | \(\max_i m_i z\) |

## Composition

Activation functions can also be composed by concatenating multiple function names together using a `+`. For example, to create a layer that uses a batch-normalized hyperbolic tangent activation:

```
net = theanets.Regressor([10, (10, 'tanh+norm:z'), 10])
```

As with mathematical function composition, the order of the components matters. Unlike the usual mathematical notation, however, the functions are applied from left to right: in the example above, `tanh` is applied first and the z-score normalization is applied to its output.
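
For instance, reversing the two components, using the same composition syntax, would apply the z-score normalization to the weighted sums first and then pass the result through the hyperbolic tangent:

```
import theanets

# Components are applied left to right: 'norm:z' first, then 'tanh'.
net = theanets.Regressor([10, (10, 'norm:z+tanh'), 10])
```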

## Custom Activations

To define a custom activation, create a subclass of `theanets.Activation` and implement the `__call__` method to make the class instance callable. The callable will be given one argument, the array of layer outputs to activate.
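
As a minimal sketch, the subclass below defines a hypothetical squared-rectifier activation; the class name `Relu2` and its behavior are invented for illustration, and the example assumes the argument passed to `__call__` is a symbolic (Theano) array supporting element-wise arithmetic:

```
import theanets

class Relu2(theanets.Activation):
    '''Hypothetical activation: square the layer outputs, zeroing negatives.'''

    def __call__(self, x):
        # x is the array of layer outputs to activate; the comparison and
        # multiplication are element-wise, so negative values map to zero.
        return x * x * (x > 0)
```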