Activation Functions
An activation function (sometimes also called a transfer function) specifies how the final output of a layer is computed from the weighted sums of the inputs.
By default, hidden layers in theanets use a rectified linear activation function: \(g(z) = \max(0, z)\).
Output layers in theanets.Regressor and theanets.Autoencoder models use linear activations (i.e., the output is just the weighted sum of the inputs from the previous layer: \(g(z) = z\)), and the output layer in theanets.Classifier models uses a softmax activation: \(g(z)_i = \exp(z_i) / \sum_j \exp(z_j)\).
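The three default activations can be sketched in plain Python to make the formulas concrete. This is only an illustration using the standard library, not the code theanets runs internally; the function names relu, linear, and softmax are chosen here for the example, and the max-subtraction inside softmax is a common numerical-stability step that is assumed rather than taken from theanets.

```python
import math


def relu(z):
    """Rectified linear activation: g(z) = max(0, z)."""
    return max(0.0, z)


def linear(z):
    """Linear activation: g(z) = z (the weighted sum passes through unchanged)."""
    return z


def softmax(zs):
    """Softmax over a list of weighted sums: exp(z_i) / sum_j exp(z_j)."""
    # Subtracting the maximum leaves the result unchanged mathematically
    # but keeps exp() from overflowing for large inputs (an assumption of
    # this sketch, not something stated by the theanets documentation).
    shift = max(zs)
    exps = [math.exp(z - shift) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]
```

Unlike the other two, softmax acts on the whole vector of outputs at once, so its results are positive and sum to one.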
To specify a different activation function for a layer, include an activation key chosen from the table below, or create a custom activation. As described in Specifying Layers, the activation key can be included in your model specification either using the activation keyword argument in a layer dictionary, or by including the key in a tuple with the layer size:
net = theanets.Regressor([10, (10, 'tanh'), 10])
The activations that theanets provides are:
Composition
Activation functions can also be composed by concatenating multiple function names together using a +. For example, to create a layer that uses a batch-normalized hyperbolic tangent activation:
net = theanets.Regressor([10, (10, 'tanh+norm:z'), 10])
Just as with function composition, the order of the components matters! Unlike the notation for mathematical function composition, however, the functions are applied from left to right: 'tanh+norm:z' applies tanh first and then applies norm:z to the result.
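The left-to-right rule can be illustrated with a small standalone sketch. Everything here is hypothetical and independent of theanets: REGISTRY and compose are names invented for the example, and norm:z is assumed to mean z-score normalization (subtract the mean, divide by the standard deviation) over the values passed in.

```python
import math


def tanh(values):
    """Apply the hyperbolic tangent to each value."""
    return [math.tanh(v) for v in values]


def zscore(values):
    """Assumed meaning of 'norm:z': shift to zero mean, scale to unit std."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # avoid dividing by zero for constant input
    return [(v - mean) / std for v in values]


# Hypothetical lookup table from activation keys to functions.
REGISTRY = {'tanh': tanh, 'norm:z': zscore}


def compose(key):
    """Build a function that applies the '+'-separated parts left to right."""
    funcs = [REGISTRY[name] for name in key.split('+')]

    def apply(values):
        for func in funcs:  # the first name in the key is applied first
            values = func(values)
        return values

    return apply


tanh_then_norm = compose('tanh+norm:z')([0.5, 2.0, -1.0])
norm_then_tanh = compose('norm:z+tanh')([0.5, 2.0, -1.0])
```

In this sketch the two orderings give different results: tanh_then_norm has zero mean because normalization comes last, while norm_then_tanh stays strictly between -1 and 1 because tanh comes last.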
Custom Activations
To define a custom activation, create a subclass of theanets.Activation, and implement the __call__ method to make the class instance callable. The callable will be given one argument, the array of layer outputs to activate.
class ThresholdedLinear(theanets.Activation):
    def __call__(self, x):
        return x * (x > 1)
This example activation returns 0 if a layer output is less than or equal to 1, or the output value itself otherwise. In effect it is a linear activation for “large” outputs (i.e., greater than 1) and zero otherwise. To use it in a model, give the name of the activation:
net = theanets.Regressor([10, (10, 'thresholdedlinear'), 10])
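The thresholding arithmetic used by ThresholdedLinear can be checked on ordinary Python floats, without theanets installed. The helper thresholded_linear below is a hypothetical scalar stand-in for the __call__ method, which in a real model receives an array of layer outputs rather than a single number.

```python
def thresholded_linear(x):
    """Scalar stand-in for the expression x * (x > 1)."""
    # (x > 1) is a boolean; multiplying by it keeps x when the comparison
    # is True and produces zero when it is False.
    return x * (x > 1)


# Apply the stand-in to a few sample layer outputs, including the boundary 1.0.
outputs = [thresholded_linear(v) for v in (-2.0, 0.5, 1.0, 1.5, 3.0)]
```

Because the comparison is strict, an output of exactly 1 is mapped to zero; only values greater than 1 pass through unchanged.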