# theanets.activations.Maxout

class theanets.activations.Maxout(*args, **kwargs)

Arbitrary piecewise linear activation.

This activation is unusual in that it requires a parameter at initialization time: the number of linear pieces to use. Consider, for the moment, a layer with just one unit. A maxout activation with $k$ pieces uses a slope $m_k$ and an intercept $b_k$ for each linear piece, and transforms its input to the maximum over all of the pieces:

$$f(x) = \max_k \left( m_k x + b_k \right)$$

The parameters $m_k$ and $b_k$ are learnable.

For layers with more than one unit, the maxout activation allocates a slope $m_{ki}$ and an intercept $b_{ki}$ for each unit $i$ and each piece $k$. The activation for unit $x_i$ is:

$$f(x_i) = \max_k \left( m_{ki} x_i + b_{ki} \right)$$

Again, the slope and intercept parameters are learnable.

This activation is actually a generalization of the rectified linear activations; to see how, just allocate 2 pieces and set the intercepts to 0. The slopes of the relu activation are given by $m = (0, 1)$, those of the Prelu function are given by $m = (r, 1)$, and those of the LGrelu are given by $m = (r, g)$, where $r$ is the leak rate parameter and $g$ is a gain parameter.
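The transformation above is straightforward to state directly in NumPy. The sketch below is illustrative only (it is not the theanets implementation, and the names `maxout`, `slopes`, and `intercepts` are assumptions); it computes the per-unit maximum over the $k$ affine pieces and shows how the relu special case described above falls out.

```python
import numpy as np

def maxout(x, slopes, intercepts):
    """Maxout activation with k pieces, following the formulas above.

    x          : inputs, shape (batch, units)
    slopes     : the learnable m_{ki}, shape (k, units)
    intercepts : the learnable b_{ki}, shape (k, units)
    """
    # Evaluate every affine piece for every unit, then take the
    # elementwise maximum over the k pieces.
    pieces = slopes[None, :, :] * x[:, None, :] + intercepts[None, :, :]
    return pieces.max(axis=1)

# Two pieces with slopes (0, 1) and zero intercepts recover the relu:
x = np.array([[-2.0, 0.5, 3.0]])
slopes = np.array([[0.0, 0.0, 0.0],
                   [1.0, 1.0, 1.0]])
intercepts = np.zeros((2, 3))
print(maxout(x, slopes, intercepts))  # [[0.  0.5 3. ]]
```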

Note

To use this activation in a network layer specification, provide an activation string of the form 'maxout:k', where k is an integer giving the number of linear pieces.

For example, the layer tuple (100, 'rnn', 'maxout:10') specifies a vanilla RNN layer with 100 units and a maxout activation with 10 pieces.
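For concreteness, here is a hedged usage sketch of the note above. The layer sizes and the choice of theanets.Classifier and theanets.recurrent.Regressor are assumptions made for illustration, and the exact constructor details may vary between theanets versions.

```python
import theanets

# Feedforward classifier whose hidden layer uses a maxout activation
# with 4 linear pieces per unit; 'maxout:4' is the activation string
# described in the note above. (Layer sizes are arbitrary.)
net = theanets.Classifier([784, (100, 'maxout:4'), 10])

# Recurrent model using the layer tuple from the example above:
# a vanilla RNN layer with 100 units and a 10-piece maxout activation.
rnn = theanets.recurrent.Regressor([16, (100, 'rnn', 'maxout:10'), 2])
```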

Parameters:
    pieces : int
        Number of linear pieces to use in the activation.
__init__(*args, **kwargs)

Methods

 __init__(*args, **kwargs)