# theanets.feedforward.Autoencoder

class theanets.feedforward.Autoencoder(layers, loss='mse', weighted=False, rng=13)

An autoencoder network attempts to reproduce its input.

Notes

Autoencoder models default to an MSE (mean squared error) loss. To use a different loss, pass a non-default value for the loss keyword argument when constructing your model.
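
As a rough illustration of what the default loss measures, the NumPy sketch below computes a mean squared reconstruction error. The helper name `mse_loss` is hypothetical, and averaging over every element of the array is an assumption made for illustration; it is not taken from the theanets source.

```python
import numpy as np

def mse_loss(inputs, outputs):
    """Mean squared reconstruction error, averaged over every element."""
    diff = np.asarray(outputs, dtype=float) - np.asarray(inputs, dtype=float)
    return float(np.mean(diff ** 2))

x = np.array([[0.0, 1.0], [2.0, 3.0]])
x_hat = np.array([[0.0, 1.0], [2.0, 5.0]])  # one entry is off by 2
print(mse_loss(x, x_hat))  # 4 / 4 elements = 1.0
```

For an autoencoder the target is the input itself, so a perfect reconstruction gives a loss of zero.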

Formally, an autoencoder defines a parametric mapping from a data space to the same space:

$F_\theta: \mathcal{S} \to \mathcal{S}$

Often, this mapping can be decomposed into an “encoding” stage $$f_\alpha(\cdot)$$ and a corresponding “decoding” stage $$g_\beta(\cdot)$$ to and from some latent space $$\mathcal{Z} = \mathbb{R}^{n_z}$$:

$\begin{aligned} f_\alpha &: \mathcal{S} \to \mathcal{Z} \\ g_\beta &: \mathcal{Z} \to \mathcal{S} \end{aligned}$
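
The decomposition above can be sketched in plain NumPy. Everything in this block is illustrative: the parameter names `W`, `b`, `V`, `c`, the tanh encoder, and the sizes 10 and 5 are assumptions chosen for the example, not theanets internals.

```python
import numpy as np

rng = np.random.RandomState(13)
n_s, n_z = 10, 5                   # dimensions of the data space S and latent space Z

W = 0.1 * rng.randn(n_s, n_z)      # encoder parameters (alpha)
b = np.zeros(n_z)
V = 0.1 * rng.randn(n_z, n_s)      # decoder parameters (beta)
c = np.zeros(n_s)

def f(x):
    """Encoding stage: maps points in S to points in Z."""
    return np.tanh(x @ W + b)

def g(z):
    """Decoding stage: maps points in Z back to S."""
    return z @ V + c

def F(x):
    """The full autoencoder mapping S -> S is the composition of g and f."""
    return g(f(x))

x = rng.randn(3, n_s)              # three data points
print(f(x).shape, F(x).shape)      # (3, 5) (3, 10)
```

The parameters here are untrained, so `F(x)` will not resemble `x`; the point is only that the output has the same shape as the input while the code `f(x)` lives in the smaller latent space.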

Autoencoders form an interesting class of models for several reasons. They:

• require only “unlabeled” data (which is typically easy to obtain),
• are generalizations of many popular density estimation techniques, and
• can be used to model the “manifold” or density of a dataset.

Many extremely common dimensionality reduction techniques can be expressed as autoencoders. For instance, Principal Component Analysis (PCA) can be expressed as a model with two tied, linear layers:

>>> import theanets
>>> pca = theanets.Autoencoder([10, (5, 'linear'), (10, 'tied')])
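
To see why PCA fits this mold, the NumPy sketch below writes the classical PCA projection as an encode step followed by a decode step that reuses the transposed weights. It computes the solution directly with an SVD rather than by training, and the variable names and synthetic data are assumptions made for illustration. A trained tied linear autoencoder would be expected to recover the same subspace, though not necessarily the same individual axes.

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(200, 10) @ rng.randn(10, 10)   # synthetic correlated data
X = X - X.mean(axis=0)                       # PCA assumes centered data

k = 5
U, s, Vt = np.linalg.svd(X, full_matrices=False)
W = Vt[:k].T                                 # (10, 5): top-k principal directions

Z = X @ W                                    # "encode": linear layer with weights W
X_hat = Z @ W.T                              # "decode": tied layer reuses W transposed

# The squared reconstruction error equals the energy in the discarded components.
err = np.sum((X - X_hat) ** 2)
discarded = np.sum(s[k:] ** 2)
print(np.isclose(err, discarded))            # True
```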


Similarly, Independent Component Analysis (ICA) can be expressed as the same model, but trained with a sparsity penalty on the hidden-layer activations:

>>> ica = pca
>>> ica.train([inputs], hidden_l1=0.1)
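
A sparsity penalty of this kind can be pictured as an extra term added to the reconstruction loss. The sketch below is a NumPy illustration only: the function name `sparse_loss` is hypothetical, and taking the mean absolute hidden activation scaled by the `hidden_l1` weight is an assumption about the form of the penalty, not a copy of the theanets regularizer.

```python
import numpy as np

def sparse_loss(x, x_hat, hidden, hidden_l1=0.1):
    """Reconstruction MSE plus an L1 penalty on the hidden-layer activations."""
    mse = np.mean((x_hat - x) ** 2)
    penalty = hidden_l1 * np.mean(np.abs(hidden))
    return float(mse + penalty)

x = np.zeros((2, 2))
x_hat = np.zeros((2, 2))                       # perfect reconstruction, MSE is 0
hidden = np.array([[1.0, -1.0], [0.0, 2.0]])   # mean absolute activation is 1.0
print(sparse_loss(x, x_hat, hidden))           # 0.1
```

Because the penalty grows with the magnitude of every hidden activation, minimizing it pushes most activations toward zero.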


In this light, “nonlinear PCA” is quite easy to formulate as well!

Examples

To create an autoencoder, just create a new model instance. Often you’ll provide the layer configuration at this time:

>>> model = theanets.Autoencoder([10, 20, 10])


If you want to create an autoencoder with tied weights, specify that layer type when creating the model:

>>> model = theanets.Autoencoder([10, 20, (10, 'tied')])


Data

Training data for an autoencoder takes the form of a two-dimensional array. The shape of this array is (num-examples, num-variables): the first axis enumerates data points in a batch, and the second enumerates the variables in the model.

For instance, to create a training dataset containing 1000 examples:

>>> import numpy as np
>>> inputs = np.random.randn(1000, 10).astype('f')


Training

Training the model can be as simple as calling the train() method:

>>> model.train([inputs])


Use

A model can be used to predict() the output of some input data points:

>>> test = np.random.randn(3, 10).astype('f')
>>> print(model.predict(test))


Additionally, autoencoders can encode() a set of input data points:

>>> enc = model.encode(test)
>>> enc.shape
(3, 20)


The model can also decode() a set of encoded data:

>>> model.decode(enc)


__init__(layers, loss='mse', weighted=False, rng=13)

Methods

• __init__(layers[, loss, weighted, rng])
• add_layer([layer, is_output]): Add a layer to our network graph.
• add_loss([loss]): Add a loss function to the model.
• build_graph([regularizers]): Connect the layers in this network to form a computation graph.
• decode(z[, layer]): Decode an encoded dataset by computing the output layer activation.
• encode(x[, layer, sample]): Encode a dataset using the hidden layer activations of our network.
• feed_forward(x, **kwargs): Compute a forward pass of all layers from the given input.
• find(which, param): Get a parameter from a layer in the network.
• itertrain(train[, valid, algo, subalgo, ...]): Train our network, one batch at a time.
• load(filename): Load a saved network from disk.
• loss(**kwargs): Return a variable representing the regularized loss for this network.
• monitors(**kwargs): Return expressions that should be computed to monitor training.
• predict(x, **kwargs): Compute a forward pass of the inputs, returning the network output.
• save(filename): Save the state of this network to a pickle file on disk.
• score(x[, w]): Compute R^2 coefficient of determination for a given input.
• set_loss(*args, **kwargs): Clear the current loss functions from the network and add a new one.
• train(*args, **kwargs): Train the network until the trainer converges.
• updates(**kwargs): Return expressions to run as updates during network training.

Attributes

• DEFAULT_OUTPUT_ACTIVATION
• INPUT_NDIM
• OUTPUT_NDIM
• inputs: A list of Theano variables for feedforward computations.
• num_params: Number of parameters in the entire network model.
• params: A list of the learnable Theano parameters for this network.
• variables: A list of Theano variables for loss computations.

decode(z, layer=None)

Decode an encoded dataset by computing the output layer activation.

Parameters:

• z : ndarray. A matrix containing encoded data from this autoencoder.
• layer : int or str or Layer, optional. The index or name of the hidden layer that was used to encode z.

Returns:

• decoded : ndarray. The decoded dataset.

encode(x, layer=None, sample=False)

Encode a dataset using the hidden layer activations of our network.

Parameters:

• x : ndarray. A dataset to encode. Rows of this dataset capture individual data points, while columns represent the variables in each data point.
• layer : str, optional. The name of the hidden layer output to use. By default, we use the “middle” hidden layer: for example, for a 4,2,4 or 4,3,2,3,4 autoencoder, we use the layer with size 2.
• sample : bool, optional. If True, then draw a sample using the hidden activations as independent Bernoulli probabilities for the encoded data. This assumes the hidden layer has a logistic sigmoid activation function.

Returns:

• ndarray. The given dataset, encoded by the appropriate hidden layer activation.

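
The two behaviors described for layer and sample can be sketched in NumPy. Both helpers below are hypothetical names, and both rules are assumptions made for illustration: picking the middle layer by integer division of the layer count, and drawing one independent Bernoulli sample per hidden activation.

```python
import numpy as np

def middle_layer_index(sizes):
    """Index of the "middle" layer for a list of layer sizes (assumed rule)."""
    return len(sizes) // 2

def sample_bernoulli(probs, rng):
    """Draw 0/1 samples, treating each activation as an independent probability."""
    return rng.binomial(n=1, p=probs).astype(np.uint8)

sizes = [4, 3, 2, 3, 4]
print(sizes[middle_layer_index(sizes)])           # 2

rng = np.random.RandomState(13)
hidden = 1.0 / (1.0 + np.exp(-rng.randn(3, 2)))   # logistic sigmoid outputs lie in (0, 1)
sample = sample_bernoulli(hidden, rng)
print(sample.shape)                               # (3, 2)
```

Sampling only makes sense when the activations lie in [0, 1], which is why the description above assumes a logistic sigmoid hidden layer.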
score(x, w=None)

Compute R^2 coefficient of determination for a given input.

Parameters:

• x : ndarray (num-examples, num-inputs). An array containing data to be fed into the network. Multiple examples are arranged as rows in this array, with columns containing the variables for each example.

Returns:

• r2 : float. The R^2 coefficient of determination between the prediction of this network and its input. This can serve as one measure of the information loss of the autoencoder.
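
For reference, an R^2 value of this kind can be computed as one minus the ratio of residual to total sum of squares. The NumPy sketch below uses a hypothetical helper `r2_score` that takes the reconstruction directly; centering on the mean of all entries (rather than per column) is an assumption, and the optional weights `w` are not modeled.

```python
import numpy as np

def r2_score(x, x_hat):
    """R^2 between an input array and its reconstruction."""
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    residual = np.sum((x - x_hat) ** 2)     # error left after reconstruction
    total = np.sum((x - x.mean()) ** 2)     # spread of the data around its mean
    return float(1.0 - residual / total)

x = np.array([[1.0, 2.0], [3.0, 4.0]])
print(r2_score(x, x))                          # 1.0: perfect reconstruction
print(r2_score(x, np.full_like(x, x.mean())))  # 0.0: no better than predicting the mean
```

A score near 1 means the autoencoder preserves most of the variation in its input; a score near 0 means the reconstruction carries little more information than the data mean.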