theanets.feedforward.Autoencoder

class theanets.feedforward.Autoencoder(layers, weighted=False, sparse_input=False)

An autoencoder attempts to reproduce its input.

Some types of neural network models have been shown to learn useful features from a set of data without requiring any label information. This learning task is often referred to as feature learning or manifold learning. A class of neural network architectures known as autoencoders is ideally suited for this task. An autoencoder takes a data sample as input and attempts to produce the same data sample as its output. Formally, an autoencoder defines a mapping from a source space to itself:

\[F_\theta: \mathcal{S} \to \mathcal{S}\]

Often, this mapping can be decomposed into an “encoding” stage \(f_\alpha(\cdot)\) and a corresponding “decoding” stage \(g_\beta(\cdot)\) to and from some latent space \(\mathcal{Z} = \mathbb{R}^{n_z}\):

\[\begin{aligned} f_\alpha &: \mathcal{S} \to \mathcal{Z} \\ g_\beta &: \mathcal{Z} \to \mathcal{S} \end{aligned}\]
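
As an illustration (not theanets code), the two stages can be sketched in plain numpy as an affine encoder and decoder; the weights, sizes, and sigmoid nonlinearity below are arbitrary choices:

import numpy as np

# Illustrative sketch only (not theanets code): the encoder f_alpha and
# decoder g_beta realized as affine maps, with a sigmoid nonlinearity on
# the encoding.  The sizes d=4 and n_z=2 are arbitrary.
d, n_z = 4, 2
rng = np.random.RandomState(0)
W_enc, b_enc = rng.randn(d, n_z), np.zeros(n_z)
W_dec, b_dec = rng.randn(n_z, d), np.zeros(d)

def f_alpha(x):
    # encode: S -> Z
    return 1.0 / (1.0 + np.exp(-(x.dot(W_enc) + b_enc)))

def g_beta(z):
    # decode: Z -> S
    return z.dot(W_dec) + b_dec

x = rng.randn(3, d)          # three samples from the source space S
x_hat = g_beta(f_alpha(x))   # F_theta(x) = g_beta(f_alpha(x))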

Autoencoders form an interesting class of models for several reasons. They:

  • require only “unlabeled” data (which is typically easy to obtain),
  • are generalizations of many popular density estimation techniques, and
  • can be used to model the “manifold” or density of a dataset.

Given a dataset containing \(m\) \(d\)-dimensional input samples \(X \in \mathbb{R}^{m \times d}\), the loss that the autoencoder model optimizes with respect to the model parameters \(\theta\) is:

\[\begin{aligned} \mathcal{L}(X, \theta) &= \frac{1}{m} \sum_{i=1}^m \| F_\theta(x_i) - x_i \|_2^2 + R(X, \theta) \\ &= \frac{1}{m} \sum_{i=1}^m \| g_\beta(f_\alpha(x_i)) - x_i \|_2^2 + R(X, \alpha, \beta) \end{aligned}\]

where \(R\) is a regularization function.
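
Continuing the numpy sketch above, the loss can be written out directly; here \(R\) is taken to be a simple L2 weight penalty with a hypothetical coefficient lm:

# Continuation of the numpy sketch above (reuses W_enc, W_dec, f_alpha,
# g_beta).  R is a plain L2 weight penalty; `lm` is a hypothetical
# regularization strength.
def loss(X, lm=0.01):
    recon = g_beta(f_alpha(X))                       # F_theta(x_i) for all i
    mse = np.mean(np.sum((recon - X) ** 2, axis=1))  # mean squared L2 error
    R = lm * (np.sum(W_enc ** 2) + np.sum(W_dec ** 2))
    return mse + R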

A generic autoencoder can be defined in theanets by using the Autoencoder class:

exp = theanets.Experiment(theanets.Autoencoder)

The layers parameter is required to define such a model; it can be provided on the command line using --layers A B C ... A, or in your code:

exp = theanets.Experiment(
    theanets.Autoencoder,
    layers=(A, B, C, ..., A))
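
For instance, a hypothetical 784-256-784 model for flattened 28x28 images might be built and trained as follows; the layer sizes, the train_data array, and reliance on the default trainer are all placeholders for your own setup:

exp = theanets.Experiment(
    theanets.Autoencoder,
    layers=(784, 256, 784))   # hypothetical sizes for flattened 28x28 images
exp.train(train_data)         # train_data: ndarray of shape (m, 784)
codes = exp.network.encode(train_data)   # 256-dimensional codes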

Autoencoders retain all attributes of the parent Network class, but can additionally have "tied weights" if the layer configuration is palindromic.
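
Whether a given configuration actually ends up with tied weights can be checked through the tied_weights attribute documented below; this sketch assumes a small palindromic model:

exp = theanets.Experiment(theanets.Autoencoder, layers=(4, 2, 4))
print(exp.network.tied_weights)   # True only if the weights are actually tied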

__init__(layers, weighted=False, sparse_input=False)

Methods

decode(z[, layer]) Decode an encoded dataset by computing the output layer activation.
encode(x[, layer, sample]) Encode a dataset using the hidden layer activations of our network.
score(x[, w]) Compute the R^2 coefficient of determination for a given input.

Attributes

num_params Number of parameters in the entire network model.
params A list of the learnable theano parameters for this network.
tied_weights A boolean indicating whether this network uses tied weights.

decode(z, layer=None)

Decode an encoded dataset by computing the output layer activation.

Parameters:

z : ndarray

A matrix containing encoded data from this autoencoder.

layer : int or str or Layer, optional

The index or name of the hidden layer that was used to encode z.

Returns:

decoded : ndarray

The decoded dataset.
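
A minimal usage sketch, assuming net is a trained Autoencoder and x is an (m, d) ndarray:

z = net.encode(x)      # codes from the default ("middle") hidden layer
x_hat = net.decode(z)  # reconstructions with the same shape as x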

encode(x, layer=None, sample=False)

Encode a dataset using the hidden layer activations of our network.

Parameters:

x : ndarray

A dataset to encode. Rows of this dataset capture individual data points, while columns represent the variables in each data point.

layer : str, optional

The name of the hidden layer output to use. By default, we use the “middle” hidden layer—for example, for a 4,2,4 or 4,3,2,3,4 autoencoder, we use the “2” layer (typically named “hid1” or “hid2”, respectively).

sample : bool, optional

If True, then draw a sample using the hidden activations as independent Bernoulli probabilities for the encoded data. This assumes the hidden layer has a logistic sigmoid activation function.

Returns:

encoded : ndarray

The given dataset, encoded by the appropriate hidden layer activation.
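
A usage sketch, assuming net is a trained Autoencoder, x is an ndarray of samples, and a hidden layer named 'hid1' with a logistic sigmoid activation (per the naming convention noted above):

z = net.encode(x, layer='hid1')                   # real-valued activations
z_bin = net.encode(x, layer='hid1', sample=True)  # Bernoulli samples (0/1)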

score(x, w=None)

Compute the R^2 coefficient of determination for a given input.

Parameters:

x : ndarray (num-examples, num-inputs)

An array containing data to be fed into the network. Multiple examples are arranged as rows in this array, with columns containing the variables for each example.

Returns:

r2 : float

The R^2 correlation between the prediction of this network and its input. This can serve as one measure of the information loss of the autoencoder.
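
A usage sketch, assuming net is a trained Autoencoder and x_train / x_valid are placeholder arrays of training and held-out samples:

print('train R^2:', net.score(x_train))
print('valid R^2:', net.score(x_valid))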

tied_weights

A boolean indicating whether this network uses tied weights.