theanets.feedforward.Autoencoder¶

class theanets.feedforward.Autoencoder(layers, loss='mse', weighted=False, rng=13)¶

An autoencoder network attempts to reproduce its input.
Notes

Autoencoder models default to a mean squared error (MSE) loss. To use a different loss, provide a non-default value for the loss keyword argument when constructing your model.

Formally, an autoencoder defines a parametric mapping from a data space to the same space:

\[F_\theta: \mathcal{S} \to \mathcal{S}\]

Often, this mapping can be decomposed into an “encoding” stage \(f_\alpha(\cdot)\) and a corresponding “decoding” stage \(g_\beta(\cdot)\) to and from some latent space \(\mathcal{Z} = \mathbb{R}^{n_z}\):

\[\begin{eqnarray*} f_\alpha &:& \mathcal{S} \to \mathcal{Z} \\ g_\beta &:& \mathcal{Z} \to \mathcal{S} \end{eqnarray*}\]

Autoencoders form an interesting class of models for several reasons. They:
- require only “unlabeled” data (which is typically easy to obtain),
- are generalizations of many popular density estimation techniques, and
- can be used to model the “manifold” or density of a dataset.
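With the default mse loss, training amounts (roughly) to finding parameters that minimize the mean squared reconstruction error over the \(m\) training examples; regularizers such as the sparsity penalty used below add further terms to this objective:

\[\mathcal{L}(\theta) = \frac{1}{m} \sum_{i=1}^m \left\| x_i - F_\theta(x_i) \right\|_2^2\]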
Many extremely common dimensionality reduction techniques can be expressed as autoencoders. For instance, Principal Component Analysis (PCA) can be expressed as a model with two tied, linear layers:
>>> pca = theanets.Autoencoder([10, (5, 'linear'), (10, 'tied')])
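Training this model then uses the same call as any other theanets model; here inputs is assumed to be a (num_examples, 10) float array like the one built in the Data section below:

>>> pca.train([inputs])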
Similarly, Independent Component Analysis (ICA) can be expressed as the same model, but trained with a sparsity penalty on the hidden-layer activations:

>>> ica = pca
>>> ica.train([inputs], hidden_l1=0.1)
In this light, “nonlinear PCA” is quite easy to formulate as well!
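One plausible sketch simply replaces the linear hidden layer with a nonlinear one; the choice of 'sigmoid' activation here is an illustrative assumption, not a prescription:

>>> nlpca = theanets.Autoencoder([10, (5, 'sigmoid'), (10, 'tied')])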
Examples
To create an autoencoder, just create a new model instance. Often you’ll provide the layer configuration at this time:
>>> model = theanets.Autoencoder([10, 20, 10])
If you want to create an autoencoder with tied weights, specify that layer type when creating the model:
>>> model = theanets.Autoencoder([10, 20, (10, 'tied')])
See Creating a Model for more information.
Data
Training data for an autoencoder takes the form of a two-dimensional array. The shape of this array is (num_examples, num_variables): the first axis enumerates data points in a batch, and the second enumerates the variables in the model.
For instance, to create a training dataset containing 1000 examples:

>>> import numpy as np
>>> inputs = np.random.randn(1000, 10).astype('f')
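If the model was constructed with weighted=True, each training batch also carries a weight array; the sketch below assumes the weights share the shape of the inputs, which is stated here as an assumption rather than documented behavior:

>>> weighted_model = theanets.Autoencoder([10, 20, 10], weighted=True)
>>> weights = np.ones((1000, 10), 'f')  # assumed shape: one weight per input variable
>>> weighted_model.train([inputs, weights])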
Training
Training the model can be as simple as calling the train() method:

>>> model.train([inputs])
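For more control over the process, the itertrain() method listed below yields a pair of monitor dictionaries after each training iteration; this minimal sketch assumes the default training algorithm and the 'loss' monitor key:

>>> for train_monitors, valid_monitors in model.itertrain([inputs]):
...     print('loss:', train_monitors['loss'])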
See Training a Model for more information about training.
Use
A model can be used to predict() the output of some input data points:

>>> test = np.random.randn(3, 10).astype('f')
>>> print(model.predict(test))
Additionally, autoencoders can encode() a set of input data points:

>>> enc = model.encode(test)
>>> enc.shape
(3, 20)
The model can also decode() a set of encoded data:

>>> model.decode(enc)
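Since encoding followed by decoding is just one forward pass through the whole network, the round trip should match predict() up to floating-point noise; this check is a sketch, not a documented API guarantee:

>>> np.allclose(model.decode(model.encode(test)), model.predict(test))
True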
See Using a Model for more information about using models.

__init__(layers, loss='mse', weighted=False, rng=13)¶
Methods

__init__(layers[, loss, weighted, rng])
add_layer([layer, is_output])
    Add a layer to our network graph.
add_loss([loss])
    Add a loss function to the model.
build_graph([regularizers])
    Connect the layers in this network to form a computation graph.
decode(z[, layer])
    Decode an encoded dataset by computing the output layer activation.
encode(x[, layer, sample])
    Encode a dataset using the hidden layer activations of our network.
feed_forward(x, **kwargs)
    Compute a forward pass of all layers from the given input.
find(which, param)
    Get a parameter from a layer in the network.
itertrain(train[, valid, algo, subalgo, ...])
    Train our network, one batch at a time.
load(filename)
    Load a saved network from disk.
loss(**kwargs)
    Return a variable representing the regularized loss for this network.
monitors(**kwargs)
    Return expressions that should be computed to monitor training.
predict(x, **kwargs)
    Compute a forward pass of the inputs, returning the network output.
save(filename)
    Save the state of this network to a pickle file on disk.
score(x[, w])
    Compute R^2 coefficient of determination for a given input.
set_loss(*args, **kwargs)
    Clear the current loss functions from the network and add a new one.
train(*args, **kwargs)
    Train the network until the trainer converges.
updates(**kwargs)
    Return expressions to run as updates during network training.

Attributes
DEFAULT_OUTPUT_ACTIVATION
INPUT_NDIM
OUTPUT_NDIM
inputs
    A list of Theano variables for feedforward computations.
num_params
    Number of parameters in the entire network model.
params
    A list of the learnable Theano parameters for this network.
variables
    A list of Theano variables for loss computations.
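As a quick sketch of the persistence methods in the table above (the filename is hypothetical, and invoking load() on the class rather than an instance is an assumption about the API):

>>> model.save('model.pkl')  # pickle the network state to disk
>>> model = theanets.Autoencoder.load('model.pkl')  # restore it (assumed class-level call)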
decode(z, layer=None)¶

Decode an encoded dataset by computing the output layer activation.

Parameters:
    z : ndarray
        A matrix containing encoded data from this autoencoder.
    layer : int or str or Layer, optional
        The index or name of the hidden layer that was used to encode z.

Returns:
    decoded : ndarray
        The decoded dataset.
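A hypothetical usage sketch, where 'hid1' stands in for whatever name or index identifies the encoding layer in your model:

>>> decoded = model.decode(enc, layer='hid1')  # 'hid1' is a hypothetical layer name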

encode(x, layer=None, sample=False)¶

Encode a dataset using the hidden layer activations of our network.

Parameters:
    x : ndarray
        A dataset to encode. Rows of this dataset capture individual data points, while columns represent the variables in each data point.
    layer : str, optional
        The name of the hidden layer output to use. By default, we use the “middle” hidden layer; for example, for a 4,2,4 or 4,3,2,3,4 autoencoder, we use the layer with size 2.
    sample : bool, optional
        If True, then draw a sample using the hidden activations as independent Bernoulli probabilities for the encoded data. This assumes the hidden layer has a logistic sigmoid activation function.

Returns:
    ndarray :
        The given dataset, encoded by the appropriate hidden layer activation.
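As a usage sketch, stochastic binary codes can be drawn via the documented sample flag; this assumes the middle hidden layer of your model uses a logistic sigmoid activation:

>>> bits = model.encode(test, sample=True)  # Bernoulli samples from hidden activations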

score(x, w=None)¶

Compute R^2 coefficient of determination for a given input.

Parameters:
    x : ndarray (num_examples, num_inputs)
        An array containing data to be fed into the network. Multiple examples are arranged as rows in this array, with columns containing the variables for each example.

Returns:
    r2 : float
        The R^2 correlation between the prediction of this network and its input. This can serve as one measure of the information loss of the autoencoder.
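For instance, scoring a trained model on its inputs gives a quick summary of reconstruction quality; values near 1.0 indicate little information loss:

>>> r2 = model.score(inputs)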