theanets.feedforward.Autoencoder

class theanets.feedforward.Autoencoder(layers, loss='mse', weighted=False, rng=13)

An autoencoder network attempts to reproduce its input.
Notes
Autoencoder models default to an MSE loss. To use a different loss, provide a non-default value for the loss keyword argument when constructing your model.
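For example, assuming the mean-absolute-error loss is registered under the name 'mae' in your version of theanets, a rough sketch looks like this:

>>> model = theanets.Autoencoder([10, 20, 10], loss='mae')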
Formally, an autoencoder defines a parametric mapping from a data space to the same space:

\[F_\theta: \mathcal{S} \to \mathcal{S}\]

Often, this mapping can be decomposed into an “encoding” stage \(f_\alpha(\cdot)\) and a corresponding “decoding” stage \(g_\beta(\cdot)\) to and from some latent space \(\mathcal{Z} = \mathbb{R}^{n_z}\):
\[\begin{split}\begin{eqnarray*} f_\alpha &:& \mathcal{S} \to \mathcal{Z} \\ g_\beta &:& \mathcal{Z} \to \mathcal{S} \end{eqnarray*}\end{split}\]

Autoencoders form an interesting class of models for several reasons. They:
- require only “unlabeled” data (which is typically easy to obtain),
- are generalizations of many popular density estimation techniques, and
- can be used to model the “manifold” or density of a dataset.
Many extremely common dimensionality reduction techniques can be expressed as autoencoders. For instance, Principal Component Analysis (PCA) can be expressed as a model with two tied, linear layers:
>>> pca = theanets.Autoencoder([10, (5, 'linear'), (10, 'tied')])
Similarly, Independent Component Analysis (ICA) can be expressed as the same model, but trained with a sparsity penalty on the hidden-layer activations:
>>> ica = pca
>>> ica.train([inputs], hidden_l1=0.1)
In this light, “nonlinear PCA” is quite easy to formulate as well!
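A minimal sketch of such a "nonlinear PCA" model simply swaps the linear hidden layer for a nonlinear one (the 'sigmoid' activation here is just one plausible choice):

>>> npca = theanets.Autoencoder([10, (5, 'sigmoid'), (10, 'tied')])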
Examples
To create an autoencoder, just create a new model instance. Often you’ll provide the layer configuration at this time:
>>> model = theanets.Autoencoder([10, 20, 10])
If you want to create an autoencoder with tied weights, specify that layer type when creating the model:
>>> model = theanets.Autoencoder([10, 20, (10, 'tied')])
See Creating a Model for more information.
Data
Training data for an autoencoder takes the form of a two-dimensional array. The shape of this array is (num-examples, num-variables): the first axis enumerates data points in a batch, and the second enumerates the variables in the model.
For instance, to create a training dataset containing 1000 examples:
>>> import numpy as np
>>> inputs = np.random.randn(1000, 10).astype('f')
Training
Training the model can be as simple as calling the train() method:

>>> model.train([inputs])
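Keyword arguments are forwarded to the optimizer; as a rough sketch (the 'rmsprop' algorithm name and these hyperparameter names are assumptions about the optimizer backing your theanets version):

>>> model.train([inputs], algo='rmsprop', learning_rate=1e-3, momentum=0.9)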
See Training a Model for more information about training.
Use
A model can be used to predict() the output of some input data points:

>>> test = np.random.randn(3, 10).astype('f')
>>> print(model.predict(test))
Additionally, autoencoders can encode() a set of input data points:

>>> enc = model.encode(test)
>>> enc.shape
(3, 20)
The model can also decode() a set of encoded data:

>>> model.decode(enc)
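Encoding and decoding can be chained to check how faithfully the model reproduces its input; a minimal sketch using plain numpy (the variable names are illustrative):

>>> recon = model.decode(model.encode(test))
>>> error = float(((recon - test) ** 2).mean())  # mean squared reconstruction error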
See Using a Model for more information about using models.
__init__(layers, loss='mse', weighted=False, rng=13)
Methods
- __init__(layers[, loss, weighted, rng])
- add_layer([layer, is_output]): Add a layer to our network graph.
- add_loss([loss]): Add a loss function to the model.
- build_graph([regularizers]): Connect the layers in this network to form a computation graph.
- decode(z[, layer]): Decode an encoded dataset by computing the output layer activation.
- encode(x[, layer, sample]): Encode a dataset using the hidden layer activations of our network.
- feed_forward(x, **kwargs): Compute a forward pass of all layers from the given input.
- find(which, param): Get a parameter from a layer in the network.
- itertrain(train[, valid, algo, subalgo, ...]): Train our network, one batch at a time.
- load(filename): Load a saved network from disk.
- loss(**kwargs): Return a variable representing the regularized loss for this network.
- monitors(**kwargs): Return expressions that should be computed to monitor training.
- predict(x, **kwargs): Compute a forward pass of the inputs, returning the network output.
- save(filename): Save the state of this network to a pickle file on disk.
- score(x[, w]): Compute R^2 coefficient of determination for a given input.
- set_loss(*args, **kwargs): Clear the current loss functions from the network and add a new one.
- train(*args, **kwargs): Train the network until the trainer converges.
- updates(**kwargs): Return expressions to run as updates during network training.

Attributes
- DEFAULT_OUTPUT_ACTIVATION
- INPUT_NDIM
- OUTPUT_NDIM
- inputs: A list of Theano variables for feedforward computations.
- num_params: Number of parameters in the entire network model.
- params: A list of the learnable Theano parameters for this network.
- variables: A list of Theano variables for loss computations.
decode(z, layer=None)

Decode an encoded dataset by computing the output layer activation.

Parameters:

- z : ndarray
  A matrix containing encoded data from this autoencoder.
- layer : int or str or Layer, optional
  The index or name of the hidden layer that was used to encode z.

Returns:

- decoded : ndarray
  The decoded dataset.
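A brief usage sketch, reusing the enc array produced by encode() in the example above:

>>> decoded = model.decode(enc)
>>> decoded.shape  # matches the input dimensionality of the example model
(3, 10)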
encode(x, layer=None, sample=False)

Encode a dataset using the hidden layer activations of our network.

Parameters:

- x : ndarray
  A dataset to encode. Rows of this dataset capture individual data points, while columns represent the variables in each data point.
- layer : str, optional
  The name of the hidden layer output to use. By default, we use the “middle” hidden layer; for example, for a 4,2,4 or 4,3,2,3,4 autoencoder, we use the layer with size 2.
- sample : bool, optional
  If True, then draw a sample using the hidden activations as independent Bernoulli probabilities for the encoded data. This assumes the hidden layer has a logistic sigmoid activation function.

Returns:

- ndarray
  The given dataset, encoded by the appropriate hidden layer activation.
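For instance, a hidden layer can be selected by name rather than relying on the default; the name 'hid1' is only an assumption about how the layers in your model are named:

>>> enc = model.encode(test, layer='hid1')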
score(x, w=None)

Compute R^2 coefficient of determination for a given input.

Parameters:

- x : ndarray (num-examples, num-inputs)
  An array containing data to be fed into the network. Multiple examples are arranged as rows in this array, with columns containing the variables for each example.

Returns:

- r2 : float
  The R^2 correlation between the prediction of this network and its input. This can serve as one measure of the information loss of the autoencoder.
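A minimal usage sketch, reusing the inputs array from the Data section above:

>>> r2 = model.score(inputs)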