The most common method for training a neural network model is to use a
stochastic gradient-based optimizer. In theanets many of these algorithms
are available by interfacing with the downhill package:
sgd: Stochastic gradient descent
nag: Nesterov’s accelerated gradient
rprop: Resilient backpropagation
esgd: Equilibrated SGD
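Selecting one of these optimizers typically amounts to passing its name when training a model. The snippet below is a minimal sketch, assuming a small Classifier and in-memory numpy arrays; the layer sizes, data shapes, and hyperparameter values are purely illustrative.

```python
import numpy as np
import theanets

# A small classifier: 10 input features, one hidden layer of 20 units, 3 classes.
net = theanets.Classifier([10, 20, 3])

# Toy training and validation data (shapes chosen for illustration only).
train_x = np.random.randn(100, 10).astype('f')
train_y = np.random.randint(0, 3, size=100).astype('i')
valid_x = np.random.randn(20, 10).astype('f')
valid_y = np.random.randint(0, 3, size=20).astype('i')

# Train with Nesterov's accelerated gradient; any of the optimizer names
# listed above ('sgd', 'nag', 'rprop', 'esgd') can be passed via algo=.
net.train([train_x, train_y], [valid_x, valid_y],
          algo='nag', learning_rate=1e-3, momentum=0.9)
```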
In addition to the optimization algorithms provided by downhill,
theanets defines a few algorithms that are more specific to neural networks.
These trainers tend to take advantage of the layered structure of the loss
function for a network:
sample: Sample trainer. This trainer sets model parameters directly to samples drawn from the training data. This is a very fast “training” algorithm, since all updates take place at once; however, features derived directly from the training data often require further tuning to perform well.
layerwise: Layerwise (supervised) pretrainer. Greedy supervised layerwise pre-training: this trainer applies RMSProp to each layer sequentially.
pretrain: Unsupervised pretrainer. Greedy unsupervised layerwise pre-training: this trainer applies RMSProp to a tied-weights “shadow” autoencoder using an unlabeled dataset, and then transfers the learned autoencoder weights to the model being trained.
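As a sketch of how such a pretrainer is commonly combined with gradient-based fine-tuning (reusing the toy classifier and arrays from the sketch above, and assuming the lowercase names shown in this list are the keys accepted by the algo argument):

```python
# Greedy supervised layerwise pre-training, then fine-tune the whole
# network with one of the gradient-based optimizers listed earlier.
net.train([train_x, train_y], [valid_x, valid_y], algo='layerwise')
net.train([train_x, train_y], [valid_x, valid_y],
          algo='nag', learning_rate=1e-3, momentum=0.9)
```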