Trainers
The most common method for training a neural network model is to use a
stochastic gradient-based optimizer. In theanets, many of these algorithms
are available through the downhill package:
sgd: Stochastic gradient descent
nag: Nesterov’s accelerated gradient
rprop: Resilient backpropagation
rmsprop: RMSProp
adadelta: ADADELTA
esgd: Equilibrated SGD
adam: Adam
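As a rough illustration of what the simplest of these optimizers does, the sketch below applies plain stochastic gradient descent to a one-parameter least-squares problem in pure Python. It is not how downhill implements sgd; the function name `sgd_fit_mean`, the toy data, and the hyperparameter values are all assumptions made for this example.

```python
import random


def sgd_fit_mean(data, learning_rate=0.01, n_steps=2000, seed=0):
    """Estimate the mean of `data` by SGD on the loss 0.5 * (w - x) ** 2.

    Illustrative only: a hypothetical helper, not part of theanets or downhill.
    """
    rng = random.Random(seed)
    w = 0.0
    for _ in range(n_steps):
        x = rng.choice(data)        # draw one training example at random
        grad = w - x                # gradient of 0.5 * (w - x) ** 2 w.r.t. w
        w -= learning_rate * grad   # step against the gradient
    return w


estimate = sgd_fit_mean([1.0, 2.0, 3.0, 4.0, 5.0])
```

The estimate fluctuates around the sample mean because each step uses a single noisy example. The other algorithms in the list differ mainly in how they scale or accumulate these gradient steps.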
In addition to the optimization algorithms provided by downhill,
theanets defines a few algorithms that are more specific to neural networks.
These trainers tend to take advantage of the layered structure of the loss
function for a network.
sample: Sample trainer
This trainer sets model parameters directly to samples drawn from the training data. This is a very fast “training” algorithm since all updates take place at once; however, features derived directly from the training data often require further tuning to perform well.
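The idea of setting parameters directly from data can be sketched in a few lines of pure Python. This is only an illustration of the concept for a single weight matrix, not the theanets implementation; the helper `sample_weights` and the toy dataset are hypothetical, and details such as normalization or how deeper layers are handled are not covered here.

```python
import random


def sample_weights(data, n_units, seed=0):
    """Return an (n_inputs x n_units) weight matrix whose columns are
    training examples chosen at random.

    Illustrative only: a hypothetical helper, not the theanets implementation.
    """
    rng = random.Random(seed)
    chosen = [rng.choice(data) for _ in range(n_units)]
    n_inputs = len(data[0])
    # weights[i][j] is the connection from input i to hidden unit j.
    return [[chosen[j][i] for j in range(n_units)] for i in range(n_inputs)]


train = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.5, 0.5, 1.0]]
weights = sample_weights(train, n_units=2)
```

No gradient is computed at any point, which is why this kind of initialization is fast and also why a gradient-based trainer is typically run afterwards.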
layerwise: Layerwise (supervised) pretrainer
Greedy supervised layerwise pre-training: This trainer applies RMSProp to each layer sequentially.
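One common way to organize greedy layerwise training is to grow the network one hidden layer at a time, optimizing at each stage. The sketch below only enumerates such stages for a given list of layer sizes. It assumes that each stage keeps the first few hidden layers and attaches the output layer on top; the text above does not specify this schedule or which parameters are held fixed, and `layerwise_stages` is a hypothetical helper rather than a theanets function.

```python
def layerwise_stages(layer_sizes):
    """Yield the layer sizes of the sub-network trained at each greedy stage.

    Assumption for illustration: stage k keeps the first k hidden layers and
    attaches the output layer directly on top of them.
    """
    *lower, output = layer_sizes
    for k in range(2, len(lower) + 1):
        yield tuple(lower[:k]) + (output,)


stages = list(layerwise_stages((784, 256, 64, 10)))
```

Under these assumptions, a 784-256-64-10 network is trained first as 784-256-10 and then as the full 784-256-64-10 network.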
pretrain: Unsupervised pretrainer
Greedy unsupervised layerwise pre-training: This trainer applies RMSProp to a tied-weights “shadow” autoencoder using an unlabeled dataset, and then transfers the learned autoencoder weights to the model being trained.
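In a tied-weights autoencoder, the decoder reuses the transpose of the encoder's weight matrix, so only one matrix per layer is learned. The following pure-Python sketch shows that relationship for a single layer. It assumes linear units and no biases to keep the arithmetic visible; the functions `encode` and `decode` are hypothetical, and the shadow autoencoder that theanets builds may differ in these details.

```python
def encode(x, weights):
    """Hidden activations h[j] = sum_i x[i] * weights[i][j] (linear units)."""
    n_hidden = len(weights[0])
    return [sum(x[i] * weights[i][j] for i in range(len(x)))
            for j in range(n_hidden)]


def decode(h, weights):
    """Reconstruction using the transpose of the same matrix (tied weights)."""
    return [sum(h[j] * row[j] for j in range(len(h))) for row in weights]


# Toy 3-input, 2-hidden-unit weight matrix with orthonormal columns.
weights = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
x = [0.3, -0.7, 0.0]
reconstruction = decode(encode(x, weights), weights)
```

Because a single matrix serves both directions, the weights learned by the autoencoder can be copied into the corresponding layer of the model being trained.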