# Loss Functions

A loss function is used to optimize the parameter values in a neural network model. It maps a set of parameter values for the network onto a scalar value that indicates how well those parameters accomplish the task the network is intended to perform.

There are several common loss functions provided by `theanets`. These losses
often measure the `squared` or `absolute` error between a network’s
output and some target or desired output. Other loss functions are designed
specifically for classification models; the `cross-entropy` is a common loss
designed to minimize the distance between the network’s distribution over class
labels and the distribution that the dataset defines.
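As a rough illustration of what these three losses measure, the sketch below computes each one in plain Python for a single small example. The function names are hypothetical, and this is not how `theanets` implements its losses (those are built as Theano expressions); it only shows the arithmetic involved.

```python
import math

def mean_squared_error(outputs, targets):
    # Average of the squared differences between outputs and targets.
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(outputs)

def mean_absolute_error(outputs, targets):
    # Average of the absolute differences between outputs and targets.
    return sum(abs(o - t) for o, t in zip(outputs, targets)) / len(outputs)

def cross_entropy(probs, label):
    # Negative log-probability that the model assigns to the correct class.
    return -math.log(probs[label])

outputs = [0.5, 2.0, -1.0]
targets = [1.0, 2.0, 1.0]
mse = mean_squared_error(outputs, targets)    # (0.25 + 0 + 4) / 3
mae = mean_absolute_error(outputs, targets)   # (0.5 + 0 + 2) / 3
xe = cross_entropy([0.7, 0.2, 0.1], label=0)  # -log(0.7)
```

Note how the squared error penalizes the single large miss (the third component) much more heavily than the absolute error does.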

Models in `theanets` have at least one loss to optimize during
training. There are default losses for each of the built-in model types, but you
can often override these defaults just by providing a non-default value for the
`loss` keyword argument when creating your model. For example, to create a
regression model with a mean absolute error loss:

```
net = theanets.Regressor([10, 20, 3], loss='mae')
```

This will create the regression model with the specified loss.

## Predefined Losses

These loss functions are available for neural network models.

| Loss | Description |
| --- | --- |
| `Loss` (target[, weight, weighted, output_name]) | A loss function base class. |
| `CrossEntropy` (target[, weight, weighted, ...]) | Cross-entropy (XE) loss function for classifiers. |
| `GaussianLogLikelihood` ([mean_name, ...]) | Gaussian Log Likelihood (GLL) loss function. |
| `Hinge` (target[, weight, weighted, output_name]) | Hinge loss function for classifiers. |
| `KullbackLeiblerDivergence` (target[, weight, ...]) | The KL divergence loss is computed over probability distributions. |
| `MaximumMeanDiscrepancy` ([kernel]) | Maximum Mean Discrepancy (MMD) loss function. |
| `MeanAbsoluteError` (target[, weight, ...]) | Mean-absolute-error (MAE) loss function. |
| `MeanSquaredError` (target[, weight, weighted, ...]) | Mean-squared-error (MSE) loss function. |

## Multiple Losses

A `theanets` model can actually have more than one loss that it attempts to
optimize simultaneously, and these losses can change between successive calls to
`train()`. In fact, a model has a `losses` attribute that’s just a list of
`theanets.Loss` instances; these losses are weighted by a `weight` attribute,
then summed and combined with any applicable regularizers during each call to
`train()`.

Let’s say that you want to optimize a model using both the mean absolute and the mean squared error. You could first create a regular regression model:

```
net = theanets.Regressor([10, 20, 3])
```

and then add a new loss to the model:

```
net.add_loss('mae')
```

Then, when you call:

```
net.train(...)
```

the model will attempt to minimize the sum of the two losses.

You can specify the relative weight of the two losses by manipulating the
`weight` attribute of each loss instance. For instance, if you want the MAE
loss to be twice as strong as the MSE loss:

```
net.losses[1].weight = 2
net.train(...)
```

Finally, if you want to reset the loss to the standard MSE:

```
net.set_loss('mse', weight=1)
```

(Here we’ve also shown how to specify the weight of a loss when setting it on the model; the same can be done when adding one.)
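To make the weighting concrete, here is a minimal sketch of the weighted sum described in this section, written in plain Python. The helper `combined_loss` and the numeric loss values are hypothetical and chosen only for illustration; regularizers, which the model also adds during training, are left out.

```python
def combined_loss(loss_values, weights):
    """Weighted sum of individual loss values (regularizers omitted)."""
    return sum(w * v for w, v in zip(weights, loss_values))

# Hypothetical values of the two losses on one minibatch.
mse_value, mae_value = 0.30, 0.45

# MSE keeps weight 1 while MAE gets weight 2, as in the example above.
total = combined_loss([mse_value, mae_value], weights=[1, 2])
```

With these values the MAE term contributes 0.90 to the total and the MSE term 0.30, so the optimizer is pushed harder to reduce the absolute error.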

## Using Weighted Targets

By default, the network models available in `theanets` treat all inputs as
equal when computing the loss for the model. For example, a regression model
treats an error of 0.1 in component 2 of the output just the same as an error of
0.1 in component 3, and each example of a minibatch is treated with equal
importance when training a classifier.

However, there are times when all inputs to a neural network model are not to be treated equally. This is especially evident in recurrent models: sometimes, the inputs to a recurrent network might not contain the same number of time steps, but because the inputs are presented to the model using a rectangular minibatch array, all inputs must somehow be made to have the same size. One way to address this would be to cut off all inputs at the length of the shortest input, but then the network is not exposed to all input/output pairs during training.
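An alternative to truncation is to pad every sequence up to the length of the longest one and to keep a parallel array of weights that marks real time steps with 1 and padding with 0. The sketch below shows the idea in plain Python; the helper `pad_with_weights` is hypothetical and not part of `theanets`, and real data would normally be stored in arrays laid out as the model expects.

```python
def pad_with_weights(sequences, pad_value=0.0):
    """Pad sequences to a common length and build matching 0/1 weights."""
    longest = max(len(s) for s in sequences)
    padded, weights = [], []
    for s in sequences:
        n_pad = longest - len(s)
        # Real values first, then padding up to the longest sequence.
        padded.append(list(s) + [pad_value] * n_pad)
        # Weight 1 for real time steps, weight 0 for padded ones.
        weights.append([1.0] * len(s) + [0.0] * n_pad)
    return padded, weights

padded, weights = pad_with_weights([[0.1, 0.2, 0.3], [0.4]])
```

Both returned lists are rectangular, so they can be stacked into a minibatch, and the zero weights keep the padded steps from contributing to the loss.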

Weighted targets can be used for any model in `theanets`. For example, an
`autoencoder` could use an array of
weights containing zeros and ones to solve a matrix completion task, where the
input array contains some “unknown” values. In such a case, the network is
required to reproduce the known values exactly (so these could be presented to
the model with weight 1), while filling in the unknowns with statistically
reasonable values (which could be presented to the model during training with
weight 0).

As another example, suppose a `classifier` model is being trained in a binary
classification task where one of the classes (say, class A) is only present
0.1% of the time. In such a case, the network can achieve 99.9% accuracy by
always predicting class B, so during training it might be important to ensure
that errors in predicting A are “amplified” when computing the loss. You could
provide a large weight for training examples in class A to encourage the model
not to miss these examples.
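One way to choose such weights, sketched below, is to make each example's weight inversely proportional to how often its class appears. This heuristic is an assumption made for illustration and is not something `theanets` prescribes; the helper name `inverse_frequency_weights` is hypothetical.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Give each example a weight inversely proportional to its class frequency."""
    counts = Counter(labels)
    n_classes = len(counts)
    total = len(labels)
    # Rare classes get large weights; the weights sum to len(labels).
    return [total / (n_classes * counts[y]) for y in labels]

# A small imbalanced dataset: nine examples of class B, one of class A.
labels = ['B'] * 9 + ['A']
weights = inverse_frequency_weights(labels)
```

Here the single class A example receives weight 5.0 while each class B example receives about 0.56, so both classes carry equal total weight in the loss.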

All of these cases are possible to model in `theanets`; just include
`weighted=True` when you create your model:

```
net = theanets.recurrent.Autoencoder([3, (10, 'rnn'), 3], weighted=True)
```

When training a weighted model, the training and validation datasets require an
additional component: an array of floating-point values with the same shape as
the expected output of the model. For example, a non-recurrent Classifier model
would require a weight vector with each minibatch, of the same shape as the
labels array, so that the training and validation datasets would each have three
pieces: `sample`, `label`, and `weight`. Each value in the weight array is
used as the weight for the corresponding error when computing the loss.
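The effect of the weight array on the loss can be sketched in plain Python as a weighted average of per-element errors. Normalizing by the sum of the weights mirrors the weighted custom loss shown later in this document; whether every built-in loss normalizes in exactly this way is an assumption, and the function name is hypothetical.

```python
def weighted_mean_squared_error(outputs, targets, weights):
    """Average squared error in which each element counts according to its weight."""
    weighted = sum(w * (o - t) ** 2
                   for o, t, w in zip(outputs, targets, weights))
    return weighted / sum(weights)

outputs = [1.0, 2.0, 9.0]
targets = [1.5, 2.0, 0.0]
weights = [1.0, 1.0, 0.0]  # the third element is ignored

loss = weighted_mean_squared_error(outputs, targets, weights)
```

Even though the third output is far from its target, its zero weight removes it from the loss entirely, which is the behavior wanted for padded time steps or unknown matrix entries.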

## Custom Losses

It’s pretty straightforward to create models in `theanets` that use different
losses from the defaults of the predefined `theanets.Classifier`,
`theanets.Autoencoder`, and `theanets.Regressor` models. (The classifier uses
categorical cross-entropy (XE) as its default loss, and the other two both use
mean squared error, MSE.)

To define a model with a new loss, just create a new `theanets.Loss`
subclass and specify its name when you create your
model. For example, to create a regression model that uses a step function
averaged over all of the model outputs:

```
import theanets

class Step(theanets.Loss):
    def __call__(self, outputs):
        # Fraction of output values greater than zero.
        return (outputs[self.output_name] > 0).mean()

net = theanets.Regressor([5, 6, 7], loss='step')
```

Your loss function implementation must return a Theano expression that reflects the loss for your model. If you wish to make your loss work with weighted outputs, you will also need to include a case for having weights:

```
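class Step(theanets.Loss):
    def __call__(self, outputs):
        step = outputs[self.output_name] > 0
        # Compare against None rather than truth-testing a symbolic variable.
        if self._weights is not None:
            # Weighted average: normalize by the total weight.
            return (self._weights * step).sum() / self._weights.sum()
        else:
            return step.mean()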
```