theanets.feedforward.Classifier

class theanets.feedforward.Classifier(layers, weighted=False)

A classifier attempts to match a 1-hot target output.
Classification models in theanets are trained by optimizing a (possibly regularized) loss that centers around the categorical cross-entropy. This error computes the difference between the distribution generated by the classification model and the empirical distribution of the labeled data.

If we have a labeled dataset containing \(m\) \(d\)-dimensional input samples \(X \in \mathbb{R}^{m \times d}\) and \(m\) paired target outputs \(Y \in \{0,1,\dots,K-1\}^m\), then the loss that the Classifier model optimizes with respect to the model parameters \(\theta\) is:

\[\mathcal{L}(X, Y, \theta) = R(X, \theta) - \frac{1}{m} \sum_{i=1}^m \sum_{k=0}^{K-1} p(k | y_i) \log q_\theta(k | x_i)\]

Here, \(p(k|y_i)\) is the probability that example \(i\) is labeled with class \(k\); in theanets classification models, this is 1 if \(k = y_i\) and 0 otherwise, so in practice the sum over classes reduces to a single term. Next, \(q_\theta(k|x_i)\) is the probability that the model assigns to class \(k\) given input \(x_i\); this corresponds to the relevant softmax output from the model. Finally, \(R\) is a regularization function.

A classifier model requires the following inputs at training time:
x : A two-dimensional array of input data. Each row of x is expected to be one data item. Each column of x holds the measurements of a particular input variable across all data items.

labels : A one-dimensional array of target labels. Each element of labels is expected to be the class index for a single data item.

The number of rows in x must match the number of elements in the labels vector. Additionally, the values in labels are expected to range from 0 to one less than the number of classes in the data being modeled. For example, for the MNIST digits dataset, which represents digits 0 through 9, the labels array contains integer class labels 0 through 9.
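Because \(p(k|y_i)\) is a one-hot indicator, the double sum in the loss above collapses to the log-probability that the model assigns to each example's true class. The following is a minimal numpy sketch of that reduction (regularization omitted; the function name is illustrative, not part of the theanets API):

```python
import numpy as np

def categorical_cross_entropy(q, labels):
    """Mean cross-entropy between one-hot targets and predictions q.

    q      : (m, K) array of predicted class probabilities (rows sum to 1)
    labels : (m,) array of integer class indices in [0, K)
    """
    m, K = q.shape
    # Full double sum, with p(k | y_i) written as an explicit one-hot matrix ...
    p = np.eye(K)[labels]                              # (m, K) one-hot targets
    full = -np.mean(np.sum(p * np.log(q), axis=1))
    # ... reduces to indexing the true-class probability for each example.
    reduced = -np.mean(np.log(q[np.arange(m), labels]))
    assert np.allclose(full, reduced)
    return reduced

q = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1]])
labels = np.array([0, 1])
loss = categorical_cross_entropy(q, labels)   # -(log 0.7 + log 0.8) / 2
```

The equality of `full` and `reduced` is exactly the "sum over classes reduces to a single term" observation from the loss definition above.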
__init__(layers, weighted=False)
Methods

accuracy(outputs)
    Build a theano expression for computing the network accuracy.
classify(x)
error(outputs)
    Build a theano expression for computing the network error.
monitors(**kwargs)
    Return expressions that should be computed to monitor training.
predict(x)
    Compute a greedy classification for the given set of data.
predict_logit(x)
    Compute the logit values that underlie the softmax output.
predict_proba(x)
    Compute class posterior probabilities for the given set of data.
score(x, y[, w])
    Compute the mean accuracy on a set of labeled data.

Attributes

DEFAULT_OUTPUT_ACTIVATION
num_params
    Number of parameters in the entire network model.
params
    A list of the learnable theano parameters for this network.
accuracy(outputs)

Build a theano expression for computing the network accuracy.
Parameters: outputs : dict mapping str to theano expression
A dictionary of all outputs generated by the layers in this network.
Returns: acc : theano expression
A theano expression representing the network accuracy.
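Conceptually, the accuracy expression compares each row's highest-scoring output against the true label. A numpy analogue of that computation (a sketch of the idea, not the theano expression this method actually builds):

```python
import numpy as np

def mean_accuracy(outputs, labels):
    """Fraction of rows whose highest-scoring class matches the true label.

    outputs : (m, K) array of per-class scores or probabilities
    labels  : (m,) array of integer class indices
    """
    return float(np.mean(np.argmax(outputs, axis=1) == labels))

outputs = np.array([[0.9, 0.1],
                    [0.4, 0.6],
                    [0.8, 0.2]])
labels = np.array([0, 1, 1])
acc = mean_accuracy(outputs, labels)   # 2 of 3 rows are classified correctly
```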
error(outputs)

Build a theano expression for computing the network error.
Parameters: outputs : dict mapping str to theano expression
A dictionary of all outputs generated by the layers in this network.
Returns: error : theano expression
A theano expression representing the network error.
monitors(**kwargs)

Return expressions that should be computed to monitor training.
Returns: monitors : list of (name, expression) pairs
A list of named monitor expressions to compute for this network.
predict(x)

Compute a greedy classification for the given set of data.
Parameters: x : ndarray (num-examples, num-variables)
An array containing examples to classify. Examples are given as the rows in this array.
Returns: k : ndarray (num-examples, )
A vector of class index values, one per row of input data.
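"Greedy" here means taking the single most probable class per example, i.e. the arg-max over the model's class posteriors. A minimal numpy sketch of that selection step (the helper name is illustrative, not a theanets function):

```python
import numpy as np

def greedy_predict(proba):
    """Pick the class with the highest posterior probability in each row.

    proba : (m, K) array of class posterior probabilities
    Returns an (m,) vector of class indices.
    """
    return np.argmax(proba, axis=1)

proba = np.array([[0.2, 0.5, 0.3],
                  [0.6, 0.3, 0.1]])
k = greedy_predict(proba)   # class 1 for the first row, class 0 for the second
```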
predict_logit(x)

Compute the logit values that underlie the softmax output.
Parameters: x : ndarray (num-examples, num-variables)
An array containing examples to classify. Examples are given as the rows in this array.
Returns: l : ndarray (num-examples, num-classes)
An array of posterior class logit values, one row of logit values per row of input data.
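The class posteriors can be recovered from these logits by applying a softmax. A numpy sketch of that relationship (the max-subtraction is a standard numerical-stability trick for this computation, assumed here rather than taken from the theanets source):

```python
import numpy as np

def softmax(logits):
    """Map per-row logit values to class probabilities."""
    # Subtracting the row max leaves the result unchanged but avoids overflow.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

logits = np.array([[2.0, 1.0, 0.1]])
proba = softmax(logits)   # each row sums to 1; ordering of classes is preserved
```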
predict_proba(x)

Compute class posterior probabilities for the given set of data.
Parameters: x : ndarray (num-examples, num-variables)
An array containing examples to predict. Examples are given as the rows in this array.
Returns: p : ndarray (num-examples, num-classes)
An array of class posterior probability values, one per row of input data.
score(x, y, w=None)

Compute the mean accuracy on a set of labeled data.
Parameters: x : ndarray (num-examples, num-variables)
An array containing examples to classify. Examples are given as the rows in this array.
y : ndarray (num-examples, )
A vector of integer class labels, one for each row of input data.
w : ndarray (num-examples, ), optional
A vector of weights, one for each row of input data.
Returns: score : float
The (possibly weighted) mean accuracy of the model on the data.
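A weighted mean accuracy averages each example's 0/1 correctness using the supplied per-example weights. A numpy sketch of the computation (function and variable names are illustrative, not the theanets implementation):

```python
import numpy as np

def weighted_score(predicted, y, w=None):
    """(Possibly weighted) mean accuracy of predicted labels against y.

    predicted, y : (m,) integer class-index vectors
    w            : optional (m,) vector of per-example weights
    """
    correct = (predicted == y).astype(float)   # 1.0 where the label matches
    if w is None:
        return float(correct.mean())           # unweighted mean accuracy
    return float(np.average(correct, weights=w))

predicted = np.array([0, 1, 1, 0])
y         = np.array([0, 1, 0, 0])
plain    = weighted_score(predicted, y)                              # 3/4 correct
weighted = weighted_score(predicted, y, w=np.array([1., 1., 2., 0.]))
```

Up-weighting the one misclassified example (and zeroing another) pulls the weighted score below the plain accuracy, which is the point of passing w.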