theanets.recurrent.Text
class theanets.recurrent.Text(text, alpha=None, min_count=2, unknown='\x00')
A class for handling sequential text data.
Parameters: text : str
A blob of text.
alpha : str, optional
An alphabet to use for representing characters in the text. If not provided, all characters from the text occurring at least min_count times will be used.
min_count : int, optional
If the alphabet is to be computed from the text, discard characters that occur fewer than this number of times. Defaults to 2.
unknown : str, optional
A character to use to represent “out-of-alphabet” characters in the text. This must not be in the alphabet. Defaults to '\x00' (the null character).
Attributes
text (str) A blob of text, with all non-alphabet characters replaced by the “unknown” character.
alpha (str) A string containing each character in the alphabet.
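The alphabet rule and the "unknown" replacement described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: `build_alphabet` and `replace_unknown` are hypothetical helper names, and sorting the alphabet is an assumption made only to keep the result deterministic.

```python
from collections import Counter


def build_alphabet(text, min_count=2, unknown='\x00'):
    """Collect the characters that occur at least min_count times.

    Hypothetical helper; the sorted order is an assumption.
    """
    counts = Counter(text)
    return ''.join(sorted(c for c, n in counts.items()
                          if n >= min_count and c != unknown))


def replace_unknown(text, alpha, unknown='\x00'):
    """Replace every character outside alpha with the unknown marker."""
    if unknown in alpha:
        raise ValueError('the unknown character must not be in the alphabet')
    keep = set(alpha)
    return ''.join(c if c in keep else unknown for c in text)


# Only 'l' (3 times) and 'o' (2 times) reach min_count=2 in this sample.
alpha = build_alphabet('hello world', min_count=2)
# '?' is used instead of '\x00' so the result stays printable.
cleaned = replace_unknown('hello world', alpha, unknown='?')
```

With the sample string, `alpha` contains only the two repeated characters and every other position in `cleaned` holds the unknown marker.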
__init__(text, alpha=None, min_count=2, unknown='\x00')
Methods
__init__(text[, alpha, min_count, unknown])
classifier_batches(steps, batch_size[, rng]) Create a callable that returns a batch of training data.
decode(enc) Decode a sequence of alphabet index values back into a text string.
encode(txt) Encode a text string by replacing characters with alphabet index.
classifier_batches(steps, batch_size, rng=None)
Create a callable that returns a batch of training data.
Parameters: steps : int
Number of time steps in each batch.
batch_size : int
Number of training examples per batch.
rng : numpy.random.RandomState or int, optional
A random number generator, or an integer seed for a random number generator. If not provided, the random number generator will be created with an automatically chosen seed.
Returns: batch : callable
A callable that, when called, returns a batch of data that can be used to train a classifier model.
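The "callable that returns a batch" pattern can be sketched as a closure over a random number generator. The sketch below is an assumption-laden stand-in, not the library's code: `make_batch_source` is a hypothetical name, `random.Random` replaces `numpy.random.RandomState` to keep the example dependency-free, and pairing each window with its one-step-shifted copy as the target is an assumed layout that the description above does not specify.

```python
import random


def make_batch_source(encoded, steps, batch_size, rng=None):
    """Return a callable producing (inputs, targets) index windows.

    Hypothetical stand-in: `encoded` is a list of alphabet indices.
    """
    if rng is None or isinstance(rng, int):
        # None lets Python choose a seed; an int makes the stream reproducible.
        rng = random.Random(rng)
    if len(encoded) <= steps:
        raise ValueError('need more than `steps` encoded characters')

    def batch():
        inputs, targets = [], []
        for _ in range(batch_size):
            # Pick a window of steps + 1 consecutive indices.
            start = rng.randrange(len(encoded) - steps)
            window = encoded[start:start + steps + 1]
            inputs.append(window[:-1])   # indices at times t
            targets.append(window[1:])   # indices at times t + 1 (assumed target)
        return inputs, targets

    return batch


source = make_batch_source(list(range(20)), steps=5, batch_size=3, rng=0)
inputs, targets = source()
```

Each call to `source()` draws fresh windows, so a training loop can call it repeatedly; passing the same integer seed reproduces the same sequence of batches.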
decode(enc)
Decode a sequence of alphabet index values back into a text string.
Parameters: enc : list of int
A sequence of alphabet index values to convert to text.
Returns: txt : str
A string containing corresponding characters from the alphabet.
encode(txt)
Encode a text string by replacing characters with alphabet index.
Parameters: txt : str
A string to encode.
Returns: classes : list of int
A sequence of alphabet index values corresponding to the given text.
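A round trip through encoding and decoding can be illustrated with a small standalone sketch. `make_codec` is a hypothetical helper, and reserving index 0 for the unknown character is an assumption; the reference text above does not state how indices are numbered.

```python
def make_codec(alpha, unknown='\x00'):
    """Build encode/decode functions for an alphabet (hypothetical helper)."""
    # Assumption: index 0 stands for the unknown character and the
    # alphabet characters follow in order.
    symbols = unknown + alpha
    index = {c: i for i, c in enumerate(symbols)}

    def encode(txt):
        # Characters outside the alphabet map to the unknown index.
        return [index.get(c, 0) for c in txt]

    def decode(enc):
        return ''.join(symbols[i] for i in enc)

    return encode, decode


# '?' is used instead of '\x00' so the decoded text stays printable.
encode, decode = make_codec('abc', unknown='?')
classes = encode('cabx')
restored = decode(classes)
```

Decoding recovers the original text except where a character was outside the alphabet; those positions come back as the unknown character.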