Models

Classes:

`ALBERT`([name])	Uses ALBERT from TensorFlow Hub as the embedding layer: https://tfhub.dev/google/albert_base/3
`BERT`([name])	Uses BERT from TensorFlow Hub as the embedding layer, based on: https://github.com/strongio/keras-bert/blob/master/keras-bert.py
`BiGRU`([name])
`BiGRUAttn`([name])
`BiLSTM`([name])
`BiLSTMAttn`([name])
`CNN`([name])
`CNNAttn`([name])
`ConveRT`([name])	Uses the ConveRT TensorFlow Hub module as the embedding layer, from: https://github.com/PolyAI-LDN/polyai-models
`DCNN`([name])	Implements the DCNN model from Kalchbrenner et al. (2014).
`DCNNAttn`([name])	Implements the DCNN model from Kalchbrenner et al. (2014).
`DeepBiGRU`([name])
`DeepBiGRUAttn`([name])
`DeepBiLSTM`([name])
`DeepBiLSTMAttn`([name])
`DeepGRU`([name])
`DeepGRUAttn`([name])
`DeepLSTM`([name])
`DeepLSTMAttn`([name])
`ELMo`([name])	Uses ELMo from TensorFlow Hub as the embedding layer, based on: https://github.com/strongio/keras-elmo/blob/master/Elmo%20Keras.ipynb and https://github.com/JHart96/keras_elmo_embedding_layer/blob/master/elmo.py
`GRU`([name])
`GRUAttn`([name])
`GRUCRF`([name])	Uses CRF layer from keras_contrib: https://github.com/keras-team/keras-contrib/tree/master/keras_contrib
`LSTM`([name])
`LSTMAttn`([name])
`LSTMCRF`([name])	Uses CRF layer from keras_contrib: https://github.com/keras-team/keras-contrib/tree/master/keras_contrib
`MLSTMCharLM`([name])	Uses an mLSTM Character Language Model as embedding layer from: https://github.com/openai/generating-reviews-discovering-sentiment
`Model`([name])	Model abstract class.
`NeuralNetworkLanguageModel`([name])	Uses the Neural Network Language Model from TensorFlow Hub.
`RCNN`([name])	Implements the Recurrent Convolutional Network from Lai et al. (2015).
`RCNNAttn`([name])	Implements the Recurrent Convolutional Network from Lai et al. (2015).
`TextCNN`([name])	Implements the Text CNN model from Kim (2014).
`TextCNNAttn`([name])	Implements the Text CNN model from Kim (2014).
`UniversalSentenceEncoder`([name])	Uses the Universal Sentence Encoder from TensorFlow Hub as the embedding layer.

Functions:

get_model(model_name, input_shape, …[, …])

Utility function for returning a Keras model.

class models.ALBERT(name='ALBERT')

Bases: models.Model

Uses ALBERT from TensorFlow Hub as the embedding layer: https://tfhub.dev/google/albert_base/3

Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. arXiv preprint arXiv:1909.11942, 2019.

Module URLs:

albert_base - “https://tfhub.dev/google/albert_base/3”
albert_large - “https://tfhub.dev/google/albert_large/3”
albert_xlarge - “https://tfhub.dev/google/albert_xlarge/3”
albert_xxlarge - “https://tfhub.dev/google/albert_xxlarge/3”
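The four ALBERT module URLs can be kept in a small lookup table so that a variant is selected by name. The sketch below is illustrative only: `ALBERT_MODULE_URLS` and `albert_module_url` are hypothetical names, and this reference does not say whether the ALBERT class accepts a variant name in this form.

```python
# Variant names and URLs copied from the module list for models.ALBERT.
ALBERT_MODULE_URLS = {
    "albert_base": "https://tfhub.dev/google/albert_base/3",
    "albert_large": "https://tfhub.dev/google/albert_large/3",
    "albert_xlarge": "https://tfhub.dev/google/albert_xlarge/3",
    "albert_xxlarge": "https://tfhub.dev/google/albert_xxlarge/3",
}


def albert_module_url(variant="albert_base"):
    """Hypothetical helper: return the TensorFlow Hub URL for an ALBERT variant."""
    try:
        return ALBERT_MODULE_URLS[variant]
    except KeyError:
        known = ", ".join(sorted(ALBERT_MODULE_URLS))
        raise ValueError(f"Unknown ALBERT variant {variant!r}; expected one of: {known}") from None


print(albert_module_url("albert_large"))
```

Failing loudly on an unknown name avoids silently falling back to the base module.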

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.BERT(name='BERT')

Bases: models.Model

Uses BERT from TensorFlow Hub as the embedding layer, based on: https://github.com/strongio/keras-bert/blob/master/keras-bert.py

Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv preprint arXiv:1810.04805, 2018.

Module URLs:

bert_base - “https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1”
bert_large - “https://tfhub.dev/google/bert_uncased_L-24_H-1024_A-16/1”

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.BiGRU(name='BiGRU')

Bases: models.Model

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.BiGRUAttn(name='BiGRUAttn')

Bases: models.Model

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.BiLSTM(name='BiLSTM')

Bases: models.Model

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.BiLSTMAttn(name='BiLSTMAttn')

Bases: models.Model

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.CNN(name='CNN')

Bases: models.Model

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.CNNAttn(name='CNNAttn')

Bases: models.Model

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.ConveRT(name='ConveRT')

Bases: models.Model

Uses the ConveRT TensorFlow Hub module as the embedding layer, from: https://github.com/PolyAI-LDN/polyai-models

Henderson, M., Casanueva, I., Mrkšić, N., Su, P.-H., Wen, T.-H. and Vulić, I. (2019) ConveRT: Efficient and Accurate Conversational Representations from Transformers. arXiv [online]. Available from: http://arxiv.org/abs/1911.03688 [Accessed 13 November 2019].

Module url: “http://models.poly-ai.com/convert/v1/model.tar.gz”

Note: Requires tensorflow-text to be installed (currently unavailable on Windows).

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.DCNN(name='DCNN')

Bases: models.Model

Implements the DCNN model from:

Kalchbrenner, N., Grefenstette, E. and Blunsom, P. (2014) A Convolutional Neural Network for Modelling Sentences. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics.

Model code from: “https://github.com/AlexYangLi/TextClassification”
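The distinguishing operation in the DCNN paper is k-max pooling, which keeps the k largest activations of a sequence in their original order, with k chosen per layer from the sentence length. The sketch below illustrates that idea in plain Python on a single feature row. It is not the code from the linked repository, and the function names are made up for this example.

```python
def k_max_pooling(values, k):
    """Return the k largest values of a sequence, keeping their original order."""
    if k <= 0:
        return []
    # Rank positions by value (ties favour earlier positions), keep the top k,
    # then restore positional order.
    ranked = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return [values[i] for i in sorted(ranked[:k])]


def dynamic_k(layer_index, num_layers, sentence_length, k_top):
    """Pooling size for a layer: max(k_top, ceil((L - l) / L * s))."""
    numerator = (num_layers - layer_index) * sentence_length
    ceil_div = (numerator + num_layers - 1) // num_layers  # integer ceiling
    return max(k_top, ceil_div)


row = [0.2, 0.9, 0.1, 0.7, 0.4]
print(k_max_pooling(row, 3))  # → [0.9, 0.7, 0.4]
print([dynamic_k(layer, 3, 18, 3) for layer in (1, 2, 3)])  # → [12, 6, 3]
```

In a full model the pooling is applied independently to every row of each feature map, which this one-dimensional sketch leaves out.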

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.DCNNAttn(name='DCNNAttn')

Bases: models.Model

Implements the DCNN model from:

Kalchbrenner, N., Grefenstette, E. and Blunsom, P. (2014) A Convolutional Neural Network for Modelling Sentences. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics.

Model code from: “https://github.com/AlexYangLi/TextClassification”

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.DeepBiGRU(name='DeepBiGRU')

Bases: models.Model

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.DeepBiGRUAttn(name='DeepBiGRUAttn')

Bases: models.Model

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.DeepBiLSTM(name='DeepBiLSTM')

Bases: models.Model

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.DeepBiLSTMAttn(name='DeepBiLSTMAttn')

Bases: models.Model

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.DeepGRU(name='DeepGRU')

Bases: models.Model

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.DeepGRUAttn(name='DeepGRUAttn')

Bases: models.Model

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.DeepLSTM(name='DeepLSTM')

Bases: models.Model

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.DeepLSTMAttn(name='DeepLSTMAttn')

Bases: models.Model

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.ELMo(name='ELMo')

Bases: models.Model

Uses ELMo from TensorFlow Hub as the embedding layer, based on: https://github.com/strongio/keras-elmo/blob/master/Elmo%20Keras.ipynb and https://github.com/JHart96/keras_elmo_embedding_layer/blob/master/elmo.py

Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer. Deep contextualized word representations. arXiv preprint arXiv:1802.05365, 2018.

Module url: “https://tfhub.dev/google/elmo/2”

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.GRU(name='GRU')

Bases: models.Model

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.GRUAttn(name='GRUAttn')

Bases: models.Model

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.GRUCRF(name='GRUCRF')

Bases: models.Model

Uses CRF layer from keras_contrib: https://github.com/keras-team/keras-contrib/tree/master/keras_contrib

Note: The labels must be of shape [batch_size, num_labels, 1]
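Labels are often stored as a two-dimensional [batch_size, num_labels] structure, so they need a trailing axis of size 1 before being passed to the CRF models. The sketch below shows the reshaping with plain nested lists; `add_trailing_axis` and `shape_of` are illustrative helpers, not part of this module. With NumPy arrays, `np.expand_dims(labels, axis=-1)` has the same effect.

```python
def add_trailing_axis(labels):
    """Reshape [batch_size, num_labels] labels to [batch_size, num_labels, 1]."""
    return [[[label] for label in sequence] for sequence in labels]


def shape_of(nested):
    """Return the shape of a regular nested list as a tuple."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0] if nested else None
    return tuple(shape)


labels = [[0, 2, 1], [1, 1, 0]]       # batch of 2 sequences, 3 labels each
reshaped = add_trailing_axis(labels)
print(shape_of(labels), shape_of(reshaped))  # → (2, 3) (2, 3, 1)
```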

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.LSTM(name='LSTM')

Bases: models.Model

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.LSTMAttn(name='LSTMAttn')

Bases: models.Model

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.LSTMCRF(name='LSTMCRF')

Bases: models.Model

Uses CRF layer from keras_contrib: https://github.com/keras-team/keras-contrib/tree/master/keras_contrib

Note: The labels must be of shape [batch_size, num_labels, 1]

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.MLSTMCharLM(name='mLSTMCharLM')

Bases: models.Model

Uses an mLSTM Character Language Model as embedding layer from: https://github.com/openai/generating-reviews-discovering-sentiment

Radford, A., Jozefowicz, R. and Sutskever, I. (2018) ‘Learning to Generate Reviews and Discovering Sentiment’, arXiv. Available at: http://arxiv.org/abs/1704.01444

With return_type='mean' and max_seq_length=64, implements the model described in: Bothe, C. et al. (2018) ‘A Context-based Approach for Dialogue Act Recognition using Simple Recurrent Neural Networks’, in Eleventh International Conference on Language Resources and Evaluation (LREC 2018).

Note: batch_size and max_seq_length must be manually set for the MLSTMCharLMLayer, see mlstm_char_lm_layer.py. Default embedding dimension is 4,096.

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.Model(name='model')

Bases: object

Model abstract class.

Methods:

build_model(input_shape, output_shape, …)

Defines the model architecture using the Keras functional API.

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

Defines the model architecture using the Keras functional API.

Example of a two-layer feed-forward network with embeddings:

    # Unpack keyword arguments
    dense_units = kwargs.get('dense_units', 100)

    # Define model
    inputs = tf.keras.Input(shape=input_shape, name='input_layer')
    x = tf.keras.layers.Embedding(input_dim=embedding_matrix.shape[0],  # Vocab size
                                  output_dim=embedding_matrix.shape[1],  # Embedding dim
                                  embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
                                  input_length=input_shape[0],  # Max seq length
                                  trainable=train_embeddings,
                                  name='embedding_layer')(inputs)
    x = tf.keras.layers.GlobalMaxPooling1D(name='global_pool')(x)
    x = tf.keras.layers.Dense(dense_units, activation='relu', name='dense_1')(x)
    outputs = tf.keras.layers.Dense(output_shape, activation='softmax', name='output_layer')(x)

    # Create keras model
    model = tf.keras.Model(inputs=inputs, outputs=outputs, name=self.name)

    # Create optimiser (optimiser and learning_rate are assumed to have been
    # unpacked from kwargs in the same way as dense_units)
    optimiser = optimisers.get_optimiser(optimiser_type=optimiser, lr=learning_rate, **kwargs)

    # Compile the model
    model.compile(loss='sparse_categorical_crossentropy', optimizer=optimiser, metrics=['accuracy'])

Parameters

input_shape (tuple) – The input shape excluding batch size, i.e. (sequence_length,)
output_shape (int) – The output shape, i.e. number of classes to predict
embedding_matrix (np.array) – A matrix of vocabulary_size rows and embedding_dim columns
train_embeddings (bool) – Whether to update the embeddings during training (False keeps them fixed)
**kwargs (dict) – Optional dictionary of model parameters to use for specific implementations

Returns

Keras model instance

Return type

model (tf.keras.Model)
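As a rough illustration of how the first three arguments relate to each other, the sketch below derives them from a toy vocabulary. The vocabulary, vectors and label names are invented for the example. It uses plain Python lists to stay dependency-free, whereas the documented example reads `embedding_matrix.shape`, so in practice the matrix would first be converted to an array (for instance with `np.asarray`).

```python
# Toy data: every value here is made up for illustration.
vocabulary = ["<pad>", "hello", "world", "goodbye"]
word_vectors = {
    "hello": [0.1, 0.2, 0.3],
    "world": [0.4, 0.5, 0.6],
    "goodbye": [0.7, 0.8, 0.9],
}
labels = ["greeting", "farewell"]
embedding_dim = 3
max_seq_length = 5

# One row per vocabulary entry; words without a vector (here "<pad>") get zeros.
embedding_matrix = [
    word_vectors.get(word, [0.0] * embedding_dim) for word in vocabulary
]

input_shape = (max_seq_length,)  # sequence length only, no batch dimension
output_shape = len(labels)       # number of classes to predict

matrix_shape = (len(embedding_matrix), len(embedding_matrix[0]))
print(input_shape, output_shape, matrix_shape)  # → (5,) 2 (4, 3)
```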

class models.NeuralNetworkLanguageModel(name='NeuralNetworkLanguageModel')

Bases: models.Model

Uses the Neural Network Language Model from TensorFlow Hub.

Yoshua Bengio, Rejean Ducharme, Pascal Vincent, Christian Jauvin. A Neural Probabilistic Language Model. Journal of Machine Learning Research, 3:1137-1155, 2003.

Module url: “https://tfhub.dev/google/nnlm-en-dim128/1”

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.RCNN(name='RCNN')

Bases: models.Model

Implements the Recurrent Convolutional Network from:

Lai, S. et al. (2015) ‘Recurrent Convolutional Neural Networks for Text Classification’, in Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI’15).

Model code from: “https://github.com/AlexYangLi/TextClassification”

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.RCNNAttn(name='RCNNAttn')

Bases: models.Model

Implements the Recurrent Convolutional Network from:

Lai, S. et al. (2015) ‘Recurrent Convolutional Neural Networks for Text Classification’, in Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI’15).

Model code from: “https://github.com/AlexYangLi/TextClassification”

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.TextCNN(name='TextCNN')

Bases: models.Model

Implements the Text CNN model from:

Kim, Y. (2014). Convolutional Neural Networks for Sentence Classification. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)
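The core of the cited architecture is a set of convolution filters of different widths slid over the word embeddings, each followed by max-over-time pooling. The sketch below shows that feature extraction step in plain Python. It is an illustration under simplifying assumptions (one filter per width, ReLU activation, no padding, no dropout or softmax layer), not the implementation in this module, and all function names are hypothetical.

```python
def conv1d_valid(sequence, kernel, bias=0.0):
    """Slide a kernel spanning len(kernel) words over a sequence of embeddings."""
    window = len(kernel)
    features = []
    for start in range(len(sequence) - window + 1):
        total = bias
        for offset in range(window):
            word_vector = sequence[start + offset]
            total += sum(x * w for x, w in zip(word_vector, kernel[offset]))
        features.append(max(0.0, total))  # ReLU non-linearity
    return features


def max_over_time(features):
    """Keep the single strongest activation of a feature map."""
    return max(features)


def text_cnn_features(sequence, kernels):
    """One pooled feature per kernel; kernels may have different widths."""
    return [max_over_time(conv1d_valid(sequence, kernel)) for kernel in kernels]


# Four words with 2-dimensional embeddings, one kernel of width 2 and one of width 3.
sequence = [[1, 0], [0, 1], [1, 1], [0, 0]]
kernels = [
    [[1, 0], [0, 1]],
    [[1, 1], [1, 1], [1, 1]],
]
print(text_cnn_features(sequence, kernels))  # → [2.0, 4.0]
```

Because pooling reduces each feature map to one value, the output length depends only on the number of filters, not on the sentence length. Sequences shorter than the widest kernel would need padding first.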

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.TextCNNAttn(name='TextCNNAttn')

Bases: models.Model

Implements the Text CNN model from:

Kim, Y. (2014). Convolutional Neural Networks for Sentence Classification. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

class models.UniversalSentenceEncoder(name='UniversalSentenceEncoder')

Bases: models.Model

Uses the Universal Sentence Encoder from TensorFlow Hub as the embedding layer.

Daniel Cer, Yinfei Yang, Sheng-yi Kong, Nan Hua, Nicole Limtiaco, Rhomni St. John, Noah Constant, Mario Guajardo-Cespedes, Steve Yuan, Chris Tar, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil. Universal Sentence Encoder. arXiv:1803.11175, 2018.

Module url: “https://tfhub.dev/google/universal-sentence-encoder-large/3”

Methods:

build_model(input_shape, output_shape, …)

build_model(input_shape, output_shape, embedding_matrix, train_embeddings=True, **kwargs)

models.get_model(model_name, input_shape, output_shape, model_params, embeddings=None, train_embeddings=True)

Utility function for returning a Keras model.

Parameters

model_name (str) – The name of the model
input_shape (tuple) – The input shape excluding batch size, i.e. (sequence_length,)
output_shape (int) – The output shape, i.e. number of classes to predict
model_params (dict) – Optional dictionary of model parameters to use for specific implementations
embeddings (np.array) – A matrix of vocabulary_size rows and embedding_dim columns
train_embeddings (bool) – Whether to update the embeddings during training (False keeps them fixed)

Returns

Keras model instance

Return type

model (tf.keras.Model)
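One plausible way for a utility like this to work is to look the name up in a registry of Model subclasses and forward the remaining arguments to build_model. The sketch below illustrates that pattern with stand-in classes. `MODEL_REGISTRY`, `ToyCNN`, `get_model_sketch` and the case-insensitive lookup are assumptions made for the example, not the actual implementation, and the toy build_model returns a dict where a real subclass would return a compiled tf.keras.Model.

```python
class Model:
    """Stand-in for the abstract base class documented here."""

    def __init__(self, name="model"):
        self.name = name

    def build_model(self, input_shape, output_shape, embedding_matrix,
                    train_embeddings=True, **kwargs):
        raise NotImplementedError


class ToyCNN(Model):
    """Hypothetical subclass used only to demonstrate the dispatch."""

    def __init__(self, name="ToyCNN"):
        super().__init__(name)

    def build_model(self, input_shape, output_shape, embedding_matrix,
                    train_embeddings=True, **kwargs):
        # A real subclass would build and compile a Keras model here.
        return {
            "name": self.name,
            "input_shape": input_shape,
            "output_shape": output_shape,
            "train_embeddings": train_embeddings,
            "params": kwargs,
        }


# Hypothetical registry mapping lower-cased model names to classes.
MODEL_REGISTRY = {"toycnn": ToyCNN}


def get_model_sketch(model_name, input_shape, output_shape, model_params,
                     embeddings=None, train_embeddings=True):
    """Look up a model class by name and build it with the given parameters."""
    try:
        model_class = MODEL_REGISTRY[model_name.lower()]
    except KeyError:
        raise ValueError(f"Unknown model name: {model_name!r}") from None
    return model_class().build_model(input_shape, output_shape, embeddings,
                                     train_embeddings, **model_params)


result = get_model_sketch("ToyCNN", (5,), 2, {"dense_units": 64})
print(result["name"], result["params"])
```

Unpacking model_params as keyword arguments matches the **kwargs parameter of build_model, which is how per-model settings such as dense_units reach the architecture code.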