TY - CPAPER
T1 - Attentive Language Models
AU - Salton, Giancarlo D.
AU - Ross, Robert J.
AU - Kelleher, John D.
N1 - Publisher Copyright:
©2017 AFNLP.
PY - 2017
Y1 - 2017
AB - In this paper, we extend Recurrent Neural Network Language Models (RNN-LMs) with an attention mechanism. We show that an Attentive RNN-LM (with 14.5M parameters) achieves better perplexity than larger RNN-LMs (with 66M parameters) and performance comparable to an ensemble of 10 similarly sized RNN-LMs. We also show that an Attentive RNN-LM needs less contextual information to achieve results comparable to the state of the art on the WikiText-2 dataset.
UR - https://www.scopus.com/pages/publications/105019712180
M3 - Conference contribution
AN - SCOPUS:105019712180
T3 - 8th International Joint Conference on Natural Language Processing - Proceedings of the IJCNLP 2017, System Demonstrations
SP - 441
EP - 450
BT - 8th International Joint Conference on Natural Language Processing - Proceedings of the IJCNLP 2017, System Demonstrations
PB - Association for Computational Linguistics (ACL)
T2 - 8th International Joint Conference on Natural Language Processing, IJCNLP 2017
Y2 - 27 November 2017 through 1 December 2017
ER -