Attentive Language Models

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this paper, we extend Recurrent Neural Network Language Models (RNN-LMs) with an attention mechanism. We show that an Attentive RNN-LM (with 14.5M parameters) achieves a better perplexity than larger RNN-LMs (with 66M parameters) and performs comparably to an ensemble of 10 similarly sized RNN-LMs. We also show that an Attentive RNN-LM needs less contextual information to achieve results similar to the state of the art on the WikiText-2 dataset.
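The abstract only names the architecture, an RNN-LM extended with attention. As a rough illustration, the sketch below shows one common way to let each prediction attend over the current and previous hidden states of the RNN before producing output logits. The layer sizes, the dot-product scoring function, and the tanh combination layer are assumptions made for illustration, not the paper's exact formulation.

```python
# Minimal sketch of an attention-augmented RNN language model (PyTorch).
# Hyperparameters and the attention scoring function are illustrative
# assumptions; they are not taken from the paper.
import torch
import torch.nn as nn

class AttentiveRNNLM(nn.Module):
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Projects [current hidden state; attention context] back to hidden_dim.
        self.combine = nn.Linear(2 * hidden_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer token ids
        h, _ = self.rnn(self.embed(tokens))               # (batch, seq_len, hidden)
        logits = []
        for t in range(h.size(1)):
            query = h[:, t]                                # (batch, hidden)
            keys = h[:, : t + 1]                           # (batch, t+1, hidden)
            # Dot-product attention over the states seen so far.
            scores = torch.bmm(keys, query.unsqueeze(2)).squeeze(2)
            weights = torch.softmax(scores, dim=1)         # (batch, t+1)
            context = torch.bmm(weights.unsqueeze(1), keys).squeeze(1)
            merged = torch.tanh(self.combine(torch.cat([query, context], dim=1)))
            logits.append(self.out(merged))
        return torch.stack(logits, dim=1)                  # (batch, seq_len, vocab)
```

Trained with a standard cross-entropy objective over next-token targets, a model of this shape lets the attention weights supply longer-range context than the recurrent state alone, which is the effect the abstract attributes to the Attentive RNN-LM.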

Original language: English
Title of host publication: 8th International Joint Conference on Natural Language Processing - Proceedings of the IJCNLP 2017, System Demonstrations
Publisher: Association for Computational Linguistics (ACL)
Pages: 441-450
Number of pages: 10
ISBN (Electronic): 9781948087025
Publication status: Published - 2017
Event: 8th International Joint Conference on Natural Language Processing, IJCNLP 2017 - Taipei, Taiwan
Duration: 27 Nov 2017 - 1 Dec 2017

Publication series

Name: 8th International Joint Conference on Natural Language Processing - Proceedings of the IJCNLP 2017, System Demonstrations
Volume: 1

Conference

Conference: 8th International Joint Conference on Natural Language Processing, IJCNLP 2017
Country/Territory: Taiwan
City: Taipei
Period: 27/11/17 - 1/12/17
