Evaluation of a substitution method for idiom transformation in statistical machine translation

Giancarlo D. Salton, Robert J. Ross, John D. Kelleher

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

We evaluate a substitution-based technique for improving Statistical Machine Translation performance on idiomatic multiword expressions. The method replaces each idiom in the source text with its literal meaning before translation; a second substitution step, applied after translation, replaces the literal meanings with idioms. We detail our approach, outline our implementation, and provide an evaluation of the method for the language pair English/Brazilian-Portuguese. Our results show improvements in translation accuracy on sentences containing either morphosyntactically constrained or unconstrained idioms. We discuss the consequences of our results and outline potential extensions to this process.
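The two substitution steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dictionaries, the function names, and the `toy_translate` stand-in for an SMT system are all hypothetical, and the idiom pair is only an illustrative example. The sketch also assumes that the second substitution is applied only when the first one changed the sentence.

```python
import re
from typing import Callable, Dict

# Hypothetical toy dictionaries; the entries are illustrative, not from the paper.
SOURCE_IDIOM_TO_LITERAL: Dict[str, str] = {"kicked the bucket": "died"}
TARGET_LITERAL_TO_IDIOM: Dict[str, str] = {"morreu": "bateu as botas"}


def substitute(text: str, mapping: Dict[str, str]) -> str:
    """Replace every whole-phrase occurrence of a key with its mapped value."""
    for phrase, replacement in mapping.items():
        pattern = r"\b" + re.escape(phrase) + r"\b"
        # A function replacement avoids any escape processing of the value.
        text = re.sub(pattern, lambda _m, r=replacement: r, text, flags=re.IGNORECASE)
    return text


def translate_with_idiom_substitution(
    sentence: str, translate: Callable[[str], str]
) -> str:
    """Pre-substitute idioms, translate, then post-substitute literal meanings."""
    literal_source = substitute(sentence, SOURCE_IDIOM_TO_LITERAL)
    translated = translate(literal_source)
    # Assumption of this sketch: restore an idiom only if one was removed.
    if literal_source == sentence:
        return translated
    return substitute(translated, TARGET_LITERAL_TO_IDIOM)


def toy_translate(sentence: str) -> str:
    """Hypothetical stand-in for an SMT system: a one-entry lookup table."""
    table = {"the old man died": "o velho morreu"}
    return table[sentence.lower()]


result = translate_with_idiom_substitution(
    "The old man kicked the bucket", toy_translate
)
```

In a real pipeline the `translate` argument would wrap an actual translation system, and the dictionaries would need to cover inflected forms of each idiom, which this sketch does not attempt.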

Original language: English
Title of host publication: MWE 2014 - Proceedings of the 10th Workshop on Multiword Expressions, in conjunction with the 14th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2014
Publisher: Association for Computational Linguistics (ACL)
Pages: 38-42
Number of pages: 5
ISBN (Electronic): 9781937284879
Publication status: Published - 2014
Event: 10th Workshop on Multiword Expressions, MWE 2014 - Gothenburg, Sweden
Duration: 26 Apr 2014 to 27 Apr 2014

Publication series

Name: MWE 2014 - Proceedings of the 10th Workshop on Multiword Expressions, in conjunction with the 14th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2014

Conference

Conference: 10th Workshop on Multiword Expressions, MWE 2014
Country/Territory: Sweden
City: Gothenburg
Period: 26/04/14 to 27/04/14

Keywords

  • Statistical Machine Translation
  • idiomatic multiword expressions
  • substitution technique
  • translation accuracy
  • English/Brazilian-Portuguese
