Exploiting Wikipedia for entity name disambiguation in tweets

Muhammad Atif Qureshi, Colm O'Riordan, Gabriella Pasi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Social media repositories serve as a significant source of evidence when extracting information related to the reputation of a particular entity (e.g., a particular politician, singer or company). Reputation management experts need automated methods for mining social media repositories (in particular Twitter) to monitor the reputation of a particular entity. A significant research challenge in this setting is to disambiguate tweets with respect to entity names. To address this issue, in this paper we use "context phrases" in a tweet and Wikipedia disambiguated articles for a particular entity in a random forest classifier. Furthermore, we utilize the concept of "relatedness" between a tweet and an entity, using the Wikipedia category-article structure to capture the amount of discussion inside a tweet that is related to the entity. The experimental evaluations show a significant improvement over the baseline and performance comparable with other systems, which is a strong result given that we restrict ourselves to features extracted from Wikipedia.

Original language: English
Title of host publication: Natural Language Processing and Information Systems - 19th International Conference on Applications of Natural Language to Information Systems, NLDB 2014, Proceedings
Publisher: Springer Verlag
Pages: 184-195
Number of pages: 12
ISBN (Print): 9783319079820
DOIs
Publication status: Published - 2014
Externally published: Yes
Event: 19th International Conference on Applications of Natural Language to Information Systems, NLDB 2014 - Montpellier, France
Duration: 18 Jun 2014 to 20 Jun 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8455 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 19th International Conference on Applications of Natural Language to Information Systems, NLDB 2014
Country/Territory: France
City: Montpellier
Period: 18/06/14 to 20/06/14
