TY - GEN
T1 - Exploiting Wikipedia for entity name disambiguation in tweets
AU - Qureshi, Muhammad Atif
AU - O'Riordan, Colm
AU - Pasi, Gabriella
PY - 2014
Y1 - 2014
AB - Social media repositories serve as a significant source of evidence when extracting information related to the reputation of a particular entity (e.g., a particular politician, singer or company). Reputation management experts are in need of automated methods for mining the social media repositories (in particular Twitter) to monitor the reputation of a particular entity. A quite significant research challenge related to the above issue is to disambiguate tweets with respect to entity names. To address this issue in this paper we use "context phrases" in a tweet and Wikipedia disambiguated articles for a particular entity in a random forest classifier. Furthermore, we also utilize the concept of "relatedness" between tweet and entity using the Wikipedia category-article structure that captures the amount of discussion present inside a tweet related to an entity. The experimental evaluations show a significant improvement over the baseline and comparable performance with other systems representing strong performance given that we restrict ourselves to features extracted from Wikipedia.
UR - http://www.scopus.com/inward/record.url?scp=84958536489&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-07983-7_25
DO - 10.1007/978-3-319-07983-7_25
M3 - Conference contribution
AN - SCOPUS:84958536489
SN - 9783319079820
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 184
EP - 195
BT - Natural Language Processing and Information Systems - 19th International Conference on Applications of Natural Language to Information Systems, NLDB 2014, Proceedings
PB - Springer Verlag
T2 - 19th International Conference on Applications of Natural Language to Information Systems, NLDB 2014
Y2 - 18 June 2014 through 20 June 2014
ER -