Opinion Mining on Small and Noisy Samples of Health-Related Texts

Liliya Akhtyamova, Mikhail Alexandrov, John Cardiff, Oleksiy Koshulko

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The topic of people’s health has always attracted the attention of public and private structures, the patients themselves and, therefore, researchers. Social networks provide an immense amount of data for the analysis of health-related issues; however, researchers do not always have enough data to build sophisticated models. In this paper, we artificially create this limitation to test the performance and stability of several popular algorithms on small samples of texts. Apart from the sample size, this research has two distinctive features: (a) instead of the usual 5-star classification, we use combined classes that reflect a more practical view of medicines and treatments; (b) we consider both original and noisy data. The experiments were carried out using data extracted from the popular forum AskaPatient. The GridSearchCV technique was used for tuning parameters. The results show that, when dealing with small and noisy data samples, GMDH Shell is superior to the other methods. The work has a practical orientation.

Original language: English
Title of host publication: Advances in Intelligent Systems and Computing III - Selected Papers from the International Conference on Computer Science and Information Technologies, CSIT 2018
Editors: Mykola O. Medykovskyy, Natalia Shakhovska
Publisher: Springer Verlag
Pages: 379-390
Number of pages: 12
ISBN (Print): 9783030010683
Publication status: Published - 2019
Event: International Conference on Computer Science and Information Technologies, CSIT 2018 - Lviv, Ukraine
Duration: 11 Sep 2018 to 14 Sep 2018

Publication series

Name: Advances in Intelligent Systems and Computing
Volume: 871
ISSN (Print): 2194-5357

Conference

Conference: International Conference on Computer Science and Information Technologies, CSIT 2018
Country/Territory: Ukraine
City: Lviv
Period: 11/09/18 to 14/09/18

Keywords

  • Classification
  • GMDH
  • Health social networks
  • Noise immunity
  • Unbalanced data
