TY - GEN
T1 - Opinion Mining on Small and Noisy Samples of Health-Related Texts
AU - Akhtyamova, Liliya
AU - Alexandrov, Mikhail
AU - Cardiff, John
AU - Koshulko, Oleksiy
N1 - Publisher Copyright: © 2019, Springer Nature Switzerland AG.
PY - 2019
Y1 - 2019
N2 - The topic of people’s health has always attracted the attention of public and private structures, the patients themselves and, therefore, researchers. Social networks provide an immense amount of data for the analysis of health-related issues; however, researchers do not always have enough data to build sophisticated models. In this paper, we artificially create this limitation to test the performance and stability of different popular algorithms on small samples of texts. Apart from the sample size, this research has two specific features: (a) instead of the usual 5-star classification, we use combined classes reflecting a more practical view on medicines and treatments; (b) we consider both original and noisy data. The experiments were carried out using data extracted from the popular forum AskaPatient. Parameters were tuned using the GridSearchCV technique. The results show that, in dealing with small and noisy data samples, GMDH Shell is superior to the other methods. The work has a practical orientation.
AB - The topic of people’s health has always attracted the attention of public and private structures, the patients themselves and, therefore, researchers. Social networks provide an immense amount of data for the analysis of health-related issues; however, researchers do not always have enough data to build sophisticated models. In this paper, we artificially create this limitation to test the performance and stability of different popular algorithms on small samples of texts. Apart from the sample size, this research has two specific features: (a) instead of the usual 5-star classification, we use combined classes reflecting a more practical view on medicines and treatments; (b) we consider both original and noisy data. The experiments were carried out using data extracted from the popular forum AskaPatient. Parameters were tuned using the GridSearchCV technique. The results show that, in dealing with small and noisy data samples, GMDH Shell is superior to the other methods. The work has a practical orientation.
KW - Classification
KW - GMDH
KW - Health social networks
KW - Noise immunity
KW - Unbalanced data
UR - http://www.scopus.com/inward/record.url?scp=85057843122&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-01069-0_27
DO - 10.1007/978-3-030-01069-0_27
M3 - Conference contribution
AN - SCOPUS:85057843122
SN - 9783030010683
T3 - Advances in Intelligent Systems and Computing
SP - 379
EP - 390
BT - Advances in Intelligent Systems and Computing III - Selected Papers from the International Conference on Computer Science and Information Technologies, CSIT 2018
A2 - Medykovskyy, Mykola O.
A2 - Shakhovska, Natalia
PB - Springer Verlag
T2 - International Conference on Computer Science and Information Technologies, CSIT 2018
Y2 - 11 September 2018 through 14 September 2018
ER -