Classifiers for Yelp-Reviews Based on GMDH-Algorithms

  • Mikhail Alexandrov
  • Gabriella Skitalinskaya
  • John Cardiff
  • Olexiy Koshulko
  • Elena Shushkevich

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Yelp is one of the most popular international web resources about products and services; it provides users with useful information on local businesses and helps business owners make their businesses more attractive to users. The Yelp dataset consists of attributes describing each business, reviews in free-text form, and numeric star ratings out of 5. The utility of this dataset has prompted dozens of publications on rating classifiers, which use various sophisticated opinion-mining tools. Unlike them, in this paper we propose simpler approaches, namely: (a) selection of descriptors based on term specificity, and (b) formation of classifiers with these descriptors based on inductive modeling. The latter is implemented with the well-known tool GMDH Shell, where GMDH stands for Group Method of Data Handling. This method allows us to build models with high noise immunity. We compare 96 prediction models built on the identified descriptors by combining variants of: (i) preprocessing with data transformation and class balancing; (ii) classification algorithms; and (iii) post-processing with ensembling. Instead of the typical 5-star classification, we consider combined classes that reflect a more practical view of purchasing goods or developing a business. The experiments cover the most popular business categories: restaurants and shopping. To evaluate the quality of the classifiers, we compare against the results of predecessors and also introduce the so-called defensible accuracy. Under this comparison, the results presented in the paper prove to be promising.

Original language: English
Title of host publication: Computational Linguistics and Intelligent Text Processing - 19th International Conference, CICLing 2018, Revised Selected Papers
Editors: Alexander Gelbukh
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 412-430
Number of pages: 19
ISBN (Print): 9783031238031
Publication status: Published - 2023
Event: 19th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2018 - Hanoi, Viet Nam
Duration: 18 Mar 2018 to 24 Mar 2018

Publication series

Name: Lecture Notes in Computer Science
Volume: 13397 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 19th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2018
Country/Territory: Viet Nam
City: Hanoi
Period: 18/03/18 to 24/03/18

Keywords

  • GMDH
  • GMDH shell
  • Opinion mining
  • Text mining
  • Yelp
