Validation of tagging suggestion models for a hotel ticketing corpus

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

This paper investigates methods for predicting tags on a textual corpus of hotel staff entries in a ticketing system. The aim is to improve the tagging process and to find the most suitable method for suggesting tags for a new text entry. The paper consists of two parts: (i) exploration of existing sample data, including statistical analysis and visualisation to provide an overview, and (ii) evaluation of tag prediction approaches. We include approaches from different research fields in order to cover a broad spectrum of possible solutions. Accordingly, we tested a machine learning model for multi-label classification (using gradient boosting), a statistical approach (using frequency heuristics), and two simple similarity-based classification approaches (Nearest Centroid and k-Nearest Neighbours). The experiment comparing the approaches uses recall to measure the quality of the results. Finally, we recommend the modelling approach that achieves the best tag prediction accuracy on the sample data.
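To make the similarity-based idea and the recall measure concrete, the following is a minimal sketch of a k-Nearest Neighbours tag suggester scored with recall. It is not the paper's implementation: the toy tickets, the tag names, the whitespace tokenizer, the Jaccard similarity, and the function names `suggest_tags` and `recall` are all assumptions made for illustration.

```python
from collections import Counter

def tokenize(text):
    # Assumption: lowercase whitespace tokens stand in for real preprocessing.
    return set(text.lower().split())

def jaccard(a, b):
    # Token-set overlap; an assumed similarity, not taken from the paper.
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def suggest_tags(text, corpus, k=3, n_tags=2):
    """Suggest the n_tags most frequent tags among the k most similar tickets."""
    query = tokenize(text)
    neighbours = sorted(
        corpus,
        key=lambda item: jaccard(query, tokenize(item[0])),
        reverse=True,
    )[:k]
    votes = Counter(tag for _, tags in neighbours for tag in tags)
    return [tag for tag, _ in votes.most_common(n_tags)]

def recall(suggested, true_tags):
    """Fraction of the true tags that appear among the suggestions."""
    true_set = set(true_tags)
    if not true_set:
        return 0.0
    return len(set(suggested) & true_set) / len(true_set)

# Hypothetical tickets and tags, invented for this example only.
corpus = [
    ("shower leaking in room 12", ["maintenance", "bathroom"]),
    ("guest requests extra towels", ["housekeeping"]),
    ("broken tap in bathroom", ["maintenance", "bathroom"]),
    ("minibar needs restocking", ["housekeeping", "minibar"]),
]

suggested = suggest_tags("leaking tap in room 7", corpus, k=2, n_tags=2)
score = recall(suggested, ["maintenance", "bathroom"])
print(suggested, score)
```

Recall suits a suggestion setting because it rewards a model for surfacing the tags a user would have chosen, even when it also proposes a few extra ones.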

Original language: English
Title of host publication: 20th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2018 - Proceedings
Editors: Gabriele Anderst-Kotsis, Eric Pardede, Matthias Steinbauer, Maria Indrawan-Santiago, Ivan Luiz Salvadori, Ismail Khalil
Publisher: Association for Computing Machinery
Pages: 15-23
Number of pages: 9
ISBN (Electronic): 9781450364799
Publication status: Published - 19 Nov 2018
Event: 20th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2018 - Yogyakarta, Indonesia
Duration: 19 Nov 2018 → 21 Nov 2018

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 20th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2018
Country/Territory: Indonesia
City: Yogyakarta
Period: 19/11/18 → 21/11/18

Keywords

  • K-Nearest Neighbour
  • Multi-label Classification
  • Natural Language Processing
  • Tag Prediction
