A Systematic Mapping Study of Language Features Identification from Large Text Collection

Diellza Nagavci Mati, Jaumin Ajdari, Bujar Raufi, Mentor Hamiti, Besnik Selimi

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Natural Language Processing (NLP) is an emerging research area. NLP resources are useful for building machine translation systems between language pairs, which helps reduce language barriers. Based on this premise, we focus our research on feature identification from large text collections in the Albanian language. Rule-based or statistical Part-of-Speech (POS) taggers could be used for this task, but they require either considerable time for rule development or a sufficient amount of manually labelled data. In light of this, the contribution of this research lies in exploring cases that are conducive to progress and further development of this field. One of the goals of this paper is to conduct a systematic review that explores and analyzes existing research targeting low-resource languages such as Albanian. We restrict the review to studies published since 2015 in areas relevant to NLP. Given the considerable volume of related research, it is essential to review and outline the state of research and current developments in this specific but important field.

Original language: English
Title of host publication: 2019 8th Mediterranean Conference on Embedded Computing, MECO 2019 - Proceedings
Editors: Radovan Stojanovic, Lech Jozwiak, Budimir Lutovac, Drazen Jurisic
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728117393
Publication status: Published - Jun 2019
Externally published: Yes
Event: 8th Mediterranean Conference on Embedded Computing, MECO 2019 - Budva, Montenegro
Duration: 10 Jun 2019 to 14 Jun 2019

Publication series

Name: 2019 8th Mediterranean Conference on Embedded Computing, MECO 2019 - Proceedings

Conference

Conference: 8th Mediterranean Conference on Embedded Computing, MECO 2019
Country/Territory: Montenegro
City: Budva
Period: 10/06/19 to 14/06/19

Keywords

  • Algorithms
  • Chinese Whispers
  • Clustering
  • component
  • Machine Learning
  • Natural Language Processing
  • Part-of-Speech
