Named Entity Recognition in Spanish Biomedical Literature: Short Review and Bert Model

Liliya Akhtyamova

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

    Abstract

    Named Entity Recognition (NER) is the first step in knowledge acquisition when dealing with an unknown corpus of texts. Once the entities have been extracted, they can be used to form a parameter space and to solve text mining problems such as concept normalization, speech recognition, etc. Recent advances in NER are related to contextualized word embeddings, which transform text into a form that is effective for Deep Learning. In this paper, we show how an NER model detects pharmacological substances, compounds, and proteins in a dataset obtained from the Spanish Clinical Case Corpus (SPACCC). To achieve this goal, we train the BERT language representation model from scratch and fine-tune it for our problem. As expected, this model shows better results than the NER model trained over standard word embeddings. We further conduct an error analysis showing the origins of the models' errors and proposing strategies to further improve the model's quality.

    Original language: English
    Title of host publication: Proceedings of the 26th Conference of Open Innovations Association FRUCT, FRUCT 2020
    Editors: Sergey Balandin, Ilya Paramonov, Tatiana Tyutina
    Publisher: IEEE Computer Society
    Pages: 3-9
    Number of pages: 7
    ISBN (Electronic): 9789526924427
    Publication status: Published - Apr 2020
    Event: 26th Conference of Open Innovations Association FRUCT, FRUCT 2020 - Yaroslavl, Russian Federation
    Duration: 23 Apr 2020 - 24 Apr 2020

    Publication series

    Name: Conference of Open Innovation Association, FRUCT
    Volume: 2020-April
    ISSN (Print): 2305-7254

    Conference

    Conference: 26th Conference of Open Innovations Association FRUCT, FRUCT 2020
    Country/Territory: Russian Federation
    City: Yaroslavl
    Period: 23/04/20 - 24/04/20
