TY - GEN
T1 - Rapid Unsupervised Keyphrase Extraction from Single Document
AU - Popova, Svetlana
AU - Cardiff, John
AU - Danilova, Vera
N1 - Publisher Copyright: © 2024 FRUCT Oy.
PY - 2024
Y1 - 2024
N2 - Keyphrases offer a concise representation of a document's content. They are valuable for improving web search results and enhancing tasks such as document tagging, text classification, or summarization. This makes keyphrase extraction an essential component of text mining. Among the widely used constraints and features in existing keyphrase extraction methods, we identified several effective techniques that have not yet been used together: Part-of-Speech (PoS) restrictions, extended stop-word lists, and position-based features. To address this gap, we propose an approach that leverages automatically extracted extended stop-word lists combined with PoS restrictions in keyphrases, and incorporates positional criteria. The main goal of the work was to develop a fast keyphrase extraction algorithm built upon these three features. Experimental results on the INSPEC and SemEval 2010 datasets demonstrate the effectiveness of the proposed method.
AB - Keyphrases offer a concise representation of a document's content. They are valuable for improving web search results and enhancing tasks such as document tagging, text classification, or summarization. This makes keyphrase extraction an essential component of text mining. Among the widely used constraints and features in existing keyphrase extraction methods, we identified several effective techniques that have not yet been used together: Part-of-Speech (PoS) restrictions, extended stop-word lists, and position-based features. To address this gap, we propose an approach that leverages automatically extracted extended stop-word lists combined with PoS restrictions in keyphrases, and incorporates positional criteria. The main goal of the work was to develop a fast keyphrase extraction algorithm built upon these three features. Experimental results on the INSPEC and SemEval 2010 datasets demonstrate the effectiveness of the proposed method.
UR - https://www.scopus.com/pages/publications/85210841188
U2 - 10.23919/FRUCT64283.2024.10749871
DO - 10.23919/FRUCT64283.2024.10749871
M3 - Conference contribution
AN - SCOPUS:85210841188
T3 - Conference of Open Innovation Association, FRUCT
SP - 609
EP - 616
BT - Proceedings of the 36th Conference of Open Innovations Association FRUCT, FRUCT 2024
A2 - Khlaponin, Yurii
A2 - Balandin, Sergey
PB - IEEE Computer Society
T2 - 36th Conference of Open Innovations Association FRUCT, FRUCT 2024
Y2 - 30 October 2024 through 1 November 2024
ER -