Moving Targets: Addressing Concept Drift in Supervised Models for Hacker Communication Detection

Andrei Lima Queiroz, Brian Keegan, Susan McKeever

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

6 Citations (Scopus)

Abstract

In this paper, we investigate the presence of concept drift in machine learning models that detect hacker communications posted on social media and hacker forums. The supervised models in this experiment are analysed in terms of their performance over time across different sources of data (surface web and deep web). Additionally, to simulate real-world situations, these models are evaluated using time-stamped messages from our datasets that were posted over time on social media platforms. We found that models applied to hacker forums (deep web) show a deterioration in accuracy in less than one year, whereas models applied to Twitter (surface web) show no decrease in accuracy over the same period. The problem is alleviated by retraining the model with new instances and applying weights in order to reduce the effects of concept drift. Although our results indicate that performance degradation due to concept drift is avoided with 50% relabelling, which is challenging in real-world scenarios, our work paves the way for more targeted concept drift solutions that reduce the retraining effort.
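
The two ideas in the abstract, tracking accuracy over time on time-stamped messages and weighting instances when retraining, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `accuracy_by_month` and `recency_weights`, the monthly windowing, and the exponential half-life weighting are all assumptions chosen for illustration, since the abstract does not specify the weighting scheme.

```python
from collections import defaultdict
from datetime import date


def accuracy_by_month(records):
    """Group (date, true_label, predicted_label) records by calendar month
    and return the accuracy for each month, in chronological order.

    A falling sequence of monthly accuracies is one simple symptom of
    concept drift. Monthly windows are an assumption, not the paper's setup.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for day, truth, predicted in records:
        key = (day.year, day.month)
        totals[key] += 1
        hits[key] += int(truth == predicted)
    return {key: hits[key] / totals[key] for key in sorted(totals)}


def recency_weights(days, half_life_days=90):
    """Return one weight per instance, decaying exponentially with age.

    The newest instance gets weight 1.0 and the weight halves every
    `half_life_days`. This decay scheme is hypothetical; the abstract only
    says that weights are applied when retraining.
    """
    newest = max(days)
    return [0.5 ** ((newest - d).days / half_life_days) for d in days]
```

Weights of this kind could be supplied as per-instance weights to any learner that accepts them when the model is retrained on newly labelled messages, so that recent messages influence the model more than older ones.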

Original language: English
Title of host publication: International Conference on Cyber Security and Protection of Digital Services, Cyber Security 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728164281
Publication status: Published - Jun 2020
Externally published: Yes
Event: 2020 International Conference on Cyber Security and Protection of Digital Services, Cyber Security 2020 - Virtual, Online, Ireland
Duration: 15 Jun 2020 to 19 Jun 2020

Publication series

Name: International Conference on Cyber Security and Protection of Digital Services, Cyber Security 2020

Conference

Conference: 2020 International Conference on Cyber Security and Protection of Digital Services, Cyber Security 2020
Country/Territory: Ireland
City: Virtual, Online
Period: 15/06/20 to 19/06/20

Keywords

  • Concept Drift
  • Cyber Security
  • Hacker Communication
  • Machine Learning
  • Software Vulnerabilities
