Drift detection using uncertainty distribution divergence

Patrick Lindstrom, Brian Mac Namee, Sarah Jane Delany

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Concept drift is believed to be prevalent in most data gathered from naturally occurring processes and thus warrants research by the machine learning community. There are myriad approaches to handling concept drift, and they have been shown to do so with varying degrees of success. However, most approaches make the key assumption that labelled data will be available at no labelling cost shortly after classification, an assumption which is often violated. The high labelling cost in many domains provides a strong motivation to reduce the number of labelled instances required to handle concept drift. Explicit detection approaches that do not require labelled instances to detect concept drift show great promise for achieving this. Our approach, Confidence Distribution Batch Detection (CDBD), provides a signal correlated with changes in concept without using labelled data. We also show how this signal, combined with a trigger and a rebuild policy, can maintain classifier accuracy while using a limited amount of labelled data.
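
The idea of a label-free drift signal can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which divergence measure, binning, or trigger CDBD uses, so the Kullback-Leibler divergence, the 10-bin histogram, the pseudo-count smoothing, the fixed threshold, and all function names below are assumptions made for illustration only. The sketch compares the distribution of classifier confidence scores on a reference batch with that of a later batch.

```python
import numpy as np


def confidence_histogram(confidences, bins=10, pseudo_count=1.0):
    """Normalised histogram of confidence scores in [0, 1].

    A pseudo-count is added to every bin (an assumed smoothing choice)
    so that the divergence below never divides by zero.
    """
    counts, _ = np.histogram(confidences, bins=bins, range=(0.0, 1.0))
    smoothed = counts.astype(float) + pseudo_count
    return smoothed / smoothed.sum()


def drift_signal(reference_conf, batch_conf, bins=10):
    """KL divergence of the batch confidence distribution from the reference.

    KL divergence is an assumption here; the abstract only states that the
    signal is computed without labelled data.
    """
    p = confidence_histogram(batch_conf, bins)
    q = confidence_histogram(reference_conf, bins)
    return float(np.sum(p * np.log(p / q)))


def drift_triggered(signal, threshold=0.5):
    """Hypothetical trigger: flag drift when the signal exceeds a fixed threshold."""
    return signal > threshold


# Synthetic confidence scores standing in for real classifier output.
rng = np.random.default_rng(0)
reference = rng.beta(8, 2, size=500)  # mostly confident predictions
similar = rng.beta(8, 2, size=500)    # same underlying distribution
shifted = rng.beta(2, 2, size=500)    # classifier has become less certain

print("similar batch signal:", drift_signal(reference, similar))
print("shifted batch signal:", drift_signal(reference, shifted))
```

In a setup like this, labels would only need to be requested, and the classifier rebuilt, for batches on which the trigger fires, which is where the reduction in labelling cost described above would come from.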

Original language: English
Title of host publication: Proceedings - 11th IEEE International Conference on Data Mining Workshops, ICDMW 2011
Pages: 604-608
Number of pages: 5
Publication status: Published - 2011
Event: 11th IEEE International Conference on Data Mining Workshops, ICDMW 2011 - Vancouver, BC, Canada
Duration: 11 Dec 2011 to 11 Dec 2011

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print): 1550-4786

Conference

Conference: 11th IEEE International Conference on Data Mining Workshops, ICDMW 2011
Country/Territory: Canada
City: Vancouver, BC
Period: 11/12/11 to 11/12/11

Keywords

  • Classifier confidence
  • Concept drift
  • Explicit drift detection
  • Labelling cost
