Using semi-supervised classifiers for credit scoring

K. Kennedy, B. Mac Namee, S. J. Delany

Research output: Contribution to journal › Article › peer-review

Abstract

In credit scoring, low-default portfolios (LDPs) are those for which very little default history exists. This makes it problematic for financial institutions to estimate a reliable probability of a customer defaulting on a loan. Banking regulation (Basel II Capital Accord) and best practice, however, necessitate an accurate and valid estimate of the probability of default. In this article the suitability of semi-supervised one-class classification (OCC) algorithms as a solution to the LDP problem is evaluated. The performance of OCC algorithms is compared with the performance of supervised two-class classification algorithms. This study also investigates the suitability of oversampling, which is a common approach to dealing with LDPs. Assessment of the performance of one- and two-class classification algorithms using nine real-world banking data sets, which have been modified to replicate LDPs, is provided. Our results demonstrate that only in the near or complete absence of defaulters should semi-supervised OCC algorithms be used instead of supervised two-class classification algorithms. Furthermore, we demonstrate for data sets whose class labels are unevenly distributed that optimising the threshold value on classifier output yields, in many cases, an improvement in classification performance. Finally, our results suggest that oversampling produces no overall improvement to the best performing two-class classification algorithms.
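The threshold optimisation mentioned in the abstract can be illustrated with a minimal sketch. It assumes a classifier that outputs a score per customer (higher meaning more likely to default) and selects the cut-off that maximises the harmonic mean of sensitivity and specificity. The metric, the function names and the toy data are assumptions made for illustration only; they are not taken from the article.

```python
def harmonic_mean_score(labels, scores, threshold):
    """Harmonic mean of sensitivity and specificity at a given threshold.

    labels: 1 for defaulter, 0 for non-defaulter (illustrative encoding).
    scores: classifier outputs; a score >= threshold predicts "defaulter".
    """
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    total = sensitivity + specificity
    return 2 * sensitivity * specificity / total if total else 0.0


def best_threshold(labels, scores):
    """Try every observed score as a cut-off and keep the best one."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: harmonic_mean_score(labels, scores, t))


# Toy, made-up data: 8 non-defaulters and 2 defaulters (an uneven class split).
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.12, 0.20, 0.22, 0.25, 0.30, 0.45, 0.35, 0.40]

default_quality = harmonic_mean_score(labels, scores, 0.5)  # conventional 0.5 cut-off
tuned = best_threshold(labels, scores)
tuned_quality = harmonic_mean_score(labels, scores, tuned)
```

On this toy data the conventional 0.5 cut-off flags no defaulters at all, whereas the tuned cut-off recovers both at the cost of one false alarm. In practice the threshold would be chosen on a validation set rather than on the data used for the final evaluation.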

Original language: English
Pages (from-to): 513-529
Number of pages: 17
Journal: Journal of the Operational Research Society
Volume: 64
Issue number: 4
Publication status: Published - Apr 2013

Keywords

  • banking
  • benchmarking
  • credit scoring
  • low-default portfolio
  • one-class classification
  • supervised classification
