Pathological Speech Classification Using a Convolutional Neural Network

Nam H. Trinh, Darragh O'Brien

Research output: Contribution to conference › Paper › peer-review

Abstract

Convolutional Neural Networks (CNNs) have enabled significant improvements across a number of applications in computer vision such as object detection, face recognition and image classification. An audio signal can be visually represented as a spectrogram that captures the time-varying frequency content of the signal. This paper describes how a CNN can be applied to the spectrogram of an audio signal to distinguish pathological from healthy speech. We propose a CNN structure and implement it using Keras to test the approach. A classification accuracy of over 95% is obtained in experiments on two public pathological speech datasets.
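
As an illustration of the spectrogram representation the abstract refers to, the following is a minimal sketch of a log-magnitude spectrogram computed with a short-time Fourier transform in NumPy. The function name `spectrogram`, the frame length, the hop size, the sample rate and the synthetic test tone are all assumptions chosen for the example; they are not the settings or data used in the paper.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram via a short-time Fourier transform.

    frame_len and hop are illustrative values, not the paper's settings.
    Returns an array of shape (frequency_bins, time_frames).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Magnitude of the one-sided FFT of each frame.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Convert to decibels; the small constant avoids log(0).
    return 20.0 * np.log10(magnitude + 1e-10).T

# Hypothetical input: one second of a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)
# Add batch and channel axes so the array has the image-like layout
# (batch, height, width, channels) that a 2D CNN typically expects.
cnn_input = spec[np.newaxis, ..., np.newaxis]
```

Here the frequency axis plays the role of image height and the time axis the role of image width, which is what allows an image-classification CNN to be applied to audio.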
Original language: English
Publication status: Published - 1 Jan 2019
Externally published: Yes
Event: IMVIP 2019: Irish Machine Vision & Image Processing - Technological University Dublin, Dublin, Ireland
Duration: 28 Aug 2019 to 30 Aug 2019

Conference

Conference: IMVIP 2019: Irish Machine Vision & Image Processing
Country/Territory: Ireland
City: Dublin
Period: 28/08/19 to 30/08/19

Keywords

  • Convolutional Neural Networks
  • CNNs
  • computer vision
  • object detection
  • face recognition
  • image classification
  • spectrogram
  • audio signal
  • pathological speech
  • healthy speech
  • Keras
  • classification accuracy
