Obtaining speech assets for judgement analysis on low-pass filtered emotional speech

John Snel, Charlie Cullen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Investigating the emotional content of speech through its acoustic characteristics requires separating the semantic content from the acoustic channel. For natural emotional speech, a widely used method of separating the two channels is cue masking. Our objective is to investigate the use of cue masking on non-acted emotional speech by analyzing the extent to which filtering affects the perception of emotional content in the modified speech material. However, obtaining a corpus of emotional speech can be quite difficult, and verifying its emotional content is a thoroughly discussed issue. Current speech research shows a tendency toward constructing corpora of natural emotion expression. In this paper we outline the procedure used to obtain a corpus of high-audio-quality, natural emotional speech. We review the use of Mood Induction Procedures, which provide a method for obtaining spontaneous emotional speech in a controlled environment. Following this, we propose an experiment to investigate the effects of cue masking on natural emotional speech.
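The cue masking referred to in the abstract is typically realised as a low-pass filter that removes the spectral region carrying intelligible (semantic) content while retaining low-frequency prosodic cues such as pitch and intensity contours. Below is a minimal sketch of such a filter in Python; the file names, the 400 Hz cutoff, and the filter order are illustrative assumptions and are not values taken from the paper.

```python
# Minimal sketch of cue masking by low-pass filtering.
# Assumptions (not from the paper): input file "speech.wav",
# cutoff of 400 Hz, 8th-order Butterworth filter.
import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, sosfiltfilt

def low_pass_mask(in_path, out_path, cutoff_hz=400.0, order=8):
    """Attenuate frequencies above cutoff_hz so that the words become
    unintelligible while prosodic contours are preserved."""
    rate, samples = wavfile.read(in_path)
    samples = samples.astype(np.float64)
    # Design a Butterworth low-pass filter as second-order sections.
    sos = butter(order, cutoff_hz, btype="low", fs=rate, output="sos")
    # Zero-phase filtering avoids shifting the prosodic contours in time.
    masked = sosfiltfilt(sos, samples, axis=0)
    wavfile.write(out_path, rate, masked.astype(np.int16))

low_pass_mask("speech.wav", "speech_masked.wav")
```

The masked output would then be presented to listeners in a judgement study to check whether the perceived emotional content survives the removal of the semantic channel.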

Original language: English
Title of host publication: 2011 IEEE International Conference on Automatic Face & Gesture Recognition (FG)
Pages: 835-840
Number of pages: 6
DOIs
Publication status: Published - 2011
Event: 2011 IEEE International Conference on Automatic Face and Gesture Recognition and Workshops, FG 2011 - Santa Barbara, CA, United States
Duration: 21 Mar 2011 – 25 Mar 2011

Publication series

Name: 2011 IEEE International Conference on Automatic Face and Gesture Recognition and Workshops, FG 2011

Conference

Conference: 2011 IEEE International Conference on Automatic Face and Gesture Recognition and Workshops, FG 2011
Country/Territory: United States
City: Santa Barbara, CA
Period: 21/03/11 – 25/03/11
