Abstract
There are many types of degradation which can occur in Voice over IP (VoIP) calls. Of interest in this work are degradations which occur independently of the codec, hardware or network in use. Specifically, their effect on the subjective and objective quality of the speech is examined. Since no dataset suitable for this purpose exists, a new dataset (TCD-VoIP) has been created and has been made publicly available. The dataset contains speech clips suffering from a range of common call quality degradations, as well as a set of subjective opinion scores on the clips from 24 listeners. The performances of three objective quality metrics (POLQA, ViSQOL and P.563) have been evaluated using the dataset. The results show that full-reference metrics are capable of accurately predicting a variety of common VoIP degradations. They also highlight the outstanding need for a wideband, single-ended, no-reference metric to accurately monitor speech quality for degradations common in VoIP scenarios.
Original language | English |
---|---|
Publication status | Published - 2015 |
Event | Interspeech Conference, Dresden, Germany (6 Sep 2015 → 10 Sep 2015) |
Conference
Conference | Interspeech Conference |
---|---|
Country/Territory | Germany |
City | Dresden |
Period | 6/09/15 → 10/09/15 |
Keywords
- Voice over IP
- VoIP
- degradations
- codec
- hardware
- network
- subjective quality
- objective quality
- TCD-VoIP
- speech clips
- call quality
- subjective opinion scores
- objective quality metrics
- POLQA
- ViSQOL
- P.563
- full reference metrics
- wideband
- single-ended
- no-reference metric
- speech quality