Abstract
This paper presents work on a real-time temporal clipping monitoring tool for VoIP. Temporal clipping can occur as a result of voice activity detection (VAD) or echo cancellation, where comfort noise is used in place of clipped speech segments. The algorithm presented will form part of a no-reference objective model for quantifying perceived speech quality in VoIP. The overall approach uses a modular design that will help pinpoint the cause of degradations as well as quantify their impact on speech quality. The new algorithm was tested on VAD-induced clipping over a range of VAD thresholds and speech frame sizes, and its output was compared with objective Mean Opinion Scores (MOS-LQO) from POLQA. The results show that the proposed algorithm can efficiently predict temporal clipping in speech and that its predictions correlate well with the full-reference quality predictions from POLQA. The model shows good potential for use in a real-time monitoring tool.
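To make the test condition in the abstract concrete, the sketch below shows how a VAD threshold and frame size can produce temporal clipping. It is not the algorithm from the paper: it assumes a simple frame-energy VAD, a synthetic tone with a soft onset standing in for speech, and an 8 kHz sample rate. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import math

SAMPLE_RATE = 8000  # Hz; assumed narrowband telephony rate


def make_test_signal(start=2000, end=6000, n_samples=8000):
    """Synthetic stand-in for speech: a 200 Hz tone with a 200 ms soft
    onset, on top of a faint background tone."""
    ramp = 1600  # onset length in samples (200 ms at 8 kHz)
    signal = []
    for n in range(n_samples):
        background = 0.001 * math.sin(2 * math.pi * 50 * n / SAMPLE_RATE)
        speech = 0.0
        if start <= n < end:
            gain = 0.5 * min(1.0, (n - start) / ramp)
            speech = gain * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE)
        signal.append(background + speech)
    return signal


def frame_energy_db(frame):
    """Mean power of one frame in dB relative to full scale (1.0)."""
    power = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(power + 1e-12)  # small floor avoids log(0)


def energy_vad(samples, frame_size, threshold_db):
    """One flag per full frame: True if the frame is kept as speech."""
    n_frames = len(samples) // frame_size
    return [
        frame_energy_db(samples[i * frame_size:(i + 1) * frame_size]) >= threshold_db
        for i in range(n_frames)
    ]


def true_speech_frames(n_samples, frame_size, start, end):
    """Ground truth: a frame counts as speech if it overlaps [start, end)."""
    n_frames = n_samples // frame_size
    return [
        i * frame_size < end and (i + 1) * frame_size > start
        for i in range(n_frames)
    ]


def clipped_fraction(vad_flags, truth_flags):
    """Fraction of true speech frames that the VAD discarded."""
    speech = [kept for kept, is_speech in zip(vad_flags, truth_flags) if is_speech]
    if not speech:
        return 0.0
    return sum(1 for kept in speech if not kept) / len(speech)


signal = make_test_signal()
for frame_size in (80, 160, 240):          # 10, 20 and 30 ms frames
    truth = true_speech_frames(len(signal), frame_size, 2000, 6000)
    for threshold_db in (-60.0, -40.0, -30.0, -20.0):
        flags = energy_vad(signal, frame_size, threshold_db)
        print(frame_size, threshold_db, round(clipped_fraction(flags, truth), 3))
```

In this toy setup, raising the threshold removes more of the soft onset, which is the front-end clipping a listener would perceive. A monitoring tool of the kind described would have to estimate such clipping without access to the ground-truth labels used here.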
Original language | English |
---|---|
Publication status | Published - 2013 |
Event | Interspeech 2013, Lyon, France (25 Aug 2013 → 29 Aug 2013) |
Conference
Conference | Interspeech 2013 |
---|---|
Country/Territory | France |
City | Lyon |
Period | 25/08/13 → 29/08/13 |
Keywords
- real-time temporal clipping monitoring tool
- VoIP
- voice activity detection
- echo cancellation
- comfort noise
- no-reference objective model
- perceived speech quality
- modular design
- degradations
- speech frame sizes
- Mean Opinion Scores
- POLQA
- real-time monitoring tool