Registration for ICASSP is free of charge, but registration is required to view the videos. If you have not yet registered, please visit: https://cmsworkshops.com/ICASSP2020/Registration.asp.Access the full virtual conference by visiting: https://2020.ieeeicassp-virtual.org/attendee/login. Your username is your email address and your password is your confirmation number/registration ID.

You need an account to view media

Sign in to view media

Don't have an account? Please contact us to request an account.

Speech Processing
SPE-P12.4
Poster
Machine Learning for Speech Synthesis II

IMPROVING LPCNET-BASED TEXT-TO-SPEECH WITH LINEAR PREDICTION-STRUCTURED MIXTURE DENSITY NETWORK

Min-Jae Hwang

Date & Time

Thu, May 7, 2020

12:30 pm – 2:30 pm

Location

On-Demand

Abstract

In this paper, we propose an improved LPCNet vocoder using a linear prediction (LP)-structured mixture density network (MDN). The recently proposed LPCNet vocoder has successfully achieved high-quality and lightweight speech synthesis systems by combining a vocal tract LP filter with a WaveRNN-based vocal source (i.e., excitation) generator. However, the quality of synthesized speech is often unstable because the vocal source component is insufficiently represented by the mu-law quantization method, and the model is trained without considering the entire speech production mechanism. To address this problem, we first introduce LP-MDN, which enables the autoregressive neural vocoder to structurally represent the interactions between the vocal tract and vocal source components. Then, we propose to incorporate the LP-MDN to the LPCNet vocoder by replacing the conventional discretized output with continuous density distribution. The experimental results verify that the proposed system provides high quality synthetic speech by achieving a mean opinion score of 4.41 within a text-to-speech framework.


Presenter

Min-Jae Hwang

Yonsei university
Sign in to join the conversationDon't have an account? Please contact us to request an account.
Sign in to view documentsDon't have an account? Please contact us to request an account.

Session Chairs

Tomoki Toda

Nagoya University

Zhiyong Wu

Tsinghua University