Cross-Utterance Speech Generation

Coherent speech generation and editing with cross-utterance conditioned latent models.

This project studies non-autoregressive text-to-speech and speech editing methods that condition on surrounding utterances, so that generated or edited speech stays coherent with its context.