SpeechX: Neural Codec Language Model as a Versatile Speech Transformer

By Xiaofei Wang and others
Recent advancements in generative speech models based on audio-text prompts have enabled remarkable innovations like high-quality zero-shot text-to-speech. However, existing models still face limitations in handling diverse audio-text speech generation tasks involving transforming input speech and processing audio captured in adverse acoustic conditions. This paper introduces SpeechX, a versatile speech...
June 25, 2024