DiariST: Streaming Speech Translation with Speaker Diarization

By Mu Yang and others at University of Texas at Dallas
End-to-end speech translation (ST) for conversation recordings involves several under-explored challenges such as speaker diarization (SD) without accurate word time stamps and handling of overlapping speech in a streaming fashion. In this work, we propose DiariST, the first streaming ST and SD solution. It is built upon a neural transducer-based...
January 22, 2024