Deploying Speech-to-Speech on Hugging Face
Speech-to-Speech (S2S) is an exciting new project from Hugging Face that combines several advanced models to create a seamless, almost magical experience: you speak, and the system responds with a synthesized voice. The project implements a cascaded pipeline built on models available through the Transformers library on the Hugging Face Hub. The pipeline consists of four components, run in order: Voice Activity Detection (VAD), Speech to Text (STT), a Language Model (LM), and Text to Speech (TTS). What’s more, S2S has multi-language support! It currently […]
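To make the cascaded design concrete, here is a minimal sketch of how the four stages could be chained. It is an illustration only: the functions `vad`, `stt`, `lm`, `tts`, and `speech_to_speech` are hypothetical stand-ins that operate on toy byte strings, not the project's actual API, and no real models are loaded.

```python
from typing import Iterable, List, Optional

# NOTE: every function below is a hypothetical placeholder used to show the
# data flow of a cascaded pipeline. None of them come from the S2S project.


def vad(chunk: bytes) -> Optional[bytes]:
    """Voice Activity Detection: keep chunks with content, drop 'silence'.

    Here silence is modeled as a chunk made only of zero bytes.
    """
    return chunk if chunk.strip(b"\x00") else None


def stt(speech: bytes) -> str:
    """Speech to Text: a real system would run a transcription model."""
    return speech.decode("utf-8")


def lm(prompt: str) -> str:
    """Language Model: a real system would generate a reply."""
    return f"You said: {prompt}"


def tts(text: str) -> bytes:
    """Text to Speech: a real system would synthesize audio."""
    return text.encode("utf-8")


def speech_to_speech(chunks: Iterable[bytes]) -> List[bytes]:
    """Run each audio chunk through VAD -> STT -> LM -> TTS in order."""
    replies = []
    for chunk in chunks:
        speech = vad(chunk)
        if speech is None:
            continue  # nothing was spoken, so later stages are skipped
        replies.append(tts(lm(stt(speech))))
    return replies


if __name__ == "__main__":
    print(speech_to_speech([b"\x00\x00", b"hello"]))
```

Each stage only consumes the previous stage's output, which is what makes the pipeline "cascaded": in principle any one stage could be swapped for a different model without touching the others.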