A fast and feature-rich CTC beam search decoder for speech recognition in Python
pyctcdecode
A fast and feature-rich CTC beam search decoder for speech recognition, written in Python. It provides n-gram (KenLM) language model support similar to PaddlePaddle’s decoder, and adds many new features, such as byte pair encoding (BPE) vocabulary handling and real-time decoding, to support models like NVIDIA’s Conformer-CTC or Facebook’s Wav2Vec2.
pip install pyctcdecode
Main Features:
- 🔥 hotword boosting
- 🤖 handling of BPE vocabulary
- 👥 multi-LM support for 2+ models
- 🕒 stateful LM for real-time decoding
- ✨ native frame index annotation of words
- 💨 fast runtime, comparable to a C++ implementation
- 🐍 easy-to-modify Python code
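For readers new to CTC decoding, the following is a minimal pure-Python sketch of CTC prefix beam search without a language model. It is not pyctcdecode's implementation, and it covers none of the features listed above. The function name `ctc_prefix_beam_search`, the toy alphabet, and the choice of index 0 for the blank token are all assumptions made for illustration.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) that tolerates -inf inputs."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_prefix_beam_search(log_probs, labels, beam_width=8, blank=0):
    """Hypothetical helper: return (best_text, log_prob) for per-frame log-probs.

    Each beam maps a prefix (tuple of label indices) to a pair of scores:
    (log-prob of paths ending in blank, log-prob of paths ending in non-blank).
    """
    beams = {(): (0.0, NEG_INF)}
    for frame in log_probs:
        next_beams = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beams.items():
            for idx, lp in enumerate(frame):
                if idx == blank:
                    # A blank keeps the prefix unchanged.
                    b, nb = next_beams[prefix]
                    next_beams[prefix] = (logsumexp(b, p_b + lp, p_nb + lp), nb)
                    continue
                new_prefix = prefix + (idx,)
                b, nb = next_beams[new_prefix]
                if prefix and prefix[-1] == idx:
                    # A repeated label extends the prefix only after a blank...
                    next_beams[new_prefix] = (b, logsumexp(nb, p_b + lp))
                    # ...otherwise it collapses into the same prefix.
                    s_b, s_nb = next_beams[prefix]
                    next_beams[prefix] = (s_b, logsumexp(s_nb, p_nb + lp))
                else:
                    next_beams[new_prefix] = (b, logsumexp(nb, p_b + lp, p_nb + lp))
        # Prune to the highest-scoring prefixes.
        ranked = sorted(
            next_beams.items(), key=lambda kv: logsumexp(*kv[1]), reverse=True
        )
        beams = dict(ranked[:beam_width])
    best_prefix, scores = max(beams.items(), key=lambda kv: logsumexp(*kv[1]))
    return "".join(labels[i] for i in best_prefix), logsumexp(*scores)


# Toy alphabet (assumption): blank at index 0, then "a" and "b".
toy_labels = ["", "a", "b"]
toy_probs = [[0.6, 0.39, 0.01], [0.6, 0.39, 0.01]]
toy_log_probs = [[math.log(p) for p in frame] for frame in toy_probs]
text, score = ctc_prefix_beam_search(toy_log_probs, toy_labels)
print(text, round(math.exp(score), 4))
```

In this toy input the blank is the most likely token in both frames, so picking the best token per frame would yield an empty string. Beam search instead sums the probability of every alignment that collapses to the same text, which is why it returns "a".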
Quick Start:
import kenlm
from pyctcdecode import build_ctcdecoder
# load trained kenlm model
kenlm_model = kenlm.Model("/my/dir/kenlm_model.binary")
# specify alphabet labels as they appear in logits
labels = [
" ", "a", "b",