A fast and feature-rich CTC beam search decoder for speech recognition in Python

pyctcdecode

pyctcdecode is a fast and feature-rich CTC beam search decoder for speech recognition, written in Python. It provides n-gram (KenLM) language model support similar to PaddlePaddle’s decoder, and adds many new features, such as byte pair encoding (BPE) vocabulary handling and real-time decoding, to support models like NVIDIA’s Conformer-CTC or Facebook’s Wav2Vec2.

pip install pyctcdecode

Main Features:

  • 🔥 hotword boosting
  • 🤖 handling of BPE vocabulary
  • 👥 multi-LM support for 2+ models
  • 🕒 stateful LM for real-time decoding
  • ✨ native frame index annotation of words
  • 💨 fast runtime, comparable to a C++ implementation
  • 🐍 easy-to-modify Python code
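
To make the core idea behind these features concrete, here is a minimal sketch of CTC prefix beam search in plain Python. It is not pyctcdecode's implementation: it leaves out the language model, hotword boosting, and BPE handling, and the function name `ctc_prefix_beam_search`, the `blank_id` parameter, and the toy labels are assumptions made for illustration only.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) that tolerates -inf inputs."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_prefix_beam_search(log_probs, labels, blank_id=0, beam_width=8):
    """Illustrative CTC prefix beam search without a language model.

    log_probs: list of frames, each a list of log-probabilities per label.
    Returns (decoded text, log-probability of that text).
    """
    # prefix -> (log p of paths ending in blank, log p of paths ending in non-blank)
    beams = {(): (0.0, NEG_INF)}
    for frame in log_probs:
        next_beams = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beams.items():
            for idx, lp in enumerate(frame):
                if idx == blank_id:
                    # A blank keeps the prefix unchanged.
                    n_b, n_nb = next_beams[prefix]
                    next_beams[prefix] = (logsumexp(n_b, p_b + lp, p_nb + lp), n_nb)
                    continue
                new_prefix = prefix + (idx,)
                n_b, n_nb = next_beams[new_prefix]
                if prefix and prefix[-1] == idx:
                    # A repeated label extends the prefix only after a blank...
                    next_beams[new_prefix] = (n_b, logsumexp(n_nb, p_b + lp))
                    # ...otherwise it collapses into the existing prefix.
                    s_b, s_nb = next_beams[prefix]
                    next_beams[prefix] = (s_b, logsumexp(s_nb, p_nb + lp))
                else:
                    next_beams[new_prefix] = (n_b, logsumexp(n_nb, p_b + lp, p_nb + lp))
        # Keep only the most probable prefixes.
        ranked = sorted(next_beams.items(), key=lambda kv: logsumexp(*kv[1]), reverse=True)
        beams = dict(ranked[:beam_width])
    best_prefix, best_scores = max(beams.items(), key=lambda kv: logsumexp(*kv[1]))
    return "".join(labels[i] for i in best_prefix), logsumexp(*best_scores)


# Toy input: two frames over the labels [blank, "a"].
toy_labels = ["", "a"]
toy_frames = [[math.log(0.6), math.log(0.4)]] * 2
text, score = ctc_prefix_beam_search(toy_frames, toy_labels)
print(repr(text), math.exp(score))
```

In this toy input the single most likely path is blank-blank (probability 0.36), so a greedy decoder would return an empty string. The beam search instead sums all paths that collapse to "a" (0.24 + 0.24 + 0.16 = 0.64) and returns "a", which is why beam search can beat best-path decoding even before a language model is added.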

Quick Start:

import kenlm
from pyctcdecode import build_ctcdecoder

# load trained kenlm model
kenlm_model = kenlm.Model("/my/dir/kenlm_model.binary")

# specify alphabet labels as they appear in logits
labels = [
" ", "a", "b",

 

 

 

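
The comment about listing labels "as they appear in logits" matters because a decoder maps each column of the logits matrix to the label at the same position. The sketch below illustrates that mapping with greedy (best-path) CTC decoding in plain Python. It does not use pyctcdecode, and the `greedy_ctc_decode` helper, the `"_"` blank symbol, and its position at the end of the list are hypothetical choices for this example.

```python
from itertools import groupby


def greedy_ctc_decode(logits, labels, blank_id):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks.

    logits: list of frames, each a list with one score per label.
    labels: label strings in the same order as the columns of `logits`.
    """
    for frame in logits:
        if len(frame) != len(labels):
            raise ValueError("each frame needs exactly one score per label")
    # Index of the highest-scoring label in every frame.
    best_ids = [max(range(len(frame)), key=frame.__getitem__) for frame in logits]
    # Merge consecutive repeats first, then remove the blank symbol.
    merged = [idx for idx, _ in groupby(best_ids)]
    return "".join(labels[idx] for idx in merged if idx != blank_id)


# Hypothetical 4-symbol alphabet; "_" stands in for the CTC blank.
demo_labels = [" ", "a", "b", "_"]


def one_hot(idx, size=4):
    """Build a toy frame whose highest score sits at column `idx`."""
    return [1.0 if i == idx else 0.0 for i in range(size)]


# Frame-wise winners: b b _ a a _ b
demo_logits = [one_hot(i) for i in (2, 2, 3, 1, 1, 3, 2)]
print(greedy_ctc_decode(demo_logits, demo_labels, blank_id=3))
```

If the label list were given in a different order than the model's output columns, the same logits would decode to different characters, which is why the order has to match exactly.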