Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context

ContextNet. ContextNet has a CNN-RNN-transducer architecture and features a fully convolutional encoder that incorporates global context information into its convolution layers by adding squeeze-and-excitation modules. ContextNet comes in three model sizes: small, medium, and large. A global parameter, alpha, controls model scaling by changing the number of channels in the convolution filters. This repository contains only the model code, but you can train ContextNet with openspeech. Model Architecture. Configuration of the ContextNet encoder. If you choose the […]
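The two mechanisms named in this excerpt, width scaling by alpha and squeeze-and-excitation, can be illustrated with a minimal pure-Python sketch. It is not the repository's code: the function names `scale_channels` and `squeeze_excite`, the rounding rule, and the ReLU bottleneck are assumptions made for illustration, and the weights are passed in by hand rather than learned.

```python
import math


def scale_channels(base_channels, alpha):
    """Scale a layer's channel count by the global multiplier alpha.

    Rounding to the nearest integer (minimum 1) is an assumption of this sketch.
    """
    return max(1, round(base_channels * alpha))


def squeeze_excite(x, w1, b1, w2, b2):
    """Apply a generic squeeze-and-excitation gate to a 1D feature map.

    x  : list of C channels, each a list of T floats
    w1 : bottleneck weights, shape (C_r, C); b1 : bias, length C_r
    w2 : expansion weights, shape (C, C_r);  b2 : bias, length C
    """
    # Squeeze: average each channel over the whole time axis (global context).
    context = [sum(channel) / len(channel) for channel in x]

    # Excitation, step 1: bottleneck projection followed by ReLU.
    hidden = [
        max(0.0, sum(w * c for w, c in zip(row, context)) + b)
        for row, b in zip(w1, b1)
    ]

    # Excitation, step 2: project back to C channels, squash to (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-(sum(w * h for w, h in zip(row, hidden)) + b)))
        for row, b in zip(w2, b2)
    ]

    # Scale: multiply every time step of a channel by that channel's gate.
    return [[g * v for v in channel] for g, channel in zip(gates, x)]


if __name__ == "__main__":
    # Hypothetical base width of 256 channels, halved and doubled by alpha.
    print(scale_channels(256, 0.5), scale_channels(256, 2))

    # Two channels, two time steps; all-zero weights give a gate of 0.5.
    features = [[1.0, 3.0], [2.0, 4.0]]
    print(squeeze_excite(features, [[0.0, 0.0]], [0.0], [[0.0], [0.0]], [0.0, 0.0]))
```

Because the gate is computed from an average over the full sequence, each convolution output is rescaled using information from the whole utterance rather than only its local receptive field.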

The neural network model for automatic speech recognition with PyTorch

End-to-End Automatic Speech Recognition. This repository contains an end-to-end automatic speech recognition project. I developed the neural network model with PyTorch and used MLflow to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry. The neural acoustic model is built with reference to the DeepSpeech2 model, but it is not the exact DeepSpeech2 or DeepSpeech model described in their respective research papers. Technologies […]