Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context
ContextNet ContextNet has CNN-RNN-transducer architecture and features a fully convolutional encoder that incorporates global context information into convolution layers by adding squeeze-and-excitation modules.Also, ContextNet supports three size models: small, medium, and large. ContextNet uses the global parameter alpha to control the scaling of the model by changing the number of channels in the convolution filter. This repository contains only model code, but you can train with ContextNet at openspeech. Model Architecuture Configuration of the ContextNet encoder If you choose the […]
Read more