PyTorch implementations of neural network models for keyword spotting
Honk: CNNs for Keyword Spotting
Honk is a PyTorch reimplementation of Google’s TensorFlow convolutional neural networks for keyword spotting, which accompany the recent release of their Speech Commands Dataset. For more details, please consult our writeups:
- Raphael Tang, Jimmy Lin. Honk: A PyTorch Reimplementation of Convolutional Neural Networks for Keyword Spotting. arXiv:1710.06554, October 2017.
- Raphael Tang, Jimmy Lin. Deep Residual Learning for Small-Footprint Keyword Spotting. Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 5479-5483.
Honk is useful for building on-device speech recognition capabilities for interactive intelligent agents. Our code can be used to identify simple commands (e.g., “stop” and “go”) and can be adapted to detect custom “command triggers” (e.g., “Hey Siri!”).
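To make "identifying a command" concrete, here is a minimal sketch of the decision step that could follow a keyword spotting model. It assumes the model outputs one raw score per class for a short audio window. The label set, the `detect_command` helper, and the 0.8 threshold are hypothetical choices for illustration and are not taken from Honk's code.

```python
import math

# Hypothetical label set: two catch-all classes plus two commands.
LABELS = ["silence", "unknown", "stop", "go"]


def softmax(scores):
    """Convert raw model scores (logits) into probabilities."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def detect_command(scores, labels=LABELS, threshold=0.8):
    """Return the detected command, or None if nothing is confident enough.

    Assumption: the catch-all classes are named "silence" and "unknown".
    """
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    label = labels[best]
    if label in ("silence", "unknown") or probs[best] < threshold:
        return None
    return label


# One confident window and one ambiguous window.
print(detect_command([0.1, 0.2, 5.0, 0.3]))
print(detect_command([1.0, 1.1, 1.2, 0.9]))
```

Under this sketch, adapting to a custom command trigger would mean changing the label set and retraining the model. The decision step itself would stay the same.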
Check out this video for a demo of Honk in action!