Attention-Based Grapheme-to-Phoneme with Python
G2P
The G2P (grapheme-to-phoneme) algorithm generates the most probable pronunciation for a word that is not contained in the lexicon dictionary. It can be used as a preprocessing step in a text-to-speech system to generate pronunciations for out-of-vocabulary (OOV) words.
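As a rough illustration of where G2P sits in a text-to-speech front end, the sketch below looks a word up in a lexicon first and falls back to a G2P function only for OOV words. The names `pronounce` and `toy_g2p`, the toy lexicon, and the phoneme symbols are all hypothetical and are not part of this repository; `toy_g2p` is a stand-in for a trained model.

```python
from typing import Callable, Dict, List


def pronounce(
    word: str,
    lexicon: Dict[str, List[str]],
    g2p: Callable[[str], List[str]],
) -> List[str]:
    """Return the lexicon entry if present, otherwise ask the G2P model."""
    key = word.lower()
    if key in lexicon:
        return lexicon[key]
    return g2p(key)


# Toy lexicon; the phoneme symbols are for illustration only.
lexicon = {"hello": ["HH", "AH", "L", "OW"]}


def toy_g2p(word: str) -> List[str]:
    # Placeholder for a trained model: emits one symbol per letter.
    return [ch.upper() for ch in word]


in_vocab = pronounce("Hello", lexicon, toy_g2p)  # found in the lexicon
oov = pronounce("zorp", lexicon, toy_g2p)        # falls back to G2P
```

In a real pipeline, `toy_g2p` would be replaced by a call into the trained attention model.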
Dependencies
The following libraries are used:
- pytorch
- tqdm
- matplotlib
Install dependencies using pip:
pip3 install -r requirements.txt
Dataset
Currently the following languages are supported:
- EN: English
- FA: Farsi
- RU: Russian
You can easily provide and use your own language-specific pronunciation dictionary for training G2P. More details about data preparation and contribution can be found in resources.
Feel free to provide resources for other languages.
Attention Model
Both a plain encoder-decoder seq2seq model and an attention-based model can handle the G2P problem. Here we train an attention-based model.
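To make the difference concrete: a plain encoder-decoder compresses the whole grapheme sequence into one fixed vector, while an attention model lets the decoder weight every encoder state at each output step. The following is a minimal pure-Python sketch of one attention step, assuming dot-product scoring. The scoring function, the function names, and the toy vectors are assumptions for illustration; the model in this repository is implemented in PyTorch and may score differently.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(
    decoder_state: List[float],
    encoder_states: List[List[float]],
) -> Tuple[List[float], List[float]]:
    """One attention step: returns (weights, context vector)."""
    # Score each encoder state against the current decoder state (dot product).
    scores = [
        sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states
    ]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the encoder states.
    dim = len(encoder_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Three toy encoder states (one per input grapheme) and one decoder state.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attend(decoder_state, encoder_states)
```

The weights sum to 1, and encoder states that align better with the decoder state receive larger weights; the decoder then uses the context vector when predicting the next phoneme.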
The