Vietnamese Streaming RNN-T
Visit https://github.com/HKAB/vietnamese-rnnt-tutorial/ for more information.
- This demo runs on a very slow CPU (free tier), so the real-time factor (RTF) of the FP32 model is around 1.5, meaning processing takes about 1.5 times the duration of the audio.
- This model might not work well with your microphone, since it was trained on a fairly clean dataset. Try speaking loudly 😃
- Even if you upload a full audio file, the model processes it in a streaming fashion.
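
The notes above on RTF and streaming can be illustrated with a minimal sketch. It is not the demo's actual code: `iter_chunks`, `real_time_factor`, `stream_file`, the chunk length, and the `process_chunk` callback are all hypothetical names chosen for illustration, and the real model call is replaced by a placeholder.

```python
import time
from typing import Callable, Iterator, List, Tuple


def iter_chunks(samples: List[float], chunk_size: int) -> Iterator[List[float]]:
    """Yield consecutive fixed-size chunks; the last chunk may be shorter."""
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = processing time divided by audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def stream_file(
    samples: List[float],
    sample_rate: int,
    chunk_seconds: float,
    process_chunk: Callable[[List[float]], object],
) -> Tuple[list, float]:
    """Feed a fully loaded file to a chunk-level callback and report the RTF.

    `process_chunk` stands in for the model's streaming step (an assumption;
    the real demo may keep decoder state between chunks).
    """
    chunk_size = max(1, int(sample_rate * chunk_seconds))
    start = time.perf_counter()
    outputs = [process_chunk(chunk) for chunk in iter_chunks(samples, chunk_size)]
    elapsed = time.perf_counter() - start
    return outputs, real_time_factor(elapsed, len(samples) / sample_rate)
```

For example, `real_time_factor(15.0, 10.0)` gives 1.5: a 10-second clip that takes 15 seconds to process. An RTF above 1 means the model falls behind live audio.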
Cherry-picked examples
| Upload from disk | Model type |
|---|---|