AI Pronunciation Trainer

See my fork of the AI Pronunciation Trainer repository for more details.

Models and variables

Right now this tool uses:

  • faster_whisper as the STT (speech-to-text) model; other supported models are:

  • 48000 Hz as the input sample rate (in empirical tests, this value gave the best results)

  • 16000 Hz as the resampled sample rate

  • 16000 Hz as the TTS (text-to-speech) sample rate
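
The relationship between the input and resampled rates listed above can be illustrated with a minimal sketch. It is not the tool's actual resampling code: it assumes an integer ratio (48000 / 16000 = 3) and uses plain group averaging as a crude stand-in for a proper resampling filter. The function name `downsample_by_averaging` and the synthetic 440 Hz tone are hypothetical, introduced only for this example.

```python
import math

INPUT_RATE = 48000   # input sample rate from the list above
TARGET_RATE = 16000  # resampled sample rate from the list above


def downsample_by_averaging(samples, input_rate=INPUT_RATE, target_rate=TARGET_RATE):
    """Lower the sample rate by an integer factor, averaging each group of samples.

    Hypothetical helper for illustration; a real pipeline would normally use
    a dedicated resampler with a proper anti-aliasing filter.
    """
    if input_rate % target_rate != 0:
        raise ValueError("this sketch only handles integer rate ratios")
    factor = input_rate // target_rate              # 3 for 48000 -> 16000
    usable = len(samples) - len(samples) % factor   # drop a trailing partial group
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]


# One second of a synthetic 440 Hz tone at the input rate.
tone = [math.sin(2 * math.pi * 440 * n / INPUT_RATE) for n in range(INPUT_RATE)]
resampled = downsample_by_averaging(tone)
print(len(tone), len(resampled))
```

One second of audio goes from 48000 samples to 16000, which is the length an STT model expecting 16000 Hz input would see for the same recording.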
