AI Models Database
173 models across text, image, audio & embedding
2 models
Dia 1.6B
Latest, TTS, Local
Multi-speaker dialogue TTS with non-verbal sounds (laughs, sighs, coughs). Voice cloning via audio prompt conditioning. Well suited to scripted dialogue and podcast generation.
Source: Local
Released: Mar 2025
Kokoro 82M
Latest, TTS, Local
Ultra-lightweight TTS model. Under $1 per million characters. 54 pre-built voices across 8 languages. Apache 2.0 licensed, permitting commercial deployment. 8.9M+ monthly downloads on Hugging Face.
Source: Local
Released: Jan 2025