Dia 1.6B

Nari Labs

Multi-speaker dialogue TTS model that can render non-verbal sounds (laughs, sighs, coughs) and supports voice cloning via audio prompt conditioning. Well suited to scripted dialogue and podcast generation.
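As a rough illustration of what a multi-speaker script with non-verbal cues might look like, the sketch below assembles one input string. It assumes the `[S1]`/`[S2]` speaker tags and parenthesised cues such as `(laughs)` used in Nari Labs' published examples. The helper `build_dialogue_script` and the `Turn` alias are hypothetical names introduced here and are not part of any library.

```python
from typing import Iterable, Optional, Tuple

# (speaker number, spoken line, optional non-verbal cue such as "laughs")
# Hypothetical type alias, introduced only for this sketch.
Turn = Tuple[int, str, Optional[str]]


def build_dialogue_script(turns: Iterable[Turn]) -> str:
    """Join dialogue turns into a single tagged script string.

    Assumption: speakers are marked as [S1], [S2], ... and non-verbal
    sounds are written in parentheses after the spoken line.
    """
    parts = []
    for speaker, line, cue in turns:
        segment = f"[S{speaker}] {line.strip()}"
        if cue:
            segment += f" ({cue})"
        parts.append(segment)
    return " ".join(parts)


script = build_dialogue_script([
    (1, "Welcome back to the show.", None),
    (2, "Glad to be here.", "laughs"),
    (1, "Let's get started.", None),
])
print(script)
```

The resulting string would then be passed to the model's generation call. That call is omitted here because this page does not document it.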

Tags: Text-to-Speech, Local, Latest, Dia Family, v1.0


Type: Text-to-Speech
Source: Local
License: Apache 2.0

Capabilities: Streaming

Local Model Specs

Architecture: Transformer + Descript Audio Codec
Runtime: Python / torch
VRAM Usage: 10 GB
Disk Size: 3.2 GB
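A quick back-of-the-envelope check on the listed disk size: assuming "1.6B" in the model name is the parameter count and the weights are stored in 16-bit precision (neither is stated on this page), the expected weight size works out as below. The function `weight_size_gb` is a hypothetical helper for this sketch.

```python
def weight_size_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate size of raw model weights in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# Assumption: 1.6 billion parameters, inferred from the model name.
PARAMS = 1.6e9

fp16_gb = weight_size_gb(PARAMS, 2)  # 2 bytes per parameter (fp16 / bf16)
fp32_gb = weight_size_gb(PARAMS, 4)  # 4 bytes per parameter (fp32)

print(f"16-bit weights: {fp16_gb:.1f} GB")
print(f"32-bit weights: {fp32_gb:.1f} GB")
```

Under these assumptions, the 16-bit figure matches the 3.2 GB disk size listed above. The 10 GB VRAM figure is larger than the weights alone, which is plausible because inference also holds activations, caches, and the audio codec in memory. The exact breakdown is not given here.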

Details

Release Date: March 1, 2025
Knowledge Cutoff: -
Model ID: dia-1.6b

Last updated: March 13, 2026