Llama 4 Scout
Meta
Efficient MoE multimodal model with industry-leading 10M token context window. 109B total / 17B active parameters across 16 experts. Native text + image understanding. Trained on 40T tokens in 12 languages.
Text Generation Local Latest Llama Family v4
Parameters
109B
params
Context Window
10.0M
tokens
Max Output
-
tokens
Input Price
-
per 1M tokens
Output Price
-
per 1M tokens
Llama Family 8 models
The full Llama line by generation — pricing and capabilities vary across the family.
4
4 Scout
Long context, 16 experts MoE
$0.5 / $1.5
in / out · 1M
4 Scout Current Latest
Long-context, multimodal, general-purpose
10.0M
context
4 Maverick Latest
High capability, 128 experts MoE
$1 / $3
in / out · 1M
Meta
3.1
Capabilities
👁️
Vision
⚡
Function Calling
📋
JSON Mode
🌊
Streaming
💬
System Prompt
🖥️
Code Execution
🔍
Web Search
🔌
MCP Support
Local Model Specs
Quantization
Q4_K_M
Architecture
MoE (16 experts, 17B active)
Runtime
Ollama / llama.cpp
Disk Size
67 GB
Details
- Release Date
- April 5, 2025
- Knowledge Cutoff
- -
- Source
- Local
- License
- Llama 4 Community
- Model ID
- llama4-scout-local
Last updated: March 13, 2026