Llama 4 Scout

Meta

An efficient mixture-of-experts (MoE) multimodal model with an industry-leading 10M-token context window. 109B total parameters, 17B active per token, across 16 experts. Native text and image understanding. Pretrained on ~40T tokens; supports 12 languages.
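The "17B active of 109B total" figure comes from MoE routing: for each token, a small router picks one of the 16 experts, so only that expert's weights (plus the shared layers) run. The sketch below is a toy top-1 router in plain Python; the real Llama 4 routing (including its shared expert and load-balancing losses) is more involved, and the logits here are made-up inputs.

```python
import math

NUM_EXPERTS = 16  # matches Llama 4 Scout's expert count

def route(router_logits):
    """Toy top-1 MoE router: softmax over per-expert logits,
    then pick the highest-probability expert. Illustrates why
    only a fraction of total parameters is active per token."""
    m = max(router_logits)
    exps = [math.exp(x - m) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best, probs[best]

# Hypothetical router logits for one token: expert 15 scores highest.
expert, weight = route([0.1 * i for i in range(NUM_EXPERTS)])
```

Per token, only the chosen expert's feed-forward weights execute, which is how a 109B-parameter model can run with roughly 17B parameters' worth of compute per token.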

Tags: Text Generation, Local, Latest, Llama Family, v4
Parameters: 109B
Context Window: 10.0M tokens
Max Output: - tokens
Input Price: - per 1M tokens
Output Price: - per 1M tokens

Capabilities

👁️ Vision
Function Calling
📋 JSON Mode
🌊 Streaming
💬 System Prompt
🖥️ Code Execution
🔍 Web Search
🔌 MCP Support

Local Model Specs

Quantization: Q4_K_M
Architecture: MoE (16 experts, 17B active)
Runtime: Ollama / llama.cpp
Disk Size: 67 GB
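The 67 GB disk figure is consistent with Q4_K_M quantization applied to all 109B parameters. The sketch below does the arithmetic; the ~4.85 bits-per-weight value for Q4_K_M is an approximation (llama.cpp mixes 4- and 6-bit blocks plus scales), not an exact constant.

```python
# Rough GGUF disk-size estimate for a quantized model.
TOTAL_PARAMS = 109e9          # Llama 4 Scout total parameter count
BITS_PER_WEIGHT_Q4_K_M = 4.85  # approximate average for Q4_K_M

def gguf_size_gb(params, bits_per_weight):
    """Approximate model file size in GB (10^9 bytes):
    params * bits / 8 bits-per-byte / 1e9 bytes-per-GB."""
    return params * bits_per_weight / 8 / 1e9

size = gguf_size_gb(TOTAL_PARAMS, BITS_PER_WEIGHT_Q4_K_M)
# ~66 GB, in line with the 67 GB listed above (metadata and
# tokenizer tables add a little on top)
```

Note that because routing happens per token across the whole model, all 109B parameters must fit on disk (and ideally in RAM/VRAM), even though only 17B are active at a time.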

Details

Release Date: April 5, 2025
Knowledge Cutoff: -
Source: Local
License: Llama 4 Community License
Model ID: llama4-scout-local
Last updated: March 13, 2026