Gemma 3 12B
Multimodal Gemma model with text + image understanding. Part of the most-downloaded family on Ollama (33M+ pulls). Sweet spot between quality and hardware requirements.
Text Generation Local Gemma Family v3
Parameters
12B
params
Context Window
131K
tokens
Max Output
-
tokens
Input Price
-
per 1M tokens
Output Price
-
per 1M tokens
Gemma Family 7 models
The full Gemma line by generation — pricing and capabilities vary across the family.
Google
FunctionGemma
Native function calling for on-device agents. Routes complex tasks to larger models. Optimized for edge deployment.
8K
context
2 9B
General purpose, balanced
8K
context
2 2B
Edge devices, fast inference
8K
context
2 27B
High quality generation
8K
context
3 1B (llama.cpp)
Fast AI assistant for chat, code generation, and reasoning tasks
4K
context
3
3 12B Current
General-purpose, multimodal, coding
131K
context
3 27B Latest
Complex reasoning, multimodal, research
131K
context
Capabilities
👁️
Vision
⚡
Function Calling
📋
JSON Mode
🌊
Streaming
💬
System Prompt
🖥️
Code Execution
🔍
Web Search
🔌
MCP Support
Local Model Specs
Quantization
Q4_K_M
Architecture
Dense Transformer
Runtime
Ollama / llama.cpp
Disk Size
8.1 GB
Details
- Release Date
- March 12, 2025
- Knowledge Cutoff
- -
- Source
- Local
- License
- Gemma Terms of Use
- Model ID
- gemma3-12b
Last updated: March 13, 2026