Gemma 3 12B

Google

Multimodal Gemma model with text and image understanding. Part of the most-downloaded model family on Ollama (33M+ pulls). A sweet spot between output quality and hardware requirements.

Text Generation Local Gemma Family v3

Parameters

12B

params

Context Window

131K

tokens

Max Output

tokens

Input Price

per 1M tokens

Output Price

per 1M tokens

Gemma Family 7 models

The full Gemma line by generation — pricing and capabilities vary across the family.

Google

FunctionGemma

Native function calling for on-device agents. Routes complex tasks to larger models. Optimized for edge deployment.

context

Dec 2025

2 9B

General purpose, balanced

context

Jun 2024

2 2B

Edge devices, fast inference

context

Jun 2024

2 27B

High quality generation

context

Jun 2024

3 1B (llama.cpp)

Fast AI assistant for chat, code generation, and reasoning tasks

context

Mar 2025

3 12B Current

General-purpose, multimodal, coding

131K

context

Mar 2025

3 27B Latest

Complex reasoning, multimodal, research

131K

context

Mar 2025

Capabilities

👁️

Vision

⚡

Function Calling

📋

JSON Mode

🌊

Streaming

💬

System Prompt

🖥️

Code Execution

🔍

Web Search

🔌

MCP Support

Local Model Specs

Quantization

Q4_K_M

Architecture

Dense Transformer

Runtime

Ollama / llama.cpp

Disk Size

8.1 GB
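
As a rough illustration of the Ollama runtime listed above, here is a minimal Python sketch that sends a text prompt, optionally with one image, to a locally running Ollama server. It makes several assumptions that are not stated on this page: the server is at Ollama's default address `http://localhost:11434`, the model has been pulled under the Ollama tag `gemma3:12b` (this page lists the ID as `gemma3-12b`, which may differ from the tag your runtime uses), and the helper names `build_payload` and `generate` are hypothetical.

```python
import base64
import json
import urllib.request

# Assumption: Ollama's default local endpoint for one-shot generation.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Assumption: the Ollama tag for this model; the card's own ID is "gemma3-12b".
MODEL_TAG = "gemma3:12b"


def build_payload(prompt, image_bytes=None, model=MODEL_TAG):
    """Build a non-streaming generate request, optionally with one image."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if image_bytes is not None:
        # Images are passed to the generate endpoint as base64-encoded strings.
        payload["images"] = [base64.b64encode(image_bytes).decode("ascii")]
    return payload


def generate(prompt, image_bytes=None, url=OLLAMA_URL):
    """Send the request to a running Ollama server and return the reply text."""
    data = json.dumps(build_payload(prompt, image_bytes)).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With the server running and the model pulled, something like `generate("Describe this image.", open("photo.png", "rb").read())` should return the model's description as a string. Building the payload in its own function keeps the request shape easy to inspect without a live server.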

Details

Release Date: March 12, 2025
Knowledge Cutoff: -
Source: Local
License: Gemma Terms of Use
Model ID: gemma3-12b

Last updated: March 13, 2026