Qwen 3.5 35B-A3B

Alibaba

Sparse MoE Qwen 3.5 with 35B total / 3B active parameters. 256 experts with 8+1 active per token. Efficient inference with quality approaching much larger models.

Text Generation Local Qwen Family v3.5

Documentation Back to Models

Parameters

35B

params

Context Window

262K

tokens

Max Output

66K

tokens

Input Price

-

per 1M tokens

Output Price

-

per 1M tokens

Qwen Family 14 models

The full Qwen line by generation — pricing and capabilities vary across the family.

3.6

3.6 35B-A3B Latest

Local inference, agentic workflows, coding

3.5

3.5 35B-A3B Current

Efficient inference, agentic tasks, edge deployment

262K

context

Feb 2026

Reasoning, coding, analysis, multimodal

AI agents, multilingual tasks, reasoning, multimodal

General-purpose, coding, reasoning, multilingual

3

Flagship Qwen 3, agent tasks

Advanced reasoning, agents

Hybrid reasoning, MoE architecture

2.5

Flagship Qwen 2.5, beats DeepSeek V3

Alibaba

QwQ 32B (Reasoning)

Deep reasoning, chain-of-thought

Multilingual, coding, math

Complex tasks, analysis

Professional applications

Code generation, debugging

Capabilities

👁️

Vision

⚡

Function Calling

📋

JSON Mode

🌊

Streaming

💬

System Prompt

🖥️

Code Execution

🔍

Web Search

🔌

MCP Support

Local Model Specs

Quantization

Q4_K_M

Architecture

Sparse MoE (256 experts, 8+1 active, 3B active)

Runtime

Ollama / llama.cpp

Disk Size

24 GB

Details

Release Date: February 24, 2026
Knowledge Cutoff: -
Source: Local
License: Apache 2.0
Model ID: qwen3.5-35b-a3b

Last updated: March 13, 2026