
Qwen2.5 Coder 32B

Sources: aider, arena, livebench · Confidence: A
params: 32B (dense)
bench sources: 3 (high conf)
context (max tokens):
license: Apache 2.0 (permissive)

Model meta

canonical: Qwen/Qwen2.5-Coder-32B-Instruct
parameters: 32B
organization: Alibaba
license: Apache 2.0
context:
downloads:

Benchmark breakdown

LiveBench
instruction following: 64.9
coding: 57.0
language: 24.6
overall: 48.8

Aider · whole
polyglot pass rate: 16.4 · $0.00 run
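
The LiveBench overall figure above is consistent with an unweighted mean of the three category scores shown on this page. Whether the score is actually computed that way is an assumption; the short check below only confirms that the listed numbers agree with each other.

```python
from statistics import mean

# Category scores as listed in the LiveBench breakdown on this page.
category_scores = {
    "instruction following": 64.9,
    "coding": 57.0,
    "language": 24.6,
}
listed_overall = 48.8

# Assumption: overall = unweighted mean of the listed categories.
computed_overall = round(mean(category_scores.values()), 1)

print(f"computed {computed_overall} vs listed {listed_overall}")
```

If more categories are added to the breakdown, the same check shows quickly whether the overall figure still follows from them.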

Hardware fit matrix

by gpu tier                          Q2_K           Q4_K_M         Q5_K_M         Q6_K           Q8_0           FP16
                                     12.0 GB        16.0 GB        20.0 GB        24.0 GB        32.0 GB        64.0 GB
8 GB tier (rtx 3050 · 4060)          -              -              -              -              -              -
12 GB tier (rtx 3060 · 4070)         -              -              -              -              -              -
16 GB tier (rtx 4080 · 4060 ti 16g)  GREAT 46 t/s   -              -              -              -              -
24 GB tier (rtx 3090 · 4090)         GREAT 74 t/s   GREAT 56 t/s   OK 45 t/s      -              -              -
32 GB tier (rtx 5090 · m3 max)       GREAT 137 t/s  GREAT 104 t/s  GREAT 84 t/s   GREAT 70 t/s   -              -
48 GB tier (a6000 · m3 max 64)       GREAT 44 t/s   GREAT 34 t/s   GREAT 27 t/s   GREAT 23 t/s   GREAT 17 t/s   -
80 GB tier (h100 · m3 ultra 128)     GREAT 150 t/s  GREAT 114 t/s  GREAT 92 t/s   GREAT 77 t/s   GREAT 58 t/s   GREAT 29 t/s

Legend: full fit · production speed / tight fit · usable / doesn't fit
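
The sizes in the matrix match a simple estimate: parameters × nominal bits per weight ÷ 8. A quantization gets a speed entry in a tier only when that size is strictly below the tier's memory. The sketch below reproduces that pattern. The bits-per-weight values and the strict "less than" rule are assumptions inferred from the numbers on this page, not a documented method. Real quantized files and runtime memory use (for example the KV cache) generally differ from these nominal figures.

```python
# Assumed nominal bits per weight, chosen so that a 32B model reproduces
# the sizes listed in the matrix header (12, 16, 20, 24, 32, 64 GB).
PARAMS_BILLIONS = 32
NOMINAL_BITS = {
    "Q2_K": 3,
    "Q4_K_M": 4,
    "Q5_K_M": 5,
    "Q6_K": 6,
    "Q8_0": 8,
    "FP16": 16,
}
TIERS_GB = [8, 12, 16, 24, 32, 48, 80]


def size_gb(bits_per_weight: int) -> float:
    """Estimated weight size in GB: billions of params * bits / 8."""
    return PARAMS_BILLIONS * bits_per_weight / 8


def quants_that_fit(vram_gb: float) -> list[str]:
    """Quantizations whose estimated size is strictly below the tier's memory."""
    return [name for name, bits in NOMINAL_BITS.items() if size_gb(bits) < vram_gb]


for tier in TIERS_GB:
    fits = quants_that_fit(tier)
    print(f"{tier:>2} GB tier: {', '.join(fits) if fits else 'nothing fits'}")
```

Under these assumptions the 24 GB tier admits Q2_K, Q4_K_M, and Q5_K_M, which matches the three entries shown for that tier. The rule does not model throughput, so the t/s figures are not reproduced.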