Qwen3 30B A3B 2507

Sources: arena · Confidence: C

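params: 30B total, mixture-of-experts (the "A3B" in the name denotes roughly 3B parameters active per token; the page's "dense" label is incorrect)

bench sources: single

context (max tokens): not listed

license: Apache 2.0 (permissive)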

Hardware fit matrix

GPU tier	Q2_K (11.3 GB)	Q4_K_M (15.0 GB)	Q5_K_M (18.8 GB)	Q6_K (22.5 GB)	Q8_0 (30.0 GB)	FP16 (60.0 GB)
8 GB (RTX 3050 · 4060)	-	-	-	-	-	-
12 GB (RTX 3060 · 4070)	-	-	-	-	-	-
16 GB (RTX 4080 · 4060 Ti 16G)	GREAT 49 t/s	-	-	-	-	-
24 GB (RTX 3090 · 4090)	GREAT 79 t/s	GREAT 60 t/s	GREAT 48 t/s	-	-	-
32 GB (RTX 5090 · M3 Max)	GREAT 146 t/s	GREAT 111 t/s	GREAT 89 t/s	GREAT 75 t/s	-	-
48 GB (A6000 · M3 Max 64)	GREAT 47 t/s	GREAT 36 t/s	GREAT 29 t/s	GREAT 24 t/s	GREAT 18 t/s	-
80 GB (H100 · M3 Ultra 128)	GREAT 160 t/s	GREAT 121 t/s	GREAT 98 t/s	GREAT 82 t/s	GREAT 62 t/s	GREAT 31 t/s

Legend: full fit · production speed | tight fit · usable | doesn't fit
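
The per-quantization sizes in the matrix header appear to follow a simple rule: parameter count times a nominal bit width, divided by 8. The sketch below reproduces that arithmetic. The bit widths in `NOMINAL_BITS` are assumptions inferred by fitting the listed sizes (for example, 3 bits for Q2_K); they are not taken from the page and are not measured GGUF file sizes. The names `PARAMS_BILLIONS`, `NOMINAL_BITS`, and `weight_size_gb` are hypothetical and exist only for illustration.

```python
# Sketch: estimate weight footprint per quantization from parameter count.
# ASSUMPTION: these nominal bit widths are inferred from the sizes listed in
# the matrix header; real quantized files use mixed precision and differ.
PARAMS_BILLIONS = 30.0  # total parameter count as listed on this page

NOMINAL_BITS = {
    "Q2_K": 3,
    "Q4_K_M": 4,
    "Q5_K_M": 5,
    "Q6_K": 6,
    "Q8_0": 8,
    "FP16": 16,
}


def weight_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB, taking 1 GB as 1e9 bytes."""
    return params_billions * bits_per_weight / 8.0


# Estimated size for every quantization level in the matrix.
sizes = {
    name: weight_size_gb(PARAMS_BILLIONS, bits)
    for name, bits in NOMINAL_BITS.items()
}

if __name__ == "__main__":
    for name, gb in sizes.items():
        print(f"{name:7s} {gb:6.2f} GB")
```

This covers weights only. KV cache and runtime overhead add to the footprint, which may be why a quantization whose size sits just under a tier's VRAM is not marked as a full fit.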
