
Gemma 2 2B

Sources: arena · Confidence: C
params: 2B (dense)
bench sources: 1 (single)
context (max tokens):
license: Gemma license (see source)

Model meta

canonical: google/gemma-2-2b-it
parameters: 2B
organization: Google
license: Gemma license
context:
downloads:

Benchmark breakdown

Hardware fit matrix

by GPU tier                           Q2_K            Q4_K_M          Q5_K_M          Q6_K            Q8_0            FP16
                                      0.8 GB          1.0 GB          1.3 GB          1.5 GB          2.0 GB          4.0 GB
8 GB tier (RTX 3050 · 4060)           GREAT 210 t/s   GREAT 174 t/s   GREAT 149 t/s   GREAT 130 t/s   GREAT 104 t/s   GREAT 57 t/s
12 GB tier (RTX 3060 · 4070)          GREAT 311 t/s   GREAT 258 t/s   GREAT 221 t/s   GREAT 193 t/s   GREAT 154 t/s   GREAT 85 t/s
16 GB tier (RTX 4080 · 4060 Ti 16G)   GREAT 466 t/s   GREAT 387 t/s   GREAT 331 t/s   GREAT 289 t/s   GREAT 231 t/s   GREAT 128 t/s
24 GB tier (RTX 3090 · 4090)          GREAT 753 t/s   GREAT 626 t/s   GREAT 535 t/s   GREAT 467 t/s   GREAT 373 t/s   GREAT 206 t/s
32 GB tier (RTX 5090 · M3 Max)        GREAT 1398 t/s  GREAT 1161 t/s  GREAT 993 t/s   GREAT 867 t/s   GREAT 692 t/s   GREAT 383 t/s
48 GB tier (A6000 · M3 Max 64)        GREAT 450 t/s   GREAT 374 t/s   GREAT 320 t/s   GREAT 280 t/s   GREAT 223 t/s   GREAT 123 t/s
80 GB tier (H100 · M3 Ultra 128)      GREAT 1530 t/s  GREAT 1271 t/s  GREAT 1087 t/s  GREAT 949 t/s   GREAT 758 t/s   GREAT 419 t/s

Legend: full fit · production speed; tight fit · usable; doesn't fit
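
Most of the file sizes in the matrix are roughly consistent with a simple rule of thumb: parameter count times bits per weight, divided by 8. The Python sketch below illustrates that arithmetic together with a naive fit check. The bits-per-weight table, the function names, and the 1 GB overhead allowance are assumptions made for illustration, not values or methods taken from this page. The listed Q2_K size (0.8 GB) does not follow a nominal 2 bits per weight, so it is left out of the table.

```python
# Nominal bits per weight for each quantization label.
# ASSUMPTION: illustrative values, not published by this page.
NOMINAL_BITS = {"Q4_K_M": 4, "Q5_K_M": 5, "Q6_K": 6, "Q8_0": 8, "FP16": 16}


def estimate_weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough weight footprint in GB (1 GB = 1e9 bytes): params x bits / 8."""
    return params_billion * bits_per_weight / 8


def fits(weights_gb: float, vram_gb: float, overhead_gb: float = 1.0) -> bool:
    """Naive fit check: weights plus a fixed allowance must fit in VRAM.

    ASSUMPTION: overhead_gb is a hypothetical allowance for KV cache and
    runtime buffers; real overhead depends on context length and runtime.
    """
    return weights_gb + overhead_gb <= vram_gb


# Estimated sizes for a 2B-parameter dense model.
sizes = {quant: estimate_weights_gb(2.0, bits) for quant, bits in NOMINAL_BITS.items()}

for quant, size_gb in sizes.items():
    print(f"{quant}: ~{size_gb:.2f} GB, fits in 8 GB: {fits(size_gb, 8.0)}")
```

Under these assumptions every listed quantization of a 2B model fits comfortably in the smallest (8 GB) tier, which agrees with the all-GREAT ratings above. The sketch says nothing about throughput; the t/s figures depend on the hardware and runtime and are not derived here.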