| GPU tier | Q2_K (3.0 GB) | Q4_K_M (4.0 GB) | Q5_K_M (5.0 GB) | Q6_K (6.0 GB) | Q8_0 (8.0 GB) | FP16 (16.0 GB) |
|---|---|---|---|---|---|---|
| 8 GB tier (RTX 3050 · 4060) | GREAT 74 t/s | GREAT 57 t/s | GREAT 47 t/s | OK 40 t/s | | |
| 12 GB tier (RTX 3060 · 4070) | GREAT 110 t/s | GREAT 85 t/s | GREAT 70 t/s | GREAT 59 t/s | GREAT 45 t/s | |
| 16 GB tier (RTX 4080 · 4060 Ti 16 GB) | GREAT 164 t/s | GREAT 128 t/s | GREAT 104 t/s | GREAT 88 t/s | GREAT 67 t/s | |
| 24 GB tier (RTX 3090 · 4090) | GREAT 266 t/s | GREAT 206 t/s | GREAT 169 t/s | GREAT 143 t/s | GREAT 109 t/s | GREAT 56 t/s |
| 32 GB tier (RTX 5090 · M3 Max) | GREAT 493 t/s | GREAT 383 t/s | GREAT 313 t/s | GREAT 265 t/s | GREAT 202 t/s | GREAT 104 t/s |
| 48 GB tier (A6000 · M3 Max 64) | GREAT 159 t/s | GREAT 123 t/s | GREAT 101 t/s | GREAT 85 t/s | GREAT 65 t/s | GREAT 34 t/s |
| 80 GB tier (H100 · M3 Ultra 128) | GREAT 540 t/s | GREAT 419 t/s | GREAT 343 t/s | GREAT 290 t/s | GREAT 221 t/s | GREAT 114 t/s |