| GPU tier | Q2_K (1.5 GB) | Q4_K_M (2.0 GB) | Q5_K_M (2.5 GB) | Q6_K (3.0 GB) | Q8_0 (4.0 GB) | FP16 (8.0 GB) |
|---|---|---|---|---|---|---|
| 8 GB tier (RTX 3050 · 4060) | GREAT 130 t/s | GREAT 104 t/s | GREAT 86 t/s | GREAT 74 t/s | GREAT 57 t/s | |
| 12 GB tier (RTX 3060 · 4070) | GREAT 193 t/s | GREAT 154 t/s | GREAT 128 t/s | GREAT 110 t/s | GREAT 85 t/s | GREAT 45 t/s |
| 16 GB tier (RTX 4080 · 4060 Ti 16 GB) | GREAT 289 t/s | GREAT 231 t/s | GREAT 192 t/s | GREAT 164 t/s | GREAT 128 t/s | GREAT 67 t/s |
| 24 GB tier (RTX 3090 · 4090) | GREAT 467 t/s | GREAT 373 t/s | GREAT 310 t/s | GREAT 266 t/s | GREAT 206 t/s | GREAT 109 t/s |
| 32 GB tier (RTX 5090 · M3 Max) | GREAT 867 t/s | GREAT 692 t/s | GREAT 576 t/s | GREAT 493 t/s | GREAT 383 t/s | GREAT 202 t/s |
| 48 GB tier (A6000 · M3 Max 64) | GREAT 280 t/s | GREAT 223 t/s | GREAT 186 t/s | GREAT 159 t/s | GREAT 123 t/s | GREAT 65 t/s |
| 80 GB tier (H100 · M3 Ultra 128) | GREAT 949 t/s | GREAT 758 t/s | GREAT 630 t/s | GREAT 540 t/s | GREAT 419 t/s | GREAT 221 t/s |
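
To pick a column for a given card, one rough approach is to compare the file size in the table header against the tier's VRAM. The sketch below does that in Python. The sizes are copied from the header; the `overhead_gb` allowance for context and runtime buffers is a hypothetical placeholder, not a figure from the table. The empty FP16 cell in the 8 GB row is consistent with this kind of check, though the source does not state why it is blank.

```python
# File sizes in GB, copied from the table header.
QUANT_SIZE_GB = {
    "Q2_K": 1.5,
    "Q4_K_M": 2.0,
    "Q5_K_M": 2.5,
    "Q6_K": 3.0,
    "Q8_0": 4.0,
    "FP16": 8.0,
}


def fits(quant, vram_gb, overhead_gb=1.0):
    """Return True if the weights plus an assumed overhead fit in VRAM.

    overhead_gb is an illustrative assumption (KV cache, buffers);
    tune it for your own runtime and context length.
    """
    return QUANT_SIZE_GB[quant] + overhead_gb <= vram_gb


def largest_fitting(vram_gb, overhead_gb=1.0):
    """Return the largest quantization that fits, or None if none do."""
    fitting = [q for q in QUANT_SIZE_GB if fits(q, vram_gb, overhead_gb)]
    return max(fitting, key=QUANT_SIZE_GB.get) if fitting else None


if __name__ == "__main__":
    for tier in (8, 12, 16, 24):
        print(f"{tier} GB tier: {largest_fitting(tier)}")
```

Larger files generally preserve more precision, while the table shows throughput dropping as file size grows, so the largest quantization that fits is a starting point rather than a recommendation.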