| GPU tier | Q2_K (0.8 GB) | Q4_K_M (1.0 GB) | Q5_K_M (1.3 GB) | Q6_K (1.5 GB) | Q8_0 (2.0 GB) | FP16 (4.0 GB) |
|---|---|---|---|---|---|---|
| 8 GB tier (RTX 3050 · 4060) | GREAT 210 t/s | GREAT 174 t/s | GREAT 149 t/s | GREAT 130 t/s | GREAT 104 t/s | GREAT 57 t/s |
| 12 GB tier (RTX 3060 · 4070) | GREAT 311 t/s | GREAT 258 t/s | GREAT 221 t/s | GREAT 193 t/s | GREAT 154 t/s | GREAT 85 t/s |
| 16 GB tier (RTX 4080 · 4060 Ti 16 GB) | GREAT 466 t/s | GREAT 387 t/s | GREAT 331 t/s | GREAT 289 t/s | GREAT 231 t/s | GREAT 128 t/s |
| 24 GB tier (RTX 3090 · 4090) | GREAT 753 t/s | GREAT 626 t/s | GREAT 535 t/s | GREAT 467 t/s | GREAT 373 t/s | GREAT 206 t/s |
| 32 GB tier (RTX 5090 · M3 Max) | GREAT 1398 t/s | GREAT 1161 t/s | GREAT 993 t/s | GREAT 867 t/s | GREAT 692 t/s | GREAT 383 t/s |
| 48 GB tier (A6000 · M3 Max 64) | GREAT 450 t/s | GREAT 374 t/s | GREAT 320 t/s | GREAT 280 t/s | GREAT 223 t/s | GREAT 123 t/s |
| 80 GB tier (H100 · M3 Ultra 128) | GREAT 1530 t/s | GREAT 1271 t/s | GREAT 1087 t/s | GREAT 949 t/s | GREAT 758 t/s | GREAT 419 t/s |