Two latency numbers matter: time-to-first-token (TTFT) and tokens-per-second afterwards. Bigger models are usually smarter but slower and pricier. Real systems mix models: a small fast model for ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results