How fast is 10 tokens per second really?

How fast is 10 tokens per second really? (https://mikeveerman.github.io/tokenspeed/)

Neat little HTML app by Mike Veerman (source code here (https://github.com/MikeVeerman/tokenspeed/blob/master/index.html)) which simulates LLM token output speeds from 5/second to 800/second.

Useful if you see a model advertised as “30 tokens/second” and want to get a feel for what that actually looks like.
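To get a rough feel for the same idea in a terminal, here is a minimal Python sketch. It is not Mike Veerman's implementation: the function names `seconds_to_emit` and `stream_tokens` are hypothetical, and it treats each whitespace-separated word as one token, which only approximates how real tokenizers split text.

```python
import time


def seconds_to_emit(num_tokens, tokens_per_second):
    """Return how long a response of num_tokens takes at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


def stream_tokens(text, tokens_per_second, write=None, sleep=time.sleep):
    """Emit text one pseudo-token at a time, pausing between tokens.

    Assumption: one whitespace-separated word counts as one token.
    Returns the number of pseudo-tokens emitted.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    if write is None:
        # Default: print without a newline and flush so output appears immediately.
        def write(chunk):
            print(chunk, end="", flush=True)

    delay = 1.0 / tokens_per_second  # seconds to wait after each token
    count = 0
    for token in text.split():
        write(token + " ")
        sleep(delay)
        count += 1
    return count
```

For example, `seconds_to_emit(300, 30)` gives 10.0, so a 300-token answer at the advertised 30 tokens/second takes about ten seconds. Calling `stream_tokens("some sample text here", 30)` prints the words at roughly that pace. The `write` and `sleep` parameters exist only so the pacing can be checked without actually waiting.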

Via Hacker News (https://news.ycombinator.com/item?id=48174920)

Tags: ai (https://simonwillison.net/tags/ai), generative-ai (https://simonwillison.net/tags/generative-ai), llms (https://simonwillison.net/tags/llms)