Model Optimization

1 article about Model Optimization.
Contributors: Simon Willison

Articles

Taalas serves Llama 3.1 8B at 17,000 tokens/second

Simon Willison · explanation · 20/02/2026