Introducing GPT‑5.3‑Codex‑Spark (https://openai.com/index/introducing-gpt-5-3-codex-spark/)
OpenAI announced a partnership with Cerebras on January 14th (https://openai.com/index/cerebras-partnership/). Four weeks later they’re already launching the first integration, “an ultra-fast model for real-time coding in Codex”.
Despite being named GPT-5.3-Codex-Spark, it’s not purely an accelerated alternative to GPT-5.3-Codex - the blog post calls it “a smaller version of GPT‑5.3-Codex” and clarifies that “at launch, Codex-Spark has a 128k context window and is text-only.”
I had some preview access to this model and I can confirm that it’s significantly faster than their other models.
Here’s what that speed looks like running in Codex CLI:
That was the “Generate an SVG of a pelican riding a bicycle” prompt - here’s the rendered result:
Compare that to the speed of regular GPT-5.3-Codex at medium reasoning effort:
Significantly slower, but the pelican is a lot better:
What’s interesting about this model isn’t the quality, though - it’s the speed. When a model responds this fast you can stay in a flow state and iterate with it much more productively.
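If you want to run the same comparison yourself, here’s a minimal sketch that shells out to Codex CLI’s non-interactive exec mode from Python and times each run. The model slugs are my assumptions about what the identifiers will be - check codex --help on your install for the real ones:

```python
import subprocess
import time

PROMPT = "Generate an SVG of a pelican riding a bicycle"

# Model slugs below are assumptions, not confirmed identifiers.
for model in ("gpt-5.3-codex-spark", "gpt-5.3-codex"):
    start = time.monotonic()
    # `codex exec` runs a single non-interactive turn and prints the result
    result = subprocess.run(
        ["codex", "exec", "--model", model, PROMPT],
        capture_output=True,
        text=True,
    )
    elapsed = time.monotonic() - start
    print(f"{model}: {elapsed:.1f}s, {len(result.stdout)} chars of output")
```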
I showed a demo of Cerebras running Llama 3.1 70B at 2,000 tokens/second against Val Town back in October 2024 (https://simonwillison.net/2024/Oct/31/cerebras-coder/). OpenAI claim 1,000 tokens/second for their new model, and I expect it will prove to be a ferociously useful partner for hands-on iterative coding sessions.
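To put those numbers in perspective, here’s a back-of-envelope calculation - the ~2,000 token response size and the 50 tokens/second baseline are illustrative assumptions of mine, not figures from OpenAI:

```python
# Rough time to stream a ~2,000 token SVG response at different speeds.
# Response size and the 50 tok/s baseline are illustrative assumptions.
response_tokens = 2_000
for label, tokens_per_second in [
    ("Cerebras Llama 3.1 70B demo", 2_000),
    ("Codex-Spark (claimed)", 1_000),
    ("typical hosted model", 50),
]:
    print(f"{label}: {response_tokens / tokens_per_second:.0f}s")
```

That’s the difference between a two second response and a forty second wait - exactly the gap between staying in flow and context-switching away.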
Tags: ai (https://simonwillison.net/tags/ai), openai (https://simonwillison.net/tags/openai), generative-ai (https://simonwillison.net/tags/generative-ai), llms (https://simonwillison.net/tags/llms), cerebras (https://simonwillison.net/tags/cerebras), pelican-riding-a-bicycle (https://simonwillison.net/tags/pelican-riding-a-bicycle), llm-release (https://simonwillison.net/tags/llm-release), codex-cli (https://simonwillison.net/tags/codex-cli)