OpenAI’s new GPT-5.3-Codex-Spark model is a bit of a departure for the company’s family of Codex software development models: its focus is squarely on reducing latency.
Powered by Cerebras’ 125-petaflop Wafer Scale Engine 3, the Codex Spark model is meant for use cases where latency matters as much as, or more than, intelligence. And fast it is: Codex Spark can deliver more than 1,000 tokens per second.
When OpenAI launched GPT-5.3-Codex only a few days ago, it highlighted how the team had brought latency down by 25 percent…