
OpenAI shrinks GPT-5.4 for speed and low cost


OpenAI is slimming down its latest models for faster responses and much lower costs. The new GPT-5.4 mini and nano are designed for developers who care more about responsiveness than squeezing out every last bit of raw capability.

Both models are available from today. The GPT-5.4 mini runs more than twice as fast as its predecessor while sitting close to the full GPT-5.4 in key benchmarks. GPT-5.4 nano takes that a step further, focusing on simple tasks like sorting and extracting data where efficiency is paramount.

This approach suits applications where speed shapes the experience. Coding assistants, back-end agents, and real-time visualization tools depend on fast feedback, and in those cases a smaller model usually delivers the better result.

How much performance do you lose?

The performance gap between the models is smaller than you might expect. GPT-5.4 mini scores 54.4 percent on SWE-Bench Pro, compared to 57.7 percent for the full model. On OSWorld-Verified, the mini reaches 72.1 percent while the large version reaches 75 percent, keeping the gap narrow across tasks.

Costs come down sharply. GPT-5.4 mini is priced at $0.75 per million input tokens and $4.50 per million output tokens, while nano comes in at $0.20 and $1.25. Both models support text and image input, tool use, function calling, and a 400,000-token context window, so the lower price does not strip away the core capabilities.
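To make the pricing concrete, here is a minimal sketch that computes per-request cost from the per-million-token rates quoted above. The model names and prices come from the article; treat the figures as illustrative.

```python
# Per-million-token prices quoted in the article: (input $, output $).
PRICES = {
    "gpt-5.4-mini": (0.75, 4.50),
    "gpt-5.4-nano": (0.20, 1.25),
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of a single request for the given model."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: a request with 10,000 input tokens and 1,000 output tokens.
print(round(call_cost("gpt-5.4-mini", 10_000, 1_000), 5))  # mini
print(round(call_cost("gpt-5.4-nano", 10_000, 1_000), 5))  # nano
```

At these rates, the same 10,000-in / 1,000-out request costs about $0.012 on mini and $0.00325 on nano, which is why routing high-volume traffic to the smaller tier adds up quickly.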

In Codex, the smaller model consumes only 30 percent of the GPT-5.4 quota. That lets developers shift routine coding tasks to a cheaper tier while reserving the full model for harder reasoning.

Where smaller models do the heavy lifting

OpenAI is also pushing multi-model workflows. Instead of relying on a single system, developers can split the work across stages, pairing a large planning model with smaller ones that handle execution.

That setup mirrors how many real applications already behave. One model can plan codebase updates or decide what to change, while another handles data processing or iterative steps. The smaller model takes on predictable work, while the larger one focuses on judgment and communication.
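The split described above can be sketched as a simple router that sends predictable, well-scoped tasks to the small model and reserves the large one for planning and judgment. The model names match those in the article, but the keyword heuristic is a hypothetical stand-in for whatever classification logic a real system would use.

```python
LARGE_MODEL = "gpt-5.4"       # planning, judgment, hard reasoning
SMALL_MODEL = "gpt-5.4-mini"  # predictable, high-volume work

def route_task(task: str) -> str:
    """Crude heuristic: send extraction/formatting-style work to the small model."""
    routine_keywords = ("extract", "sort", "classify", "format", "summarize")
    if any(word in task.lower() for word in routine_keywords):
        return SMALL_MODEL
    return LARGE_MODEL

tasks = [
    "Extract invoice totals from these emails",
    "Plan a refactor of the authentication module",
]
for task in tasks:
    print(f"{route_task(task)} <- {task}")
```

In production the routing decision would more likely come from the planning model itself or from task metadata, but the structure is the same: cheap tier by default, expensive tier on demand.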

Early feedback suggests the mix is working. Hebbia CTO Aabhas Sharma reported that GPT-5.4 mini matched or surpassed competing models on several tasks at a lower cost, and in some cases even delivered stronger end-to-end results than the full GPT-5.4.

Which model to use, and when

GPT-5.4 mini is now available in the API, Codex, and ChatGPT. Free and Go users can access it via the Thinking option, while other users may see it as a fallback when they hit the limits of GPT-5.4 Thinking.

The nano model is currently limited to the API, aimed at high-volume teams where cost control matters most. Both models are live today with full documentation available.

For developers building real-time AI features, the takeaway is clear. Smaller models are now capable enough to handle a large share of everyday work, turning the balance of speed, cost, and capability into a practical engineering decision.
