And why that’s good
For the last five years, AI has moved so blazingly fast that every new model felt like a plot twist. GPT-3, GPT-4, Claude, Gemini — each one arrived with bigger numbers, bigger claims, and bigger expectations. It felt like the JavaScript ecosystem ten years ago, when a new framework shipped every couple of weeks.
Alas, technological progress never grows in a straight line. It grows in S-curves: explosive acceleration, followed by a long flattening. LLMs — specifically, dense, transformer-based, text-trained models — are now entering that flattening stage.
Not because AI is slowing down — far from it. The current slump most likely comes from the architectural limits of this generation of models, not from the field itself. Let's continue the JS parallel. In JS, performance breakthroughs didn't come from changing the language; they came from rewriting the tooling in faster languages. Blazingly fast tooling:
- Bun implemented parts of its runtime in Zig
- Vite unlocked speed by rewriting its core in Rust
- TypeScript got 10x performance gains by porting its compiler to Go
I would venture to say that future AI gains will come not from making transformers bigger, but from new architectures that redefine how models reason, plan, and execute.
The real meaning of “plateau”
People imagine a plateau as a wall, but the science paints a different picture. Scaling-law research shows that model improvement follows “a smooth power law that approaches an irreducible minimum” (Epoch AI).
Translation:
- Early scaling produces huge jumps.
- Later scaling produces tiny gains.
- Each improvement costs dramatically more compute, data, and energy.
This is the plateau: diminishing returns long before we hit a true ceiling. And we are now inside that phase.
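To make those diminishing returns concrete, here is a toy TypeScript sketch. Every constant is invented purely to show the shape of "a smooth power law that approaches an irreducible minimum"; none of them model a real training run.

```ts
// Toy scaling law: loss approaches an irreducible minimum as compute grows.
// All constants are made up for illustration.
const L_MIN = 1.7;  // hypothetical irreducible loss
const A = 10;       // hypothetical scale coefficient
const ALPHA = 0.05; // hypothetical power-law exponent

const loss = (compute: number): number => L_MIN + A * Math.pow(compute, -ALPHA);

// Each 10x jump in compute buys a smaller improvement than the last.
for (let exp = 21; exp <= 26; exp++) {
  const gain = loss(10 ** (exp - 1)) - loss(10 ** exp);
  console.log(`10^${exp} FLOPs -> loss ${loss(10 ** exp).toFixed(3)} (gain from last 10x: ${gain.toFixed(3)})`);
}
```

Run it and the "gain from last 10x" column shrinks on every row. That shrinking column is the plateau.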
Data exhaustion: the fuel source is finite
Every LLM is trained on human-generated text — books, articles, code, conversations. But we are running out.
As MixFlow explains: “the stock of high-quality human-generated text data is finite and projected to be exhausted in the coming years.” Without fresh, high-quality text, models hit a natural boundary. You can pour more compute into training, but if the dataset is tapped out, you stop getting breakthrough leaps.
Architectural limits: transformers can predict, but not understand
Transformers are incredible pattern recognizers — but they do not possess grounding, causality, or real reasoning.
This is why hallucinations persist at every scale.
Researchers describe this stage as the plateau of “non-reasoning LLMs,” noting that “progress on text-only tasks is flattening as models converge toward GPT-4-level performance on many public benchmarks” (Leena AI).
You can scale prediction. You cannot brute-force understanding.
Compute, cost, and the AI brick wall
The next leap in LLM performance would require:
- Vastly larger clusters of GPUs — or TPUs
- Tons of electricity
- More expensive training runs
- Massive operational budgets
SemiAnalysis calls this the AI brick wall — the point at which scaling dense transformers becomes economically irrational.
This isn’t a theoretical problem. It’s a financial one.
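A back-of-envelope sketch shows the scale of the problem. It leans on the widely used approximation that training takes roughly 6 × parameters × tokens FLOPs; the model size, token budget, throughput, and hourly price below are all assumptions picked for illustration, not quotes from any vendor.

```ts
// Rough training-cost estimate using the ~6 * params * tokens FLOPs rule
// of thumb. Every constant here is an illustrative assumption.
const PARAMS = 1e12;            // hypothetical 1T-parameter dense model
const TOKENS = 20e12;           // ~20 tokens per parameter (Chinchilla-style)
const FLOPS_PER_GPU = 1e15;     // assumed sustained FLOP/s per accelerator
const DOLLARS_PER_GPU_HOUR = 3; // assumed rental price

const totalFlops = 6 * PARAMS * TOKENS;               // ~1.2e26 FLOPs
const gpuHours = totalFlops / (FLOPS_PER_GPU * 3600); // ~3.3e7 GPU-hours
const cost = gpuHours * DOLLARS_PER_GPU_HOUR;

console.log(`~${gpuHours.toExponential(1)} GPU-hours, ~$${(cost / 1e6).toFixed(0)}M`);
```

With these made-up numbers, a single run lands around $100M, and the power-law math above says the next meaningful capability jump needs a multiple of that compute, not a bit more.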
Are we already seeing the slowdown?
Yes — and the evidence is everywhere.
Benchmark improvements between model generations have shrunk into single-digit percentages. Analysts describe this as “convergence toward a performance ceiling” as improvements on MMLU, GSM8K, and HumanEval flatten out (Gary Marcus).
In plain English: GPT-4-level performance is becoming the default across the industry.
The gaps between models are now refinements, not revolutions.
Why this is not the end of AI
A plateau in transformers does not mean a plateau in AI. Scaling-law updates like Chinchilla show there is still headroom when data and compute are used more efficiently (LifeArchitect).
But more importantly, the breakthroughs ahead will come from new capabilities, not bigger text models:
- Multimodality
- Long-term memory and retrieval systems
- Reasoning modules
- Tool-using agents (sketched after this list)
- Hybrid symbolic–neural systems
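As a taste of where that value lives, here is a minimal sketch of a tool-using agent loop in TypeScript. Everything in it is a stand-in: `callModel` stubs out a real LLM API with function calling, and the tool registry is hypothetical. The point is that the loop and the tools are the product; the model inside is swappable.

```ts
// Minimal tool-using agent loop. `callModel` is a stub standing in for
// any LLM API with function calling; the tools are hypothetical.
type ToolCall = { name: string; args: Record<string, string> };
type ModelReply = { text: string; toolCall?: ToolCall };

// Stubbed model: first turn requests a tool, second turn answers.
let turn = 0;
async function callModel(messages: string[]): Promise<ModelReply> {
  turn += 1;
  return turn === 1
    ? { text: "Let me look that up.", toolCall: { name: "search", args: { query: "LLM pricing trends" } } }
    : { text: `Final answer, based on: ${messages[messages.length - 1]}` };
}

// Tools are plain async functions; this registry is where differentiation lives.
const tools: Record<string, (args: Record<string, string>) => Promise<string>> = {
  search: async ({ query }) => `top results for "${query}"`, // stub
};

async function runAgent(task: string, maxSteps = 5): Promise<string> {
  const messages = [task];
  for (let step = 0; step < maxSteps; step++) {
    const reply = await callModel(messages);
    if (!reply.toolCall) return reply.text; // no tool requested: done
    const tool = tools[reply.toolCall.name];
    const observation = tool
      ? await tool(reply.toolCall.args)
      : `unknown tool: ${reply.toolCall.name}`;
    messages.push(reply.text, `observation: ${observation}`); // feed result back
  }
  return "step limit reached";
}

runAgent("How are LLM prices trending?").then(console.log);
```

Swap the stubbed `callModel` for any vendor SDK and the loop does not change. That interchangeability is what commoditization looks like from the application side.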
The next era will not be “GPT-5 but bigger.” It will be GPT-5 but smarter.
Why LLMs will become a commodity
When a technology stops delivering exponential returns, it stabilizes. It becomes predictable. Standardized. Ubiquitous.
That’s when it becomes a commodity.
And commoditization is not a downgrade — it’s the foundation for an explosion of innovation.
Costs drop dramatically
Once models stabilize, they become cheaper to run, cheaper to host, and cheaper to self-deploy. This unlocks on-prem LLMs for:
- Banks
- Hospitals
- Governments
- Enterprises that cannot send data to the cloud
Value moves up the stack
When the model becomes a commodity, the real differentiation shifts to:
- Agents
- Workflows
- Orchestration
- Fine-tuning
- Integrations
- Reasoning layers
- UX
This is where actual business value lives.
Competition increases
Vendors can no longer rely on model size as a differentiator. They compete on:
- Speed
- Price
- Privacy
- Reliability
- Developer experience
Healthy markets lead to better tools, which lead to better outcomes.
The big idea: commoditization is a win
LLMs are not the final form of AI. They are the steam engines of this era — powerful, transformative, but destined to become infrastructure.
- They will plateau.
- They will stabilize.
- They will commoditize.
And that is good.
Because commodities are building blocks. The breakthrough innovations of the 2030s will not come from scaling prediction models.
They will come from people who know how to:
- Orchestrate models
- Build agents
- Encode workflows
- Combine reasoning with tools
- Build real systems that solve real problems
The future doesn’t belong to bigger models. The future belongs to those who know how to use them well.