The market has spent three years pricing AI companies on the cost of building the technology. The companies themselves now have to live with the cost of running it, and the two costs bear little resemblance to one another.
A perspective from Open Doors Partners.
Training a frontier AI model is a capital expenditure in the most classical sense: finite, front-loaded, and at least in principle, diligenceable. A number can be attached to it. A timeline can be drawn. A board can approve it and consider it closed.
OpenAI spent approximately $150 million training GPT-4. Over the following two years, running that model cost the firm an estimated $2.3 billion. The multiplier between the cost of building and the cost of serving, roughly fifteen times, is not an operational anomaly. It is the structural reality of deploying AI at scale, and it will not compress as usage grows. It compounds.
AI inference costs accumulate with every user, every query, every automated process that calls the model. Training is a line item. Inference is a permanent operating obligation that expands in proportion to the success of the product it underlies. Capital that has not separated these two cost structures in its underwriting is pricing half a business and calling it the whole.
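As a rough illustration of how the two cost structures diverge, the sketch below works through the GPT-4 figures quoted above. It assumes the two-year inference estimate was spread evenly across 24 months, which is a simplification, and the variable names are illustrative only.

```python
# Figures quoted in this article (estimates, in US dollars).
TRAINING_COST = 150e6        # one-time cost of training GPT-4
INFERENCE_COST_2Y = 2.3e9    # estimated cost of serving it over two years

# Ratio of serving cost to building cost.
multiplier = INFERENCE_COST_2Y / TRAINING_COST

# Assumption: inference spend is spread evenly over 24 months.
monthly_run_rate = INFERENCE_COST_2Y / 24

# Months of serving needed before inference spend equals the training bill.
months_to_match_training = TRAINING_COST / monthly_run_rate

print(f"Inference / training multiplier: {multiplier:.1f}x")
print(f"Average monthly inference cost: ${monthly_run_rate / 1e6:.1f}M")
print(f"Months until inference matches training: {months_to_match_training:.1f}")
```

On these numbers, and under the even-spread assumption, serving costs overtake the entire training budget in under two months.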
The arithmetic is not theoretical. It is already playing out in the financials of the firms most visible in this market.
OpenAI generated approximately $3.7 billion in revenue in 2025. It lost an estimated $5 billion in the same period: roughly $1.35 lost, and therefore about $2.35 spent, for every dollar earned. The losses are driven not primarily by headcount or R&D investment, but by inference: the per-query, per-user cost of serving the products that produce the revenue. The business is growing. The cost structure beneath it is growing faster.
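The per-dollar figures can be checked with a few lines of arithmetic. The sketch below uses the revenue and loss estimates quoted above and assumes that total spending equals revenue plus the reported loss, which ignores any non-operating items.

```python
# Estimates quoted in this article (US dollars).
REVENUE = 3.7e9
NET_LOSS = 5.0e9

# Assumption: total spend = revenue + loss (no non-operating items).
implied_spend = REVENUE + NET_LOSS

loss_per_dollar = NET_LOSS / REVENUE
spend_per_dollar = implied_spend / REVENUE

print(f"Loss per dollar of revenue:  ${loss_per_dollar:.2f}")
print(f"Spend per dollar of revenue: ${spend_per_dollar:.2f}")
```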
The pattern holds across the sector. Perplexity spent 164% of its 2024 revenue on AI compute costs. Midjourney was spending $2.1 million a month on GPU infrastructure before migrating to a different architecture that reduced the bill by 65%, not because the product changed, but because the original cost structure had become untenable. GitHub Copilot moved to usage-based billing in June 2026, after the firm’s own leadership acknowledged that flat-rate subscriptions could not sustain the inference costs they had been absorbing. Agentic sessions, the product direction most of the sector is moving toward, are estimated to cost ten to fifty times more to serve than the subscription economics implied.
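To put the Midjourney figures in annual terms, a minimal sketch follows. It takes the quoted monthly bill and the 65% reduction at face value and assumes the saving holds steady for twelve months.

```python
# Figures quoted in this article (US dollars).
MONTHLY_GPU_BILL_BEFORE = 2.1e6
REDUCTION = 0.65  # share of the bill removed by the migration

monthly_bill_after = MONTHLY_GPU_BILL_BEFORE * (1 - REDUCTION)

# Assumption: the saving is constant across a full year.
annual_saving = (MONTHLY_GPU_BILL_BEFORE - monthly_bill_after) * 12

print(f"Monthly bill after migration: ${monthly_bill_after / 1e6:.3f}M")
print(f"Implied annual saving: ${annual_saving / 1e6:.2f}M")
```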
Goldman Sachs has reported that companies are breaching AI inference budgets by orders of magnitude, with inference costs in engineering environments alone approaching 10% of total headcount costs. Research from a Turing Award-winning computer scientist published in early 2026 identified inference cost — not model quality, not data access, not regulatory pressure — as the primary structural bottleneck preventing AI companies from reaching profitability.
These are not firms with bad unit economics in the conventional sense. They are firms whose economics are structured around a cost that expands with the success of the product, not in spite of it.
The inference problem is not only a challenge for companies that have already scaled. It is the primary reason most enterprises do not scale from pilot to production at all.
MIT Sloan research from 2025 found that 95% of generative AI pilots do not reach production. The model is rarely the failure point. The cost of serving the model at real usage volume is. IDC and Lenovo research found that cost overruns between pilot and production average 380%, with infrastructure limitations cited as the cause of 64% of scaling failures. The gap between what AI can do in a controlled environment and what it costs to do it at commercial volume is where most enterprise AI capital is currently being lost.
This is a pricing signal for private markets. The companies that have solved the production gap — that have found the architectural or procurement path to serving inference at viable margins — are structurally different from those still absorbing losses in pursuit of market share. The two categories look similar at the revenue line. They do not look similar in the cost structure beneath it.
Late-stage AI valuations have been, in large part, a function of training: the quality of the model, the size of the data advantage, the depth of the foundational research. These remain consequential. They are not the whole question.
The pricing discipline that matters now is inference. OpenAI, Google, and Meta are currently pricing inference below cost — a deliberate market-share decision that has created a false floor in the market for AI services. When capital discipline returns and that floor normalises upward, the companies able to sustain their cost structure will be separated from those that cannot. The distance between those two groups is already visible in the financials. It has not yet been fully absorbed into how private markets are pricing the category.
The frontier moved from training to inference some time ago. The capital behind it is still catching up.
What are AI inference costs? AI inference costs are the computing expenses incurred each time an AI model generates a response — processing a query, generating an image, running an autonomous task. Unlike training costs, which are a one-time capital expenditure to build the model, inference costs are operational and recur with every use. They scale directly with adoption: the more users a product attracts, the higher the inference bill.
Why do AI inference costs matter for private market valuation? Private market valuations of AI companies have largely reflected the cost and quality of building the underlying technology — training data, model architecture, research depth. Inference costs — the ongoing expense of running those models at scale — have been less fully reflected in how the market prices these companies. As companies move from growth to operational maturity, the gap between what it costs to build AI and what it costs to serve it becomes the defining factor in whether the economics are sustainable.
What is the difference between AI training cost and inference cost? Training cost is the capital expenditure required to build an AI model: computing time, data, engineering. It is finite and front-loaded. Inference cost is the operational expenditure of running the model in production: every query answered, every image generated, every automated process served. Training cost is incurred once. Inference cost compounds continuously with usage — and the relationship between the two is not linear. OpenAI’s GPT-4 cost approximately $150 million to train and an estimated $2.3 billion to run over the following two years.
Which AI companies are most exposed to inference cost pressure? Companies with high usage volume and flat-rate pricing structures face the sharpest inference cost exposure, because the cost of serving the model grows faster than the revenue it generates. Consumer AI products with large user bases, coding assistants with heavy session usage, and agentic products that execute multi-step tasks are all structurally intensive on the inference side. The transition toward agentic AI, where models take sequences of actions rather than answering single queries, substantially increases the per-session compute cost, which is why multiple firms have revised their pricing structures in 2025 and 2026.
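The exposure of flat-rate pricing can be sketched with a simple per-user margin comparison. Every number below is hypothetical (the subscription price, the sessions per month, the baseline cost per session, and the usage markup are all assumptions chosen for illustration); only the ten-to-fifty-times range for agentic sessions comes from the estimates cited earlier.

```python
# All values are hypothetical and chosen only to illustrate the mechanism.
FLAT_PRICE = 20.00             # monthly subscription per user
SESSIONS_PER_MONTH = 100       # sessions a user runs each month
BASE_COST_PER_SESSION = 0.02   # inference cost of a simple chat session
USAGE_MARKUP = 1.25            # usage-based price = cost x markup

def flat_rate_margin(cost_per_session, sessions):
    """Monthly margin per user when revenue is fixed and cost scales with use."""
    return FLAT_PRICE - cost_per_session * sessions

def usage_based_margin(cost_per_session, sessions):
    """Monthly margin per user when each session is billed at cost plus markup."""
    price_per_session = cost_per_session * USAGE_MARKUP
    return (price_per_session - cost_per_session) * sessions

results = {}
# 1x is the chat baseline; 10x and 50x span the cited range for agentic sessions.
for multiplier in (1, 10, 50):
    cost = BASE_COST_PER_SESSION * multiplier
    flat = flat_rate_margin(cost, SESSIONS_PER_MONTH)
    usage = usage_based_margin(cost, SESSIONS_PER_MONTH)
    results[multiplier] = (flat, usage)
    print(f"{multiplier:>2}x cost: flat-rate margin ${flat:8.2f}, "
          f"usage-based margin ${usage:6.2f}")
```

Under these assumptions the flat-rate margin falls to zero at ten times the baseline cost and turns sharply negative at fifty times, while the usage-based margin stays positive because revenue moves with cost.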
This article is published for informational purposes only and does not constitute an offer or solicitation to buy or sell any security. Nothing herein should be construed as investment advice.