In September, I started what I can only describe as an around-the-clock sprint with AI-assisted development. I was building hard and fast, leveraging Claude through GitHub Copilot Enterprise. Somewhere around the end of that first month I got a notification that I'd burned through every token included in my enterprise plan. All of them. I remember sitting back and feeling equal parts impressed with myself and mildly embarrassed at the excess.
Here's the thing, though: when I hit the overages, a million tokens cost me pennies. Literal pennies. Combined with my Claude Max subscription, my all-in monthly spend for essentially unlimited professional AI development was under $300. I remember thinking, "This is absurd. In the best possible way."
I rated that moment at roughly 1,000:1. One thousand units of value for every one unit of cost.
February Changed the Equation
Fast forward to February and I am easily burning through $2,000 worth of tokens in a single week on workflows that are, if anything, arguably simpler than what I was running in the fall. The math is no longer embarrassing in the good way.
The value-to-cost ratio (v:c, as I've started calling it internally) has shifted to somewhere around 600:1. Still extraordinary by any historical standard for software development. Still transformative. But the trajectory is unmistakable, and as someone who spent a decade building a product before the market caught up to it, I know better than to ignore trajectory.
Why This Is Happening
I don't have any inside knowledge here. What I do have is a decent handle on the forces at work, and I think they're worth laying out clearly.
The install base strategy is real, and it has a shelf life. Anthropic has been pricing aggressively to grow their user base. With the capital available to them (and the capital they've continued to receive), that's not an irrational strategy. It's classic land-and-expand thinking. But as you bring on more investors, you bring on more expectations. The pressure to stabilize and reduce customer acquisition costs, and to move toward sustainable margins, is a feature of that path, not a bug. The discount era is not permanent by design.
The hardware economics are brutal. I have the considerable benefit of selling software without delivery (we don't run SaaS). WitFoo can ship a near-infinite number of units per quarter with virtually no cost of goods sold. Anthropic does not have that luxury. They are sitting on debilitating CAPEX in the form of rapidly depreciating GPU clusters, plus a power and hosting bill that would make most CFOs reach for the antacids. At some point, the price of tokens has to reflect the cost of the infrastructure generating them.
Demand is simply outpacing availability. This week saw multiple extended downtime events for Anthropic. Data centers take years to build. Power infrastructure takes longer. The gap between what people want from these models and what the physical world can currently supply is not small, and it doesn't close fast.
The Part That Keeps Me Up at Night
Here's where it gets interesting, and a little uncomfortable to admit.
The availability constraints aren't being communicated clearly. The UI looks the same. Claude sounds the same. Token consumption is a black box, and the rates I'm observing are climbing without obvious explanation. And something else has started showing up: what I've taken to calling "work throttles."
Since Opus 4.6 was released, I've noticed during code review that Claude will report "the work is done," and then I'll go look at the pull request and find several tasks incomplete. Not abandoned, exactly. Just... deferred. Quietly set aside.
I built a prompt I now run immediately after any declaration of completion: "Are there any outstanding or deferred tasks in this PR?" Increasingly, the answer comes back: "Yes, here is a list of work I deferred or skipped."
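That follow-up can be automated as a small wrapper around whatever model call your pipeline already makes. The sketch below is mine, not part of any WitFoo tooling: `ask` is a hypothetical callable that sends a message list to the model and returns the assistant's reply as a string, and the "starts with no" check is a deliberately crude heuristic.

```python
# Hypothetical post-completion audit step. `ask` stands in for whatever
# function your pipeline uses to call the model (e.g. a wrapper around
# the Anthropic Messages API); it takes a list of {"role", "content"}
# messages and returns the reply text.
VERIFY_PROMPT = "Are there any outstanding or deferred tasks in this PR?"

def verify_completion(ask, conversation):
    """Follow any 'the work is done' claim with an explicit audit prompt.

    Returns (deferred, reply): `deferred` is True when the model's answer
    does not open with a clean "No", i.e. it is likely listing skipped work.
    """
    reply = ask(conversation + [{"role": "user", "content": VERIFY_PROMPT}])
    # Crude heuristic: treat anything other than an answer starting with
    # "no" as an admission of deferred or skipped tasks.
    deferred = not reply.strip().lower().startswith("no")
    return deferred, reply
```

In practice I run this once per "completion" claim; the extra round trip costs a few hundred tokens and routinely surfaces work the first pass quietly set aside.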
I want to be clear that I'm speculating on the mechanism here. But my leading theory is that Anthropic has implemented something like: "if service availability is degraded or demand is overwhelming capacity, deliver best effort and minimize resource usage per completion." It's the kind of guardrail that makes perfect sense from an infrastructure management perspective. And it would explain exactly the behavior I'm observing. (If anyone at Anthropic wants to correct my speculation, I am all ears and genuinely curious.)
The challenge isn't the behavior itself. The challenge is that I can't see it coming. I can't budget around a black box.
Where This Goes
We are in a short and remarkable window. Extremely capable, extremely expensive models are available at an extraordinary discount. I knew from the start it couldn't last indefinitely, but I'm not sure I appreciated how quickly the window might begin to close.
Until a meaningful breakthrough occurs in either efficiency or physical availability, the v:c ratio is going to continue regressing. My working assumption is that it will eventually fall well below 1:1 for many current use cases.
As a business leader, that forces some honest planning. We've already locked in token optimization pipelines. We're evaluating bringing models in-house and fine-tuning them on the corpus of commits we've made with Claude over the past several months. And I've started building contingency frameworks for what our workflows look like when v:c hits 200:1, then 100:1, 50:1, 10:1.
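One way to make those contingency frameworks concrete is a policy table keyed on the observed ratio. This is a hypothetical sketch; the thresholds match the tiers above, but the workflow descriptions are illustrative, not an actual WitFoo policy.

```python
# Hypothetical contingency tiers keyed on the observed value-to-cost
# (v:c) ratio. Workflow descriptions are illustrative placeholders.
CONTINGENCY_TIERS = [
    (200, "full AI-assisted development, no restrictions"),
    (100, "AI for code generation; route review work back to humans"),
    (50, "AI for boilerplate only; token-optimization pipeline mandatory"),
    (10, "critical-path tasks only; shift bulk work to in-house models"),
]

def plan_for(vc_ratio):
    """Return the plan for the highest tier the current v:c ratio clears."""
    for threshold, plan in CONTINGENCY_TIERS:
        if vc_ratio >= threshold:
            return plan
    # Below every tier: the economics no longer justify the workflow.
    return "suspend AI-assisted workflows pending efficiency gains"
```

The point isn't the specific cutoffs; it's that the decision is written down before the ratio regresses, so the conversation happens once instead of in a panic.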
Those are not fun conversations. But they're necessary ones.
Wrap Up
I am genuinely grateful we did as much work as we could reasonably do inside a 1,000:1 window. We knew, intellectually, that it wouldn't last forever. Knowing it and feeling it close are two different experiences.
The window isn't shut. Not yet. But it's moving.
If you're a builder still sitting on the sidelines waiting for the right moment to go deep on AI-assisted development, I'd gently suggest that the right moment may be closer to right now than you think. The economics that made this year feel almost illegal in their generosity are not guaranteed to be there next year.
Build while the building is cheap. The price is going up.