On token usage

I’ve been catching up on AI engineering talks from all the conferences that are happening as we speak. Thanks, by the way, for posting all the talks online for free. Last year we collectively went from chatting with LLMs to running semi-autonomous agents. This year the push seems to be “let your agent loose”.
Claude is pushing for looped multi-agent environments. They’ve also introduced a configurable bypass of human permissions - which is actually a great idea. Up until now we’ve had two modes - either lots of interruptions while waiting for a human to approve yet another ls command, or the total anarchy of an agent free to run whatever it wants.
Autonomous multi-agent setups will come at a cost, though. Efficiency and outcomes are not guaranteed; what is certain is increased token consumption. This is thinly veiled, and the implied counter-argument is that agents will work so much more efficiently and will build you stuff no human would. Considering we are all professionals in this room, the bill for the tokens is footed by our employers, most likely for-profit companies. And when a bill keeps growing, a company will sooner or later start asking whether the investment is actually paying off.
This debate is not new. Back when computers were a novelty, you had to justify to your employer that this shiny new and very expensive machine would indeed make you so much more productive than your calculator. Or a typewriter, depending on your role. And before that you had to justify the purchase of an electronic calculator, and explain how it was an improvement over a slide rule. Remember those?
Token budgets feel similar. You spend money, and get work done in return. Bigger token budgets = faster deliveries, crushing your competition, dominating the world. At least, in theory.
What’s different is the spending model. Upgrading from a calculator to a new IBM PC was expensive, granted, but it was largely a one-time cost. Unlike IBM mainframes, which required constant maintenance, PCs were reliable and failed in a modular way: rarely would a PC go up in smoke, and replacing a faulty part was very straightforward. So yeah, mostly a one-time cost.
With AI, though, it is a subscription. The moment you stop paying for tokens, everything stops. In a hypothetical scenario where you’ve built a highly complex product all by yourself, you are now also a hostage to AI-driven expertise. Token prices have already started increasing, and they will not be coming down any time soon.
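To make the one-time versus subscription contrast concrete, here is a minimal back-of-the-envelope sketch. Everything in it is an assumption for illustration: the function names are made up, and the prices, upkeep, and growth rate are invented numbers, not real PC or token pricing.

```python
from typing import Optional


def cumulative_one_time(purchase_price: float, monthly_upkeep: float, months: int) -> float:
    """Total spend on a one-time purchase plus a small recurring upkeep."""
    return purchase_price + monthly_upkeep * months


def cumulative_subscription(monthly_spend: float, monthly_growth: float, months: int) -> float:
    """Total spend on a subscription whose monthly bill grows by a fixed rate."""
    total = 0.0
    spend = monthly_spend
    for _ in range(months):
        total += spend
        spend *= 1 + monthly_growth  # the bill for next month
    return total


def break_even_month(
    purchase_price: float,
    monthly_upkeep: float,
    monthly_spend: float,
    monthly_growth: float,
    horizon: int = 120,
) -> Optional[int]:
    """First month in which the subscription has cost more than the purchase.

    Returns None if that never happens within the horizon.
    """
    for month in range(1, horizon + 1):
        subscription = cumulative_subscription(monthly_spend, monthly_growth, month)
        one_time = cumulative_one_time(purchase_price, monthly_upkeep, month)
        if subscription > one_time:
            return month
    return None


if __name__ == "__main__":
    # Hypothetical numbers: a 3000 machine with 10/month upkeep
    # versus a 500/month token bill growing 2% per month.
    print(break_even_month(3000, 10, 500, 0.02))
    print(cumulative_subscription(500, 0.02, 36))
```

The point is not the specific numbers but the shape: the one-time cost flattens out, while the subscription keeps accumulating for as long as you keep working.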
This reminds me of the days of dial-up internet. We paid by the minute to access the Web. And before that you’d pay for remote access to big mainframes during a dedicated time slot.
The big bet by AI providers today is that the usefulness of AI will outweigh the token cost, both short and long term. Best-case scenario: you don’t need humans writing code or tinkering with systems. You formulate a task, and AI manifests it into reality at an affordable cost. Worst-case scenario: we migrate knowledge to the autonomous AI agents, token cost goes up, human expertise generally degrades, and then AI goes rogue.
There’s probably a middle road too. Meanwhile, efficient token usage and limited budgets will likely become a thing. Hopefully there will be more conference talks on this topic.
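As a rough idea of what a limited budget could look like in practice, here is a minimal sketch of a token budget guard. It is purely hypothetical: `TokenBudget` and `BudgetExceeded` are names invented for this post, not part of any agent framework or provider SDK, and the token counts would have to come from whatever usage figures your provider reports.

```python
class BudgetExceeded(Exception):
    """Raised when a charge would push spending past the budget."""


class TokenBudget:
    """Tracks token spend against a fixed limit (hypothetical helper)."""

    def __init__(self, limit: int) -> None:
        if limit < 0:
            raise ValueError("limit must be non-negative")
        self.limit = limit
        self.used = 0

    @property
    def remaining(self) -> int:
        return self.limit - self.used

    def charge(self, tokens: int) -> None:
        """Record tokens spent, refusing any charge that exceeds the limit."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.used + tokens > self.limit:
            raise BudgetExceeded(
                f"charge of {tokens} exceeds remaining budget of {self.remaining}"
            )
        self.used += tokens


if __name__ == "__main__":
    budget = TokenBudget(limit=1000)
    budget.charge(400)  # e.g. the cost of one agent step
    print(budget.remaining)
    try:
        budget.charge(700)  # this step would blow the budget
    except BudgetExceeded as err:
        print(f"stopping the agent: {err}")
```

An agent loop would call `charge` after every step and stop, or ask a human, once the exception fires, which sits somewhere between constant interruptions and complete freedom.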