Towards AI blog

Beyond the Prompt: Why Autonomous AI Agents Are Replacing the Chatbot

Monday, June 8, 2026, by Suchit Majumdar
Last updated on June 8, 2026 by Editorial Team. Author: Suchit Majumdar. Originally published on Towards AI.

In May 2025, Sebastian Siemiatkowski, the same Klarna CEO who fifteen months earlier had told the world that one OpenAI-powered assistant was doing the work of 700 customer service agents, quietly started hiring humans back. Bloomberg got the quote: “Cost unfortunately seems to have been a too predominant evaluation factor, what you end up having is lower quality.” Headcount, meanwhile, went from 5,527 at the end of 2022 to 3,422 at the end of 2024, per the S-1 Klarna filed in November. The chatbot stayed. The “all-AI customer service” story did not.

So the title of this piece is half a lie, and I want to correct it before you read another paragraph. Chatbots are not, in any general sense, being replaced by autonomous agents in 2026. The replacement is happening in one specific place: queue-shaped back-office work where no human is waiting on the other end, and almost nowhere else.

That narrow claim is the thesis. The broad version is what every vendor deck says, and it is wrong. If you walked out of your last AI strategy review thinking the agent wave is about to subsume your support org, your sales org, and your engineering org all at once, you are about to spend the next four quarters defending a budget against numbers that will not arrive.

That is the claim. The rest is me showing my work.

Klarna is evidence for the thesis, in reverse

The 2024 Klarna press release is worth re-reading with an engineer’s eye. 2.3 million conversations in month one across 35 languages. Resolution time from 11 minutes down to 2. A CSAT of 4.4 against a human baseline of 4.2, Klarna’s own number, never independently audited. OpenAI mirrored the case study on its own site. It was the most widely cited “AI replaced humans” deployment of the LLM era.

It was also a chatbot. Not an agent.
A user-initiated, real-time, conversational interface with safety rails and a handoff-to-human button. Gergely Orosz pointed this out at the time in his Pragmatic Engineer breakdown: what Klarna had actually built was tier-one (L1) support automation, the kind of containment work IVR systems were doing twenty years ago, except now in natural language. The bot was a filter that escalated anything sharp.

Then it broke on the seams chatbots always break on. The May 2025 reporting from CX Dive and CNBC converges on a single picture. Hallucinations clustered on edge cases. CSAT cratered on emotional tickets where the bot was technically correct but tonally wrong, because being right and being heard are different jobs. Compliance teams refused to let an LLM autonomously close accounts. So Klarna kept the bot for volume and rebuilt the human layer underneath it, “Uber-style,” remote and flexible, hiring students and rural workers as on-demand specialists.

Read that as a bull case for chatbots if you want. I read it as a warning about the entire customer-facing slice. The most aggressive chatbot deployment in the world, with founder-level air cover and a workforce reduction of roughly 2,100 people, still bounced off the part of the work where a customer was on the line and cared about being there. That isn’t a story about agents replacing chatbots. It’s a story about customer-facing conversation being a category that resists full automation by either shape of system.

[Figure: The spine of the argument: the meaningful axis isn’t conversational versus autonomous, it’s who triggers the work. Source: builder spec compiled from Klarna S-1, Intercom Fin published metrics, Lemonade 10-K (Q4 2024).]

Where the chatbot still wins, and it isn’t close

Intercom Fin is the cleanest counter to the “agents will eat customer support” narrative. Self-reported resolution rate of 67% globally as of late 2025, on 40 million cumulative conversations, across more than 10,000 business accounts.
Priced at $0.99 per resolved conversation. Intercom claims the human-agent comparison is $5 to $10 per query, and I’ll flag that as a vendor-published number, not an audit. But Teneo’s 2025 cost analysis lands in roughly the same range ($8–$15 per fully loaded human resolution), so the order of magnitude is real even if Intercom is choosing the friendly end.

The caveats matter. “Resolution” is defined by Intercom: the customer exits, or affirms satisfaction, after Fin’s last answer. No public study correlates that signal with actual customer satisfaction. And the variance across accounts is enormous. One Intercom community thread in late 2025 had a customer reporting a 27.6% resolution rate next to another at 80.1% over the same 12-week window, with the high performers being the ones who spent two to four weeks cleaning their knowledge base before launch. The published 67% is a marketing mean sitting on a long, ugly tail.

But the unit economics survive every caveat. This is a working chatbot business, at scale, on user-initiated conversational work, with no agent loop in sight. If your Q3 roadmap involves wrapping Fin in a LangGraph orchestrator and rebranding it an “agentic support platform,” the question I would ask in your planning meeting is whether the additional value per resolution clears the additional token cost per resolution, because the LeanOps numbers I’ll get to below say it usually doesn’t.

There’s also the Air Canada precedent from February 2024, when the BC Civil Resolution Tribunal held the airline liable for its chatbot’s incorrect bereavement-fare advice. The damages were small, roughly $650 CAD. The precedent is not. Any system, conversational or autonomous, that makes binding statements to a customer creates legal exposure, which is one more structural reason the production migration is happening where no customer sits on the other end of the conversation at all.
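To make the Fin unit economics above concrete, here is a minimal back-of-envelope sketch. It is not Intercom's billing model. It assumes the $0.99 fee applies only to resolved conversations and that every unresolved conversation goes to a human at the full per-query cost. The function names and that handoff assumption are hypothetical; the rates and prices are the ones quoted above.

```python
# Back-of-envelope sketch, not Intercom's actual billing logic.
# Assumptions (hypothetical): the bot fee is charged only on resolved
# conversations, and every unresolved one costs the full human rate.

FIN_PRICE_PER_RESOLUTION = 0.99  # USD, Intercom's published price


def blended_cost_per_conversation(resolution_rate, human_cost,
                                  bot_price=FIN_PRICE_PER_RESOLUTION):
    """Average cost per inbound conversation at a given resolution rate."""
    if not 0.0 <= resolution_rate <= 1.0:
        raise ValueError("resolution_rate must be between 0 and 1")
    bot_share = resolution_rate * bot_price
    human_share = (1.0 - resolution_rate) * human_cost
    return bot_share + human_share


def savings_fraction(resolution_rate, human_cost,
                     bot_price=FIN_PRICE_PER_RESOLUTION):
    """Fraction of the all-human cost saved by putting the bot in front."""
    blended = blended_cost_per_conversation(resolution_rate, human_cost, bot_price)
    return 1.0 - blended / human_cost


if __name__ == "__main__":
    # Resolution rates quoted in the article: the low and high accounts
    # from the community thread, and the published global figure.
    rates = [("low account", 0.276), ("published", 0.67), ("high account", 0.801)]
    for label, rate in rates:
        for human_cost in (5.0, 10.0):  # Intercom's claimed human range, USD
            cost = blended_cost_per_conversation(rate, human_cost)
            saved = savings_fraction(rate, human_cost)
            print(f"{label:>12} rate={rate:.1%} human=${human_cost:.2f} "
                  f"blended=${cost:.2f} saved={saved:.0%}")
```

Under these assumptions, the published 67% rate against a $5 human cost gives a blended cost of about $2.31 per conversation, while the 27.6% account sits near $3.89. The spread between accounts moves the savings far more than the choice of human-cost estimate does, which is the point of the long-tail caveat.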
What actually has to be true for an agent to pay for itself

Strip away the framework news cycle. OpenAI Agents SDK in March 2025. Google ADK in April. LangGraph 1.0 in October. Anthropic computer use […]