Hello Reader,

Every CEO in the last two years has stood on a stage or gotten on an earnings call and said some version of the same thing: "AI is going to transform our operations. We are reducing headcount because AI will handle it."

Two stories broke this week that every Solutions Architect and cloud professional needs to understand.

Starbucks quietly shut down their AI inventory system after nine months. Deleted the blog post announcing it. No press release. Just an internal memo: go back to counting by hand. The reason they gave was accuracy errors.

And Microsoft cancelled Claude Code licenses for their Experiences and Devices division, effective June 30, 2026, barely six months after the pilot launched.

Two different companies. Same underlying story.

The official reasons are half-truths.

Starbucks blamed accuracy. But we already knew this. Gen AI is non-deterministic; it cannot produce answers with 100% accuracy, and nothing has changed there. And enterprise software makes mistakes all the time. You fix it, test, deploy. Companies do not spend nine months deploying something across 11,000 locations and then walk away over accuracy issues alone. There is a cost story buried here that nobody is saying out loud.

The Microsoft story is even more revealing because the cost numbers are visible. Claude Code launched inside Microsoft's Experiences and Devices group in December 2025. Six months later, the experiment ended. The tool was not cancelled because engineers disliked it. It was cancelled because they used it too much. Token-based pricing compounds fast: for an individual developer the numbers are manageable, but across thousands of engineers using the tool daily they accumulate into a line item that surprises finance teams. For comparison, Uber exhausted its $3.4 billion 2026 AI budget in four months, with 5,000 engineers deployed and per-engineer costs running between $500 and $2,000 per month.
The ROI math at scale is brutal, and most organizations are not prepared for it.

CEOs painted themselves into a corner.

So why are the CEOs not coming out in droves and saying the same thing? And why did the Starbucks CEO not cite cost? Here is the trap they are in now: they told Wall Street, they told their boards, they told the public that AI was replacing workers. They used it to justify layoffs and restructuring. You cannot now issue a press release saying "actually, the AI costs more than the people did." That press release does not get written. So instead you say it made errors. You cite tool consolidation.

This is a pattern, not isolated incidents.

McDonald's killed their IBM drive-thru AI. Taco Bell slowed their rollout. A Pizza Hut franchisee reported their AI ordering system cost them significant sales. These are not random failures playing out in parallel. They are the same math problem hitting different organizations at different times.

Here is the nuance that actually matters. You cannot use Gen AI everywhere. You need to evaluate whether Gen AI is the correct solution for each use case.

What this means for your work right now:

When a stakeholder points to the Starbucks story and asks you whether your team's AI initiative will suffer the same fate, you need a crisp answer. Start by identifying whether your use case can be handled by deterministic, traditional workflows or really requires Gen AI. Then build an actual cost model before the pilot, not after. Token-based billing at scale needs to be forecasted the same way compute costs are forecasted. If your organization cannot predict what 5,000 engineers will spend on AI tooling in a month, you do not have a cost problem yet. You have a visibility problem that will become a cost problem.

The uncomfortable truth is that AI is genuinely useful and genuinely expensive, and right now a lot of companies are stuck between the narrative they sold to the market and the reality showing up in their operating costs.
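
To make the "cost model before the pilot" advice concrete, here is a minimal Python sketch of a token-spend forecast. Treat it as a starting point under stated assumptions, not a finished model: the class and function names, the daily token volumes, and the per-million-token prices are all hypothetical placeholders, not vendor quotes or figures from the stories above. Only the 5,000 engineers and the $500 to $2,000 per-engineer monthly range come from this newsletter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageAssumptions:
    """Per-engineer usage guesses. Hypothetical: replace with pilot telemetry."""
    input_tokens_per_day: int
    output_tokens_per_day: int
    workdays_per_month: int = 21


@dataclass(frozen=True)
class Pricing:
    """Hypothetical prices in USD per one million tokens, not a vendor quote."""
    input_per_million: float
    output_per_million: float


def monthly_cost_per_engineer(usage: UsageAssumptions, pricing: Pricing) -> float:
    """Forecast one engineer's monthly token bill from daily usage and prices."""
    daily_cost = (
        usage.input_tokens_per_day / 1_000_000 * pricing.input_per_million
        + usage.output_tokens_per_day / 1_000_000 * pricing.output_per_million
    )
    return daily_cost * usage.workdays_per_month


def monthly_cost_for_team(engineers: int, usage: UsageAssumptions, pricing: Pricing) -> float:
    """Scale the per-engineer forecast to the whole team."""
    return engineers * monthly_cost_per_engineer(usage, pricing)


def team_range(engineers: int, low_per_engineer: float, high_per_engineer: float) -> tuple[float, float]:
    """Turn a per-engineer monthly cost range into a team-level monthly range."""
    return engineers * low_per_engineer, engineers * high_per_engineer


if __name__ == "__main__":
    # Placeholder usage and prices, chosen only to exercise the functions.
    usage = UsageAssumptions(input_tokens_per_day=2_000_000, output_tokens_per_day=200_000)
    pricing = Pricing(input_per_million=3.0, output_per_million=15.0)

    per_engineer = monthly_cost_per_engineer(usage, pricing)
    team = monthly_cost_for_team(5_000, usage, pricing)
    print(f"Forecast per engineer per month: ${per_engineer:,.2f}")
    print(f"Forecast for 5,000 engineers per month: ${team:,.2f}")

    # Per-engineer range quoted in the newsletter, scaled to 5,000 engineers.
    low, high = team_range(5_000, 500, 2_000)
    print(f"Team range per month: ${low:,.0f} to ${high:,.0f}")
```

With the range quoted above, 5,000 engineers at $500 to $2,000 each works out to $2.5 million to $10 million per month. The point of the sketch is the shape of the forecast: swap in measured token volumes from a small pilot and your contract prices, and you get a number finance can review before the rollout instead of after the bill.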
Companies are not going to admit that publicly. So they will blame accuracy, blame the vendor, quietly roll things back, and then perhaps hire more people. Job data shows that the number of open jobs was at its highest last month (April). I predict that the trend will continue.

Your job as a practitioner is to see through that. Understand what AI actually costs to operate at scale. Understand where it works and where the economics do not support it. The people who figure that out are the ones building things that last and the ones companies trust to lead the next initiative.

To get ahead, learn Gen AI cost optimization. Not the basic techniques, but the intermediate and advanced ones. I will share these in the next edition. Stay tuned.

Keep learning and keep rocking 🚀,
Raj

P.S. If you have found this newsletter helpful and want to support me 🙏:

Check out my bestselling courses on AWS, System Design, Kubernetes, DevOps, and more: Max discounted links
Check out my YouTube channel for Cloud and Gen AI tutorials and interview prep videos: Here
AWS SA Bootcamp with Live Classes, Mock Interviews, Hands-On, Resume Improvement and more: https://www.sabootcamp.com/