&lt;?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Lakshmi Narasimhan</title><link>https://lakshminp.com/</link><description>I help developers build, deploy, and distribute their SaaS without hiring a team. Long-running notes on systems, AI internals, Carnatic music, fiction craft, and whatever else collides interestingly.</description><generator>Hugo + lakshminp theme</generator><language>en-us</language><lastBuildDate>Fri, 24 Apr 2026 00:00:00 +0000</lastBuildDate><managingEditor>Lakshmi Narasimhan</managingEditor><webMaster>Lakshmi Narasimhan</webMaster><copyright>© 2026 Lakshmi Narasimhan</copyright><atom:link href="https://lakshminp.com/feed.xml" rel="self" type="application/rss+xml"/><item><title>Your Cloud Bill Is A Tax On Someone Else's Resume</title><link>https://lakshminp.com/2026/04/cloud-bill-resume-tax/</link><pubDate>Fri, 24 Apr 2026 00:00:00 +0000</pubDate><author>Lakshmi Narasimhan</author><guid isPermaLink="true">https://lakshminp.com/2026/04/cloud-bill-resume-tax/</guid><category>essays</category><category>kubernetes</category><description>There’s an insurance company somewhere — real, working, profitable — with 100,000 monthly users and a peak concurrent load of about 5,000.
They spend high six figures a month on Kubernetes.
They employ twenty people to keep it running.
This story surfaced this week in the Hacker News thread on David Crawshaw’s cloud essay, and the comments section turned into a confessional. Engineer after engineer describing the same pattern: cluster adopted, cluster “optimized,” cloud spend doubled, incidents doubled, and somehow the only thing anyone can agree on is that they need to hire a platform engineer.</description><content:encoded>&lt;![CDATA[<p>There’s an insurance company somewhere — real, working, profitable — with 100,000 monthly users and a peak concurrent load of about 5,000.</p><p>They spend high six figures a month on Kubernetes.</p><p>They employ twenty people to keep it running.</p><p>This story surfaced this week in the Hacker News thread on David Crawshaw’s cloud essay, and the comments section turned into a confessional. Engineer after engineer describing the same pattern: cluster adopted, cluster “optimized,” cloud spend doubled, incidents doubled, and somehow the only thing anyone can agree on is that they need to hire a platform engineer.</p><p>You don’t. You never did. Your entire application would run on a laptop.</p><h2 id="the-incentive-nobody-likes-to-say-out-loud"><strong>The incentive nobody likes to say out loud</strong></h2><p>Here’s the quiet part: your DevOps team does not choose infrastructure based on what your application needs.</p><p>They choose it based on what their next job will pay for.</p><p>Kubernetes on a resume is worth more than Docker Compose on a resume. Terraform on a resume is worth more than “I SSH’d into the box.” Managed EKS on a resume is worth more than “I run a VM.” Every procurement decision in a modern engineering org is being made by someone who, at some level, is also writing the next page of their LinkedIn.</p><p>And management, god bless them, trusts the sales and marketing departments of Datadog and AWS and HashiCorp more than they trust their own engineers. 
So when someone internally says “we could do this on one server,” and someone externally sends a deck titled <em>Scaling Your Platform For The Future</em>, guess which one wins the meeting.</p><p>The decision was never technical. You just paid the technical price for it.</p><h2 id="kubernetes-is-not-the-villain-the-scale-is"><strong>Kubernetes is not the villain. The scale is.</strong></h2><p>Let’s be precise, because “Kubernetes” is doing a lot of work in this essay.</p><p>Full enterprise Kubernetes — managed control planes, service meshes, operators for everything, a dedicated platform team, Helm charts nested inside Helm charts like Russian dolls of YAML — that thing was built for Google’s problem. Multi-tenant, multi-region, thousands of services, teams that don’t talk to each other.</p><p>If your org does not look like that, you are wearing a costume.</p><p>K3s on a single VPS is not the same animal. Docker Compose on a single VPS is not the same animal. Kamal shipping containers to one Debian box is not the same animal. Those are orchestration for people who want one sane way to deploy a container, not a career in platform engineering.</p><p>The HN thread is full of <a href="https://blog.lakshminp.com/p/kubernetes-indie-dev-alternative" rel="external nofollow noopener" class="lnp-link">engineers who moved from full K8s to one of these simpler setups<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a>. The reports are boringly consistent: costs collapsed, incidents dropped, debugging became possible again. Nobody was shocked. Everyone had been waiting for permission to say it.</p><h2 id="the-solo-founders-version-of-this-trap"><strong>The solo founder’s version of this trap</strong></h2><p>You are not the insurance company. You do not have twenty people. You have you, and maybe a contractor, and a credit card that is getting nervous.</p><p>And yet — you will read the AWS Well-Architected Framework. 
You will follow a tutorial that starts with “first, let’s set up your VPC.” You will pay $80/month for a managed database to store 200 rows. You will provision a load balancer in front of one server. You will copy the shape of infrastructure you saw at your day job, because that shape felt legitimate, and you want to feel legitimate too.</p><p>This is how solo founders end up with a <a href="https://blog.lakshminp.com/p/aws-is-overrated" rel="external nofollow noopener" class="lnp-link">$600/month AWS bill for an app that has six users<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a>.</p><p>The shape of legitimacy is the trap. Nobody cares what your infrastructure looks like until you have customers, and once you have customers, <a href="https://blog.lakshminp.com/p/30-dollar-saas-stack" rel="external nofollow noopener" class="lnp-link">“my app runs on one $12 VPS”<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a> is a story people <em>love</em>. It’s the opposite of suspicious. It’s proof that the thing works.</p><h2 id="what-to-actually-do"><strong>What to actually do</strong></h2><ol><li><p><strong>One machine until you can’t.</strong> One VPS. One Postgres on that VPS. One reverse proxy. Docker Compose or Kamal to deploy. You are allowed to stop here for years.</p></li><li><p><strong>Scale vertically first.</strong> Hetzner will rent you a 48-core EPYC machine with 256 GB of RAM for €199/month. A mid-tier managed Kubernetes cluster on AWS starts at more than that before you’ve run a single pod. 
Most apps die from bad unit economics, not from running out of CPU.</p></li><li><p><strong>When you outgrow that — and you might not — <a href="https://blog.lakshminp.com/p/when-diy-beats-managed-kubernetes" rel="external nofollow noopener" class="lnp-link">K3s on a few boxes gives you orchestration without the org chart<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a>.</strong> This is the actual sweet spot for a solo operator who needs more than one machine but less than a platform team.</p></li><li><p><strong>Treat every infrastructure recommendation as a resume artifact until proven otherwise.</strong> Ask who benefits if you adopt this. If the answer is “the person telling me to adopt it,” weigh accordingly.</p></li><li><p><strong>Your cloud bill is a leading indicator of how much time you are spending on things that do not make your product better.</strong> Watch it like you watch your weight.</p></li></ol><p>The cloud was supposed to be leverage. For most people, most of the time, it has become the opposite: a recurring invoice for someone else’s credibility.</p><p>You are allowed to just run the server.</p>
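<p><em>Postscript: to make “one machine until you can’t” concrete, the whole stack from item 1 fits in one Compose file. A minimal sketch — service names, image tags, ports, and credentials are illustrative placeholders, not a prescription:</em></p>

```yaml
# One VPS: app + Postgres + reverse proxy. Everything here is a
# placeholder sketch — swap in your own app image and Caddyfile.
services:
  app:
    build: .
    environment:
      DATABASE_URL: postgres://app:change-me@db:5432/app
    depends_on:
      - db
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: change-me
      POSTGRES_DB: app
    volumes:
      - pgdata:/var/lib/postgresql/data
  proxy:
    image: caddy:2
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile:ro
volumes:
  pgdata:
```

<p><em>One <code>docker compose up -d</code> on the box and deployment is done, for years if need be.</em></p>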
]]></content:encoded></item><item><title>The Real SaaS Moat AI Can't Replicate</title><link>https://lakshminp.com/2026/02/ai-saas-moat-exposed/</link><pubDate>Mon, 09 Feb 2026 00:00:00 +0000</pubDate><author>Lakshmi Narasimhan</author><guid isPermaLink="true">https://lakshminp.com/2026/02/ai-saas-moat-exposed/</guid><category>essays</category><category>saas</category><description>There’s a comment buried 14 levels deep in this Hacker News thread about AI killing B2B SaaS ↗. It has 37 upvotes and it’s the smartest thing I’ve read this year.
Here it is, paraphrased: “The real innovation of SaaS was laundering inaccessible open-source software into a format that doesn’t require transiting git. The hard part was never the code. The hard part was that git sucks.”
I laughed. Then I stopped laughing because it’s devastatingly correct.</description><content:encoded>&lt;![CDATA[<p>There’s a comment buried 14 levels deep in <a href="https://news.ycombinator.com/item?id=46888441" rel="external nofollow noopener" class="lnp-link">this Hacker News thread about AI killing B2B SaaS<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a>. It has 37 upvotes and it’s the smartest thing I’ve read this year.</p><p>Here it is, paraphrased: “The real innovation of SaaS was laundering inaccessible open-source software into a format that doesn’t require transiting git. The hard part was never the code. The hard part was that git sucks.”</p><p>I laughed. Then I stopped laughing because it’s devastatingly correct.</p><h2 id="the-git-laundering-machine"><strong>The Git Laundering Machine</strong></h2><p>Think about the most profitable SaaS businesses in technology. Seriously, list them.</p><p>AWS? That’s Linux, KVM, and Xen behind a billing dashboard. Heroku was git-push-to-deploy because deploying was too hard. Vercel is the same thing for Next.js. MongoDB Atlas is MongoDB without the ops. Redis Cloud is Redis without the YAML. Supabase is Postgres without the DBA.</p><p>Every single one of them is a factory that converts something freely available on GitHub into something you can pay for on a website.</p><p>The commenter was right. These companies didn’t build moats with proprietary technology. They built moats by standing between users and git. Their value proposition, stripped to the studs, is: “You don’t have to clone a repo.”</p><p>That’s a $500 billion industry built on the fact that <code>git clone</code> is scary.</p><h2 id="llms-just-killed-the-middleman"><strong>LLMs Just Killed the Middleman</strong></h2><p>Here’s where the “AI is killing SaaS” thesis gets real.</p><p>When a CTO says “can we build this internally?”, the old answer was: “Technically yes, but you’d need 3 engineers, 6 months, and ongoing maintenance. 
Just buy the SaaS.”</p><p>The new answer: “ChatGPT set it up in 20 minutes. It reads from the same open-source code the SaaS vendor uses. It runs on our infrastructure. There’s no monthly bill.”</p><p>LLMs do exactly what SaaS companies do — they take inaccessible open-source software and make it usable by normal humans. They just skip the subscription.</p><p>The git laundering machine now has competition. And the competitor works for free.</p><h2 id="what-actually-survives"><strong>What Actually Survives</strong></h2><p>So is B2B SaaS dead? No. But the moat map just got redrawn.</p><p>Here’s what doesn’t survive: <strong>any SaaS whose primary value is “we set it up so you don’t have to.”</strong> Deployment wrappers, config GUIs, managed hosting for commodity databases — all of this is getting compressed.</p><p>An HN commenter who manages teams put it bluntly: “Management doesn’t want to be responsible for bespoke internal tools.” That’s real. But it’s a shrinking moat. Today’s management doesn’t want to be responsible. Tomorrow’s management grew up with ChatGPT and doesn’t see internal tooling as risky.</p><p>Here’s what survives:</p><p><strong>Data.</strong> If your SaaS accumulates proprietary data over time — customer behavior patterns, industry benchmarks, network effects — that’s a moat AI can’t replicate. A new LLM-generated tool starts with zero data. Your SaaS has three years of it.</p><p><strong>Compliance and trust.</strong> SOC 2, HIPAA, GDPR certification takes time and money. “ChatGPT built it” doesn’t pass an enterprise security audit. Yet.</p><p><strong>Workflow lock-in.</strong> Not the software itself, but the habits. Slack isn’t hard to replace technically. It’s hard to replace because your whole company’s muscle memory lives there.</p><p><strong>Network effects.</strong> Figma isn’t valuable because of the rendering engine. It’s valuable because your designers, developers, and product managers are all in the same file. 
That’s a moat no amount of vibe coding can replicate.</p><p><strong>The specification itself.</strong> Here’s the contrarian take within the contrarian take: as code becomes commodity, the spec becomes the product. The companies that survive aren’t the ones that write the best code. They’re the ones that understand the problem deeply enough to specify what “right” looks like. Everyone else is just a GPT wrapper with a landing page.</p><h2 id="the-indie-saas-playbook-changes"><strong>The Indie SaaS Playbook Changes</strong></h2><p>If you’re building SaaS solo — and if you’re reading this newsletter, you probably are — the implications are brutal and clear.</p><p><em>Full disclosure: I built a product that does exactly this. <a href="https://supabyoi.com/" rel="external nofollow noopener" class="lnp-link">Supabyoi<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a> deploys Supabase for you. By my own thesis, that’s a shrinking moat. I’m writing this post partly because I’m living the question: evolve or get compressed.</em></p><p><strong>Stop building tools. Start building data flywheels.</strong></p><p>A CRUD app with a nice UI is now a weekend project for anyone with ChatGPT. A system that gets smarter with every user interaction is still a real business.</p><p><strong>Stop selling setup. Start selling ongoing value.</strong></p><p>“We deploy Postgres for you” is dying. “We analyze your Postgres performance patterns across 10,000 databases and tell you what’s about to break” is thriving.</p><p><strong>Stop competing on features. Start competing on understanding.</strong></p><p>The SaaS products that survive AI commodification will be the ones that understand their customers’ problems better than a general-purpose LLM ever could. Domain expertise is the last moat.</p><h2 id="the-500-billion-question"><strong>The $500 Billion Question</strong></h2><p>The HN thread devolved into the usual “AI is overhyped” vs. “AI changes everything” tribal warfare. 
But that one comment, buried 14 levels deep, cut through all of it.</p><p>The SaaS moat was never the software. It was the fact that software was hard to access. That moat is evaporating.</p><p>What’s left is data, trust, network effects, and deep domain understanding.</p><p>Build your SaaS around those. Or enjoy competing with a free chatbot.</p>
]]></content:encoded></item><item><title>Software Engineering Is Dead, or Is It?</title><link>https://lakshminp.com/2026/01/software-engineering-dead/</link><pubDate>Tue, 27 Jan 2026 00:00:00 +0000</pubDate><author>Lakshmi Narasimhan</author><guid isPermaLink="true">https://lakshminp.com/2026/01/software-engineering-dead/</guid><category>essays</category><category>craft</category><description>Everyone said agentic coding would kill software engineering discipline. Turns out it killed the wrong disciplines.
Clean code? Dead ↗. Nobody’s hand-crafting variable names when Claude generates 500 lines in 30 seconds. But TDD, specs-driven development, domain-driven design — the stuff we used to skip because it felt like ceremony? That’s the load-bearing wall now. Tear it out and the whole thing collapses.
TDD: The Cache That Wasn’t I had Claude Code build me a Redis caching module. Proper TTLs. Cache invalidation on writes. Unit tests passing. Beautiful, elegant, chef’s-kiss code.</description><content:encoded>&lt;![CDATA[<p>Everyone said agentic coding would kill software engineering discipline. Turns out it killed the <em>wrong</em> disciplines.</p><p>Clean code? <a href="https://lakshminp.substack.com/p/clean-code-is-dead-long-live-clean" rel="external nofollow noopener" class="lnp-link">Dead<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a>. Nobody’s hand-crafting variable names when Claude generates 500 lines in 30 seconds. But TDD, specs-driven development, domain-driven design — the stuff we used to skip because it felt like ceremony? That’s the load-bearing wall now. Tear it out and the whole thing collapses.</p><h2 id="tdd-the-cache-that-wasnt"><strong>TDD: The Cache That Wasn’t</strong></h2><p>I had Claude Code build me a Redis caching module. Proper TTLs. Cache invalidation on writes. Unit tests passing. Beautiful, elegant, chef’s-kiss code.</p><p>One problem. The actual query functions never called the caching layer.</p><p>Hundreds of requests later, I checked Redis. Empty. A pristine, untouched Redis instance, sitting there like a museum exhibit. <a href="https://lakshminp.substack.com/p/claude-code-is-incredible-it-also" rel="external nofollow noopener" class="lnp-link">I’ve written about these failure patterns before<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a> — this one hurt the most.</p><p>Integration tests would have caught it. But only if I’d written them <em>first</em>. That’s the part everyone skips — writing the verification before the implementation. TDD forces you to define “done” before the agent starts building. Without it, you get beautiful isolated components that nobody wired together.</p><p>This isn’t hypothetical. 
An r/programming thread (894 upvotes) nailed it: “We’re getting correct code, but not right code.” One reviewer found AI-generated Java using the default ForkJoinPool for I/O-bound tasks. Compiles fine. Passes unit tests. Catastrophic under load.</p><p>My favorite was the “chief architect” who generated “full coverage” unit tests with Copilot. Duplicate asserts. Unused service constructions. Tests that passed but tested nothing. A green CI pipeline that was essentially a participation trophy.</p><p>TDD isn’t ceremony anymore. It’s the spec your agent actually follows.</p><h2 id="specs-driven-development-the-authentication-amnesia"><strong>Specs-Driven Development: The Authentication Amnesia</strong></h2><p>I spent two weeks pair-programming authentication with Claude Code. We tracked race conditions together. Debated RS256 vs HS256. Built a shared understanding of every edge case.</p><p>Then compaction hit.</p><p>“Where did we leave off?”</p><p>“I don’t have information about previous sessions.”</p><p>Two weeks of context. Gone. My <code>TODO.md</code> became a graveyard of cryptic notes that made sense to exactly nobody, including me three days later. <a href="https://lakshminp.substack.com/p/why-your-ai-wakes-up-every-morning" rel="external nofollow noopener" class="lnp-link">I wrote the full horror story here<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a>.</p><p>So I started using a git-backed issue tracker with dependency graphs that persists across agent sessions. Sprints and epics stopped being PM ceremony and became the agent’s memory. The control plane for multi-session work.</p><p>The pattern scales beyond my personal disasters. An r/programming post titled “The era of AI slop cleanup has begun” (4,200 upvotes) described a freelancer who keeps getting hired to fix AI-generated codebases. 
“It mostly works, but does so terribly.” The missing ingredient every single time: no structured planning, no phased delivery. Just vibes and a prompt.</p><p>Fred Brooks said it decades ago, and r/ExperiencedDevs rediscovered it (1,400 upvotes): “Once requirements are fully expressed, their information content is fixed. You can change surface syntax, but you can’t compress semantics.”</p><p>You can’t skip the thinking. You can only skip writing it down — and then you pay for it later when your agent wakes up with amnesia.</p><h2 id="ddd-the-firewall-agents-cant-generate"><strong>DDD: The Firewall Agents Can’t Generate</strong></h2><p>Here’s a <a href="https://reddit.com/r/programming/comments/1nxobte/the_phantom_author_in_our_codebases_why/" rel="external nofollow noopener" class="lnp-link">Reddit thread<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a> that lives in my head rent-free. Someone described the “Phantom Author” problem — only domain experts catch the subtle flaws agents produce. The code compiles. The tests pass. The logic is plausible. But it’s <em>wrong</em> in ways only someone who understands the domain would notice.</p><p>The punchline: “Ironically the only people who should be using AI are people who are already experts.”</p><p>Bounded contexts — the core DDD concept — are the firewall. They tell the agent where one domain ends and another begins. Without that modeling, agents connect everything to everything. Your billing module knows about your notification preferences. Your auth layer has opinions about your recommendation engine.</p><p>Agents can’t generate domain boundaries because domain boundaries come from understanding the business, not the code. That’s your job. The agent’s job is everything inside the boundary.</p><h2 id="the-punchline"><strong>The Punchline</strong></h2><p>The disciplines that survived aren’t the ones that made code pretty. They’re the ones that tame complexity.</p><p>TDD tells the agent what “done” means. 
Specs give it memory across sessions. DDD gives it boundaries it can’t infer on its own.</p><p>We didn’t need less engineering discipline. We needed <em>different</em> engineering discipline. The ceremony is dead. The structure is mandatory.</p>
]]></content:encoded></item><item><title>The AI Productivity Paradox: Why I'm Working More Than Ever</title><link>https://lakshminp.com/2026/01/ai-productivity-paradox/</link><pubDate>Mon, 26 Jan 2026 00:00:00 +0000</pubDate><author>Lakshmi Narasimhan</author><guid isPermaLink="true">https://lakshminp.com/2026/01/ai-productivity-paradox/</guid><category>essays</category><category>craft</category><description>I had a conversation with a friend last week that I can’t stop thinking about.
We were comparing notes on hitting usage limits with AI coding tools. Both of us on expensive plans. Both of us running into ceilings more often than we did months ago. Both of us, apparently, turning into “power users” in our respective tiers.
And then he dropped this line: “So AI was supposed to make us work less but now we are working more. That’s the conclusion.”</description><content:encoded>&lt;![CDATA[<p>I had a conversation with a friend last week that I can’t stop thinking about.</p><p>We were comparing notes on hitting usage limits with AI coding tools. Both of us on expensive plans. Both of us running into ceilings more often than we did months ago. Both of us, apparently, turning into “power users” in our respective tiers.</p><p>And then he dropped this line: “So AI was supposed to make us work less but now we are working more. That’s the conclusion.”</p><p>I laughed. Then I stopped laughing.</p><p>Because he’s right. I get more done in a single day than I used to accomplish in a week. I’m shipping features, writing content, running experiments at a pace that would’ve been unthinkable about a year ago.</p><p>And I have never worked this much in my life.</p><p>Here’s what nobody warned us about: AI didn’t give us more time. It gave us more capability.</p><p>And capability, it turns out, is extremely addictive.</p><h2 id="the-collapse-of-activation-energy"><strong>The Collapse of Activation Energy</strong></h2><p>Before AI coding assistants, most ideas died a quiet death in my notes app. Not because they were bad ideas. Because the effort-to-value ratio was unfavorable.</p><p>“I could build that feature, but it would take a week of focused work. Is it worth a week? Probably not.”</p><p>Idea archived. Moving on.</p><p>Now that same feature takes a day. Sometimes less. So I build it.</p><p>Then I build the next thing. And the next. And suddenly I’m shipping more in a month than I used to ship in a quarter.</p><p>The activation energy for starting new work collapsed. And I filled every inch of the newly available space.</p><h2 id="ambition-scales-with-output"><strong>Ambition Scales With Output</strong></h2><p>Here’s the thing about humans: we don’t scope our ambitions in absolute terms. 
We scope them relative to what feels achievable.</p><p>Before AI, I planned projects based on what I could reasonably ship with my limited time and energy. A feature per week. Maybe two if I was focused.</p><p>Now “reasonable” means something entirely different. My mental model of what’s achievable expanded by 5x, and my project scope expanded right along with it.</p><p>I’m not doing the same work faster. I’m doing <em>more work</em>.</p><p>The goalposts moved. And I moved them myself.</p><h2 id="the-death-of-natural-stopping-points"><strong>The Death of Natural Stopping Points</strong></h2><p>There used to be friction in development work. Waiting for builds. Context switching costs. The mental load of holding an entire system in your head while debugging.</p><p>That friction was annoying. It was also a circuit breaker.</p><p>It forced breaks. It created natural pauses where you’d step away, get coffee, maybe realize it was 7pm and you should probably eat dinner.</p><p>AI removed the friction. Which sounds great until you realize the friction was also your automatic brake pedal.</p><p>Now you can go from idea to implementation to deployment without ever hitting a natural stopping point. The only thing that stops you is your own willpower.</p><p>My willpower, for the record, is not great.</p><h2 id="the-dopamine-loop-of-shipping"><strong>The Dopamine Loop of Shipping</strong></h2><p>Here’s an uncomfortable comparison: AI-assisted coding feels a lot like infinite scroll.</p><p>You ship something. It feels good. The tool makes shipping fast and easy. So you ship something else. That also feels good. And there’s always one more thing you could ship.</p><p>Same psychological mechanics. Different output.</p><p>Except instead of consuming content, you’re producing it. Which feels more virtuous. Which makes it even harder to stop.</p><p>“I’m not doomscrolling. 
I’m being <em>productive</em>.”</p><p>Sure you are.</p><h2 id="the-why-not-threshold"><strong>The “Why Not” Threshold</strong></h2><p>The most insidious change is what happened to my internal cost-benefit calculator.</p><p>I used to ask: “Is this worth the effort?”</p><p>Now I ask: “Why wouldn’t I just do this?”</p><p>That experiment I would’ve skipped because setting it up was tedious? Now I run it. That edge case I would’ve ignored because fixing it properly would take half a day? Now I fix it.</p><p>The threshold for “worth my time” dropped to near zero. So everything is worth my time. So I do everything.</p><p>This is how you end up working 12-hour days while technically being more “efficient” than ever before.</p><h2 id="the-uncomfortable-truth"><strong>The Uncomfortable Truth</strong></h2><p>AI tools didn’t give us more free time. They gave us more output capacity. And we’re psychologically incapable of leaving capacity unused. At least I am.</p><p>The work expanded to fill the available capability. Parkinson’s Law, but in reverse.</p><p>We’re not working less. We’re shipping more while <em>feeling</em> productive. Which is a different thing entirely.</p><p>My friend was right to put “off” in scare quotes when wishing me a good weekend. We both knew I wasn’t really taking time off. I was just switching to a different kind of work.</p><h2 id="what-now"><strong>What Now?</strong></h2><p>I don’t have a tidy solution here. I’m not going to pretend I’ve figured out work-life balance in the age of AI assistants.</p><p>But I’ve started noticing when I’m filling capacity just because I can. When I’m starting a new feature not because it matters, but because the activation energy is so low that “why not” won the argument.</p><p>Sometimes the answer to “why not” is: because you could just&hellip; not.</p><p>Groundbreaking insight, I realize.</p><p>The AI isn’t going to set boundaries for you. 
If anything, hitting usage limits might be the only forced break some of us get. Which is both sad and a little funny.</p><p>Maybe the real productivity hack is learning to leave capability on the table.</p><p>I’ll let you know how that goes. Right after I ship this one more thing.</p><hr><p><em>I write about building and deploying software as a solo developer. If you’re trying to do it all yourself without hiring a team, I’m probably making the same mistakes you are.</em></p>
]]></content:encoded></item><item><title>I Built 2 SaaS Products Vibe Coding. Here's the System That Made It Work.</title><link>https://lakshminp.com/2026/01/vibe-coding-2-saas-products/</link><pubDate>Sat, 24 Jan 2026 00:00:00 +0000</pubDate><author>Lakshmi Narasimhan</author><guid isPermaLink="true">https://lakshminp.com/2026/01/vibe-coding-2-saas-products/</guid><category>essays</category><category>saas</category><category>ai-coding</category><description>Gene Kim and Steve Yegge’s Vibe Coding ↗ book says you’re the head chef now.
The metaphor runs through the whole thing: you’re not a line cook anymore, you’re orchestrating AI sous chefs, directing the kitchen, tasting every dish before it goes out. The developer-as-implementer era is over. Welcome to developer-as-orchestrator.
The Biryani Incident It’s a good metaphor. I buy it. But here’s the thing about being a head chef that the metaphor doesn’t quite capture: a head chef without mise en place is just a guy having a panic attack near hot surfaces.</description><content:encoded>&lt;![CDATA[<p>Gene Kim and Steve Yegge’s <a href="https://www.amazon.com/Vibe-Coding-Building-Production-Grade-Software/dp/1966280025" rel="external nofollow noopener" class="lnp-link">Vibe Coding<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a> book says you’re the head chef now.</p><p>The metaphor runs through the whole thing: you’re not a line cook anymore, you’re orchestrating AI sous chefs, directing the kitchen, tasting every dish before it goes out. The developer-as-implementer era is over. Welcome to developer-as-orchestrator.</p><h2 id="the-biryani-incident"><strong>The Biryani Incident</strong></h2><p>It’s a good metaphor. I buy it. But here’s the thing about being a head chef that the metaphor doesn’t quite capture: a head chef without mise en place is just a guy having a panic attack near hot surfaces.</p><p>I know this because I’ve been that guy. Literally.</p><p>My wife had to leave town for a few days. “I’ll handle dinner,” I said, with the confidence of someone who has watched many cooking videos and successfully boiled pasta multiple times. I decided to make veg biryani — a dish my wife makes effortlessly, layering rice and vegetables and spices into something that tastes like it required more effort than it actually did.</p><p>“Prep everything first,” she told me before leaving. “Soak the basmati rice. Marinate the paneer. Chop the vegetables for layering. Have it all ready before you start cooking.”</p><p>Reader, I did not do this.</p><p>I started frying onions. While the onions were going, I realized I hadn’t marinated the paneer. So I started cubing paneer and mixing yogurt and spices. Then the onions started burning. I ran back, stirred frantically, ran back to the paneer. 
Remembered I needed to soak the basmati. Started the rice soaking. The onions were now definitely burned. I scraped them out, started over, but now I was behind, so I tried to do the vegetables and the new onions simultaneously while the paneer sat half-marinated&hellip;</p><p>An hour later I had a kitchen that looked like a crime scene, three pans with various stages of failure in them, and something that was technically edible but bore no resemblance to biryani. My wife, via video call, watched me plate this disaster with the expression of someone who had specifically warned against this exact outcome.</p><p>The problem wasn’t skill. I can cook. The problem was that prep and execution were bleeding into each other. I was trying to figure out what I needed while also doing the thing. And it turns out you can’t actually do both. Not well, anyway.</p><p>I’ve been that guy with AI sous chefs too.</p><p>I’ve been vibe coding since mid-2025. By “vibe coding” I mean the thing where you describe what you want in natural language and an AI writes the code. You know, the future we were promised, except the future has some sharp edges nobody mentioned in the demos.</p><p>Two SaaS products. Real users. Real revenue. Not toy projects, not “look ma I generated a todo app” tutorials, not the kind of thing you show off on Twitter and then quietly delete three weeks later. Actual products that people pay actual money for.</p><p>So when I tell you what follows, understand: this isn’t theory. This is what I learned by shipping real things and watching everything that could go wrong go wrong.</p><h2 id="the-markdown-hemorrhage"><strong>The Markdown Hemorrhage</strong></h2><p>For the first few months, I was that chef.</p><p>I’d sit down to implement a feature. Claude and I would get rolling. Then I’d notice a bug. Well, I’m already here, might as well fix the bug. Then while fixing the bug, I’d realize the error handling was inconsistent. Better clean that up. 
Oh, and there’s still context left in the window — might as well tackle that other feature I’ve been meaning to add.</p><p>Two hours later: three half-finished things, Claude confused about which task we’re actually doing, and code quality somewhere between “works” and “I’m not sure why.”</p><p>And the markdown. God, the markdown.</p><p>Claude, bless its heart, wanted to help me remember things. So it started creating files. <code>ARCHITECTURE.md</code>. <code>DECISIONS.md</code>. <code>IMPLEMENTATION_NOTES.md</code>. <code>TODO.md</code>. <code>CONTEXT.md</code>. <code>CHANGELOG.md</code>. <code>README_UPDATED.md</code>.</p><p>I call this markdown hemorrhage. The AI equivalent of a kitchen where every surface is covered with prep bowls, half-chopped vegetables, and sticky notes that say “DON’T FORGET THE SAUCE” — technically documentation, practically chaos.</p><p>At one point I had so many markdown files that I needed another AI tool just to search through the documentation I’d created for my AI tool.</p><p>This was clearly insane.</p><p>But here’s the thing that took me embarrassingly long to figure out: the problem wasn’t the tools. The problem was me.</p><h2 id="one-goal-per-session"><strong>One Goal Per Session</strong></h2><p>I was treating every Claude session like a buffet.</p><p>You know how it goes. You sit down to implement a feature. 
While you’re implementing, you notice a bug. Well, you’re already here, might as well fix the bug. Oh, and while fixing the bug, you realize the error handling is inconsistent across the codebase. Better clean that up too. And hey, there’s still context left in the window — might as well tackle that other feature you’ve been meaning to add.</p><p>Two hours later, you’ve got three half-finished things, Claude is confused about which task it’s actually working on, and the code quality has degraded to “works but I’m not sure why.”</p><p>I call this context pollution. And once I named it, I started seeing it everywhere.</p><p>LLMs are bad at juggling multiple goals. This isn’t a Claude problem — it’s a fundamental thing about how these models work. When you ask them to hold multiple objectives simultaneously, they get worse at all of them. Not a little worse. <em>Dramatically</em> worse.</p><p>The fix sounds almost stupidly simple: one goal per session.</p><p>That’s it. That’s the whole trick. One goal. One session. If you discover a bug while implementing a feature, you write down the bug and you close the session. The bug gets its own session later. No “while I’m here” detours. No context pollution.</p><p>“But what about efficiency?” I hear you asking. “Isn’t it wasteful to end a session when there’s still context left?”</p><p>This is the trap. This is exactly the thinking that leads to burned onions and half-marinated paneer. The leftover context is not an asset. It’s a liability. It’s your coworker with three tasks open, doing all of them poorly, about to forget everything anyway.</p><p>End the session. Start fresh. One goal.</p><h2 id="the-mise-en-place"><strong>The Mise en Place</strong></h2><p>Now, this discipline only works if you have a way to track what you’re not doing.</p><p>If you end a session every time you discover a bug, you need somewhere for that bug to live. Otherwise you’ll forget it. 
The bugs pile up in your head, you context-switch mentally, and you’re back where you started.</p><p>This is where beads comes in.</p><p>Beads is a git-backed issue tracker that Claude can read and write. Steve Yegge built it (yes, that Steve Yegge — the guy who wrote the platforms rant and approximately nine million words about Emacs). The idea is simple: every task becomes a “bead.” Claude creates them, updates them, closes them. They survive compaction. They sync through git.</p><p>I installed it. I ran <code>bd init</code>. And then something clicked.</p><p>See, beads isn’t just a todo list. It’s a forcing function. When you start a session, you run <code>bd ready</code> and it shows you what’s available to work on. You pick <em>one</em>. Not three. One.</p><p>And when you discover a bug mid-session? You tell Claude to create a bead for it. Claude writes it down, logs the context, notes any relevant details. Then you move on. The bug exists now. It has a home. You don’t have to hold it in your head.</p><p>The discipline and the tool reinforce each other. One bead per session only works because beads exist to capture everything else. And beads only work because the discipline prevents you from drowning in them.</p><h2 id="grooming-vs-coding"><strong>Grooming vs. Coding</strong></h2><p>But I’m getting ahead of myself. Let me tell you about grooming.</p><p>In my old workflow, I’d sit down and just&hellip; start. Open Claude, describe what I wanted, begin coding. Very vibe. Very chaotic. Whatever felt right in the moment.</p><p>The problem is that “figuring out what to do” and “doing the thing” are completely different cognitive modes. One is divergent — you’re exploring possibilities, breaking down problems, identifying edge cases. The other is convergent — you’re executing, making decisions, writing code.</p><p>When you mix them, you get mush.</p><p>So now I run two types of sessions:</p><p>Grooming sessions are for thinking. I’m not coding. 
I’m not even planning to code in this session. I’m creating beads. Breaking down a feature into pieces. Identifying dependencies. Noting edge cases. If I think of an unrelated feature while grooming, it gets written down — for a different grooming session. No cross-contamination.</p><p>Coding sessions are for execution. One bead. Implement it. If I discover a bug, I note it and keep going unless it’s blocking. The bug gets groomed and coded in its own sessions later.</p><p>This separation is the whole game. It sounds bureaucratic. It sounds like exactly the kind of process that “vibe coding” was supposed to eliminate. But here’s the secret: this discipline is what makes vibe coding actually work at scale. Without it, you’re just generating code and hoping. With it, you’re building systems.</p><h2 id="a-few-other-things"><strong>A Few Other Things</strong></h2><p>MCPs should be loaded at project level, not globally. Every MCP eats context. If a project doesn’t need the Reddit MCP, it doesn’t get the Reddit MCP. Context is expensive. Guard it like it’s money, because in a very real sense, it is.</p><p>Autocompact should be off. I want to control when context resets, not have the algorithm decide for me mid-feature. Yes, this means manually managing sessions. That’s the point.</p><p><code>CLAUDE.md</code> files are more powerful than you think. I have a global one in <code>~/.claude/CLAUDE.md</code> with rules that apply everywhere. Each project gets its own with project-specific instructions. Claude reads these automatically. They’re like a standing pre-prompt: rules you write once instead of re-typing at the start of every session.</p><h2 id="what-still-doesnt-work"><strong>What Still Doesn’t Work</strong></h2><p>Now, here’s the part where I’m supposed to tell you it’s all solved and my workflow is perfect.</p><p>It’s not.</p><p>Debugging production issues is still clunky. 
I’ve got a combination of skills and MCPs that sort of works, but there’s too much manual context assembly. Something breaks in prod and I’m still spending the first 20 minutes of the session explaining the architecture before we can even start diagnosing.</p><p>Test-driven development doesn’t flow. The loop of “write test, see it fail, implement, see it pass” — it’s awkward. Claude wants to write everything at once. I’m still tweaking my tooling to make TDD feel natural.</p><p>UX work is hard. Like, fundamentally hard. Claude can scaffold UI. It can generate components. But “does this feel right?” is a human judgment call, and trying to get there through text-based iteration is like describing a painting to someone and asking them to tell you if it’s beautiful.</p><p>These are the walls I’m hitting. I’m building tooling to address them — an <a href="https://lakshminp.substack.com/p/why-im-building-an-agent-orchestrator" rel="external nofollow noopener" class="lnp-link">agent orchestrator<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a> that tailors Claude to my specific workflow. Work in progress. If you’re the adventurous type, you can <a href="https://badri.github.io/wt/" rel="external nofollow noopener" class="lnp-link">try it now<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a>.</p><h2 id="the-system"><strong>The System</strong></h2><p>So here’s the actual system, if you want to try it:</p><ol><li><p>Install beads: <code>npm install -g @anthropic-ai/beads &amp;&amp; bd init</code></p></li><li><p>Add to your global <code>CLAUDE.md</code>: “Check <code>bd ready</code> at session start. One bead per session.”</p></li><li><p>Separate grooming from coding. Different sessions. Different mindsets.</p></li><li><p>Resist the urge to “do more while there’s context left.” That’s the trap.</p></li><li><p>Protect your context. 
Project-level MCPs only. Kill anything you don’t need.</p></li></ol><p>Two SaaS products since mid-2025. All vibe coded with this system.</p><p>Not because the tools are magic. The tools are good, but tools are never magic. What made it work was the discipline — the willingness to be a little bit boring about context hygiene, to resist the temptation to do more, to trust that a focused session ships more than a scattered one.</p><p>Vibe coding without chaos. It turns out it’s not about vibing harder. It’s about vibing deliberately.</p><p>You’re the head chef now. But don’t forget your mise en place.</p><p>My wife was right, by the way. She usually is.</p><hr><p><em>I’m Lakshmi. 20 years in software — ops, infrastructure, full-stack. Now solo founder using Claude Code to develop, deploy, and distribute.</em></p>
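<p><em>P.S. If it helps, this is roughly what step 2 looks like written out in full: a sketch of my global <code>~/.claude/CLAUDE.md</code>, not a canonical format. The only command it assumes is <code>bd ready</code>; the rest of the wording is mine, and yours to tune.</em></p>

```markdown
# Session discipline (global rules, every project)

- At session start, run `bd ready` and show me what's available.
- We work on exactly ONE bead per session. I pick it; you don't
  bundle in neighbors.
- If we discover a bug or an unrelated improvement mid-session,
  create a bead for it with enough context to pick it up cold,
  then return to the current bead. No "while I'm here" detours.
- Leftover context is not an invitation. When the bead is done,
  the session is done.
- Grooming sessions create and refine beads only; no code changes.
- Coding sessions implement one bead; no new scope.
```

<p><em>Project-level <code>CLAUDE.md</code> files layer on top of this with codebase-specific instructions.</em></p>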
]]></content:encoded></item><item><title>Congratulations, You've Been Promoted to Code Janitor</title><link>https://lakshminp.com/2026/01/code-janitor-ai-era/</link><pubDate>Fri, 23 Jan 2026 00:00:00 +0000</pubDate><author>Lakshmi Narasimhan</author><guid isPermaLink="true">https://lakshminp.com/2026/01/code-janitor-ai-era/</guid><category>essays</category><category>craft</category><description>It was 2001. I was building a platformer.
Not “building” in the modern sense, where you describe what you want and a language model hallucinates it into existence. I mean building. DJGPP. Allegro. A DOS compiler that ran on Windows 98 and made you feel like a wizard for getting it to work at all.
I spent three weeks figuring out how platform scrolling worked.
Three weeks. Not because I was stupid — though jury’s still out — but because nobody had written a Medium article explaining it. Stack Overflow didn’t exist. The Allegro documentation was a text file that assumed you already knew what a framebuffer was. I had to think.</description><content:encoded>&lt;![CDATA[<p>It was 2001. I was building a platformer.</p><p>Not “building” in the modern sense, where you describe what you want and a language model hallucinates it into existence. I mean <em>building</em>. DJGPP. Allegro. A DOS compiler that ran on Windows 98 and made you feel like a wizard for getting it to work at all.</p><p>I spent three weeks figuring out how platform scrolling worked.</p><p>Three weeks. Not because I was stupid — though jury’s still out — but because nobody had written a Medium article explaining it. Stack Overflow didn’t exist. The Allegro documentation was a text file that assumed you already knew what a framebuffer was. I had to <em>think</em>.</p><p>And then one night, around 2am, I got it working.</p><p>My little sprite — a 16x16 pixel abomination that was supposed to be a knight but looked more like a confused rectangle — walked across the screen. I pressed the arrow keys and the platform scrolled. The background moved. The character stayed centred.</p><p>I decided, right then, that I wanted to be a game programmer.</p><p>(I didn’t become a game programmer. Life had other plans. But that’s not the point.)</p><p>The point is: I remember that moment with perfect clarity. The dopamine hit. The sense of <em>creation</em>. I had figured something out. I had made something move. I understood, down to the register level, why it worked.</p><p>I couldn’t tell you the last time I felt that.</p><h2 id="the-joy-we-traded"><strong>The Joy We Traded</strong></h2><p>There’s a thread on r/ClaudeAI that’s been haunting me. 624 upvotes. 
Title: “We are not developers anymore, we are reviewers.”</p><p>The author nails it:</p><blockquote><p><em>“Coding used to be a creative act. You enter a ‘flow state,’ solving micro-problems and building something from nothing. Now, the workflow is: Prompt → Generate → Read Code → Fix Code. We have effectively turned the job into an endless Code Review session.”</em></p></blockquote><p>And then the kicker:</p><blockquote><p><em>“Let’s be honest, code review has always been the most tedious part of the job.”</em></p></blockquote><p>Yeah. That landed.</p><p>I used to joke that the worst part of being a senior engineer was reviewing other people’s code. All the cognitive load of understanding a system, none of the satisfaction of building it. You’re not creating — you’re auditing. You’re the IRS of software development.</p><p>Congratulations. That’s your whole job now.</p><h2 id="the-janitor-effect"><strong>The Janitor Effect</strong></h2><p>One commenter called it the “reverse centaur.”</p><p>The dream was that AI would be the centaur’s horse — we’d ride it, directing its power, multiplying our capabilities. We’d be the brains, it’d be the muscle.</p><p>Instead, we’re the cleanup crew.</p><p>Claude writes 400 lines of code in 30 seconds. Impressive. Looks right. Probably compiles. But there’s a subtle bug on line 247 where it’s comparing a string to an integer in a way that JavaScript will happily accept and silently mangle. There’s a race condition in the async handler that only manifests under load. There’s a variable named <code>data</code> that shadows another variable named <code>data</code> three scopes up.</p><p>You know. Junior developer stuff.</p><p>Except this junior developer types at 10,000 words per minute and never gets tired. So now you’re reviewing 10x more code per day, and every review requires you to maintain the mental context of code <em>you didn’t write</em>.</p><p>I spent 20 years building mental maps of codebases. Line by line. 
Function by function. When you write the code yourself, the map builds automatically. You know why that flag exists because you added it at 3am to fix a production incident. You know that module is haunted because you were there when the haunting began.</p><p>When Claude writes the code, you get none of that. You just get the artifact. A fully-formed thing that appeared, Athena-like, from the forehead of a language model. And you have to reverse-engineer the intent from the implementation.</p><p>This is debugging someone else’s code.</p><p>Forever.</p><h2 id="the-uncomfortable-truth"><strong>The Uncomfortable Truth</strong></h2><p>Here’s what nobody wants to say out loud: the implementation was the fun part.</p><p>Not the architecture. Architecture is meetings. Architecture is diagrams that nobody reads and Jira tickets that nobody updates. Architecture is important, yes, but it’s not <em>fun</em>.</p><p>The fun was the 2am breakthrough. The fun was that moment when the tests finally pass and you understand <em>why</em>. The fun was the flow state — that hypnotic trance where hours feel like minutes and you emerge, blinking, having built something that didn’t exist before.</p><p>LLMs took that part.</p><p>They left us the meetings.</p><h2 id="the-promoted-to-manager-cope"><strong>The “Promoted to Manager” Cope</strong></h2><p>There’s a certain cope that shows up in these discussions. “Well, actually, you’ve been promoted! Now you’re like a tech lead! You’re directing instead of doing!”</p><p>Sure. And my 2001 self was “promoted” from game programmer to accountant the moment Excel learned formulas.</p><p>Here’s the thing about being promoted: you’re supposed to <em>want</em> it. The tech leads I know who love their jobs? They love mentoring. They love the big-picture thinking. They love watching junior devs grow.</p><p>Nobody loves reviewing AI-generated code. The AI doesn’t grow. It doesn’t learn from your feedback. 
It just generates more code for you to review tomorrow. You’re not mentoring — you’re babysitting. And the baby has unlimited energy and zero object permanence.</p><h2 id="what-we-actually-lost"><strong>What We Actually Lost</strong></h2><p>Let me be clear: I’m not a Luddite. The productivity gains are real. I ship faster than ever. I build things in hours that would have taken weeks.</p><p>But something shifted.</p><p>When I built that platformer in 2001, I was a craftsman. Slow, inefficient, probably writing terrible code — but a craftsman. I understood my tools. I understood my materials. I understood, deeply, what I was making.</p><p>Now I’m a project manager for a very fast, very unreliable contractor.</p><p>The contractor doesn’t care about the code. It has no pride in the work. It optimizes for “looks plausible” rather than “is correct.” It will happily generate the same bug in 15 different files if you don’t catch it in the first one.</p><p>And catching it is <em>your</em> job now. Not building. Catching.</p><h2 id="the-question-nobody-wants-to-answer"><strong>The Question Nobody Wants to Answer</strong></h2><p>The Reddit thread ends with a question:</p><blockquote><p><em>“Do you miss the actual act of coding, or are you happy to just be the ‘director’ while the AI does the acting?”</em></p></blockquote><p>I think about my 2001 self. That kid who spent three weeks understanding platform scrolling. Who felt genuine joy when a rectangle moved across a screen.</p><p>Would I trade that experience for “just ask Claude to make a platformer”?</p><p>I honestly don’t know.</p><p>But I know this: that kid would be horrified by how I work today. Not impressed — horrified. Because to him, the coding <em>was</em> the point. The game was just an excuse to code.</p><p>And now the code is just an excuse to ship.</p><h2 id="the-adaptation"><strong>The Adaptation</strong></h2><p>Look, I don’t have a tidy conclusion here. The models aren’t getting worse. 
The productivity isn’t going away. We’re not going back to DJGPP and Allegro and three-week debugging sessions.</p><p>Maybe the joy comes back in a different form. Maybe it’s in the architecture, once we learn to love it. Maybe it’s in building the tools that build the tools. Maybe it’s in the meta-game of prompt engineering and workflow optimization.</p><p>Or maybe we just mourn quietly and move on.</p><p>I’ve gotten good at code review. I’ve built mental models for reading AI-generated code quickly, spotting the common failure modes, knowing where to look for the bugs. It’s a skill. Not the skill I wanted, but a skill.</p><p>And sometimes — rarely, but sometimes — I still drop into the code myself. Ignore Claude. Write it by hand. Feel the flow state kick in, just for a moment.</p><p>It’s slower. It’s inefficient. It’s probably a waste of time.</p><p>But that little rectangle still needs to walk across the screen sometimes. Even if nobody’s watching.</p><p><em>I help developers build, deploy, and distribute their SaaS without hiring a team. If this resonated, you might also like <a href="https://lakshminp.substack.com/p/why-your-ai-wakes-up-every-morning" rel="external nofollow noopener" class="lnp-link">Why Your AI Wakes Up Every Morning With No Memory<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a>.</em></p>
]]></content:encoded></item><item><title>Your Code Quality Doesn't Matter Anymore (And It Never Did)</title><link>https://lakshminp.com/2026/01/code-quality-doesnt-matter/</link><pubDate>Wed, 21 Jan 2026 00:00:00 +0000</pubDate><author>Lakshmi Narasimhan</author><guid isPermaLink="true">https://lakshminp.com/2026/01/code-quality-doesnt-matter/</guid><category>essays</category><category>ai-coding</category><category>craft</category><description>A founder on Reddit recently shared that his CTO rebuilt what four third-party partners were providing — using Claude, in weeks, at a fraction of the cost.
Another commenter chimed in: their company replaced $300,000/year software with something they built in-house in under four months.
Meanwhile, over on r/SaasDevelopers, a developer is stuck at $200 MRR for eight months. Beautiful code. Great UX. Fifteen features. Asked where his users come from: “Uh, Product Hunt six months ago and some Reddit posts.”</description><content:encoded>&lt;![CDATA[<p>A founder on Reddit recently shared that his CTO rebuilt what four third-party partners were providing — using Claude, in weeks, at a fraction of the cost.</p><p>Another commenter chimed in: their company replaced $300,000/year software with something they built in-house in under four months.</p><p>Meanwhile, over on r/SaasDevelopers, a developer is stuck at $200 MRR for eight months. Beautiful code. Great UX. Fifteen features. Asked where his users come from: “Uh, Product Hunt six months ago and some Reddit posts.”</p><p>These two conversations are happening in parallel across the internet, and most developers haven’t connected the dots yet.</p><p>Here’s what’s actually happening: AI didn’t just make coding faster. It vaporized the feature moat entirely.</p><p><strong>The feature moat was always a lie we told ourselves.</strong></p><p>“If I build it better, they will come.” This was comforting. It meant the thing we’re good at — writing code — was the thing that mattered most.</p><p>It wasn’t true before AI. It’s aggressively not true now.</p><p>Your competitor can rebuild your core features in a weekend. Not because they’re brilliant. Because Claude is sitting right there, and the barrier to “good enough” has collapsed to basically zero. That integration you spent three months perfecting? 
Someone’s CTO just shipped an 80% version while you were reading this paragraph.</p><p>The YC thread frames it well: “AI mostly kills thin feature moats, not real businesses.” If your entire value proposition is “we built this thing and it works,” congratulations — you’ve built something anyone can now replicate before their coffee gets cold.</p><p><strong>So what’s actually defensible?</strong></p><p>The comments in both threads converge on the same uncomfortable answer: everything except the code.</p><p><strong>Distribution.</strong> The SaasDevelopers post makes the case bluntly: a mediocre product with great distribution beats a great product with no distribution. Every time. The OP claims $4.8K MRR with “decent features, nothing groundbreaking” because he publishes three SEO posts weekly and engages in five communities daily. His previous products had better code and failed under $500 MRR.</p><p>Whether you believe his specific numbers or not, the pattern is real. Visibility compounds. Code quality doesn’t.</p><p><strong>Operational complexity.</strong> The YC founder pivoted to payments specifically because it’s “harder to clone with AI.” Payments involve regulatory mess, edge cases that actually hurt people when you get them wrong, and trust that takes years to build. You can’t vibe-code your way to PCI compliance.</p><p><strong>Workflow embedding.</strong> One commenter nailed it: “Can a copycat ship it, but still not get adopted because switching costs and trust are the real barrier?” If yes, you might have something. If your product is a nice UI on top of an API call, you’re a feature waiting to be absorbed.</p><p><strong>Data that compounds.</strong> This one’s subtle but important. 
If your product gets better because you have data your competitors can’t easily replicate — user behavior, domain-specific training data, network effects — that’s a moat AI can’t trivially cross.</p><p><strong>The developer’s existential crisis.</strong></p><p>Here’s the part nobody wants to say out loud: for most technical founders, the skill that got them here is now table stakes.</p><p>You can write clean code. Great. So can Claude. You can architect systems. Wonderful. So can a junior dev with Cursor and four hours.</p><p>The skills that matter now are the ones developers historically dismissed as “marketing” or “sales” or “that stuff the business people do.”</p><p>Building an audience. Writing content that ranks. Engaging in communities without getting banned for being too promotional. Understanding what people actually want to pay for versus what’s technically impressive.</p><p>This is deeply annoying if you became a developer specifically to avoid talking to people.</p><p><strong>What to actually do.</strong></p><p>Stop adding features to a product nobody’s using. That’s not building — that’s procrastinating with a compiler.</p><p>Spend less time in your IDE and more time in the places your customers hang out. Reddit, LinkedIn, niche communities, whatever. Not to drop links. To understand what problems people are actually complaining about and whether your thing solves any of them. (It’s why I’m building <a href="https://threadhq.co/" rel="external nofollow noopener" class="lnp-link">ThreadHQ<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a>.)</p><p>If your product can be rebuilt in weeks with AI, either pivot to something with real operational complexity, or accept that distribution is your product now and code is just the unlock.</p><p>The YC thread suggests payments, compliance-heavy industries, anything where “mistakes actually hurt” and trust is earned over years. 
The SaasDevelopers thread suggests becoming a distribution machine: 20+ platform launches, daily content, systematic visibility.</p><p>Both are right. Pick your poison.</p><p><strong>The uncomfortable synthesis.</strong></p><p>AI commoditized the build. What’s left is everything around it: who knows about you, who trusts you, and how painful it would be to switch away.</p><p>The code was never the product. Now it’s just impossible to pretend otherwise.</p>
]]></content:encoded></item><item><title>The $30/Year Stack for Launching Small Bets</title><link>https://lakshminp.com/2026/01/30-dollar-saas-stack/</link><pubDate>Mon, 19 Jan 2026 00:00:00 +0000</pubDate><author>Lakshmi Narasimhan</author><guid isPermaLink="true">https://lakshminp.com/2026/01/30-dollar-saas-stack/</guid><category>essays</category><category>saas</category><category>kubernetes</category><description>Every time I launch a new small bet, I need the same boring stuff: professional email, a chat widget, uptime monitoring. The kind of infrastructure that’s completely unsexy but makes you look like you have your act together.
For years, I overcomplicated this. Custom SMTP servers. Self-hosted monitoring. Elaborate setups that took days to configure and broke whenever I looked at them wrong.
Then I realized something: I was spending more time on infrastructure than on validating whether anyone wanted my product.</description><content:encoded>&lt;![CDATA[<p>Every time I launch a new small bet, I need the same boring stuff: professional email, a chat widget, uptime monitoring. The kind of infrastructure that’s completely unsexy but makes you look like you have your act together.</p><p>For years, I overcomplicated this. Custom SMTP servers. Self-hosted monitoring. Elaborate setups that took days to configure and broke whenever I looked at them wrong.</p><p>Then I realized something: I was spending more time on infrastructure than on validating whether anyone wanted my product.</p><p>So I built a repeatable stack. Total cost: about $30-42 per year, per small bet. Here’s the whole thing.</p><h2 id="domain--hosting-cloudflare-free"><strong>Domain &amp; Hosting: Cloudflare (Free)</strong></h2><p>Buy your domain wherever you want, but point the nameservers to Cloudflare immediately.</p><p>Cloudflare’s free tier is absurd:</p><ul><li><p>DNS management (fast, reliable)</p></li><li><p>Free SSL certificates (automatic)</p></li><li><p>DDoS protection</p></li><li><p>CDN caching</p></li><li><p>Cloudflare Pages (unlimited sites, unlimited bandwidth)</p></li></ul><p>That last one is key. Your landing page goes on Cloudflare Pages. Connect your repo, push to main, it deploys. No servers. No bills. No thinking about infrastructure when you should be thinking about whether anyone wants your product.</p><p>I run every small bet’s landing page on CF Pages. Zero hosting cost.</p><h2 id="email-google-workspace-the-india-pricing-hack"><strong>Email: Google Workspace (The India Pricing Hack)</strong></h2><p>You want professional email. <code>hello@yourdomain.com</code>, not <code>yourdomain.help@gmail.com</code> like some kind of digital nomad running a dropshipping scam.</p><p>Google Workspace direct pricing: $6/month. 
Painful when you’re running multiple bets.</p><p>Google Workspace through an Indian reseller: Rs.125/month. That’s roughly $1.50.</p><p>Same product. Same Gmail experience. Same everything. Just&hellip; cheaper, because regional pricing exists and Google apparently forgot to close this loophole.</p><p>Recommended resellers: Medha Cloud, Host IT Smart, Shivaami. They’re authorized, they’re legit, and they’ll save you $50+/year per domain.</p><p>Setup takes 30 minutes: verify domain, add MX records, configure SPF/DKIM/DMARC so your emails don’t land in spam. Done.</p><h2 id="support-crisp-chat-free"><strong>Support: Crisp Chat (Free)</strong></h2><p>Intercom wants $74/month. For a small bet that might make $0.</p><p>Crisp’s free tier gives you:</p><ul><li><p>2 team seats (it’s just you anyway)</p></li><li><p>Unlimited conversations</p></li><li><p>Mobile app for notifications</p></li><li><p>A widget that doesn’t look like it was designed in 2008</p></li></ul><p>Copy-paste their script tag into your landing page. Five minutes.</p><p>Upgrade trigger: when you have so many support conversations that you need automation. Which means you have customers. Which means you can afford to pay for things.</p><h2 id="monitoring-betterstack-free"><strong>Monitoring: BetterStack (Free)</strong></h2><p>Your app will go down at 3am on a Sunday. This is not a prediction, it’s a guarantee.</p><p>BetterStack’s free tier:</p><ul><li><p>10 uptime monitors</p></li><li><p>1GB logs/month</p></li><li><p>Email and Slack alerts</p></li><li><p>3-day log retention</p></li></ul><p>Is 3-day retention enough? For a small bet you’re validating? Yes. You’re not running a bank.</p><p>Alternative: Axiom gives you 500GB ingest and 30-day retention if you’re logging more aggressively. Also free.</p><h2 id="error-tracking-sentry-free"><strong>Error Tracking: Sentry (Free)</strong></h2><p>Your code will throw exceptions in production that never happened locally. 
Classic.</p><p>Sentry’s free tier:</p><ul><li><p>5K errors/month</p></li><li><p>10K performance transactions</p></li><li><p>1 user</p></li><li><p>90-day retention</p></li></ul><p>For a small bet, 5K errors/month is plenty. If you’re hitting that limit, either your app is broken or you have enough users to pay for it.</p><h2 id="database-supabase-free-tier-or-self-hosted"><strong>Database: Supabase (Free Tier or Self-Hosted)</strong></h2><p>Every small bet needs a database. Supabase’s free tier is genuinely useful:</p><ul><li><p>500MB database</p></li><li><p>1GB file storage</p></li><li><p>50K monthly active users</p></li><li><p>Unlimited API requests</p></li></ul><p>That’s enough to validate most ideas. The catch: you get 2 free projects total. After that, it’s $25/month per project.</p><p>For small bets that graduate to real products, I self-host Supabase on a $6/month Hetzner VPS. Full Postgres, auth, storage, realtime — no project limits, no usage caps. (I’m building a service called <a href="https://supabyoi.com/" rel="external nofollow noopener" class="lnp-link">Supabyoi<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a> to make this dead simple. 
More on that soon.)</p><h2 id="the-complete-stack"><strong>The Complete Stack</strong></h2><ul><li><p><strong>Domain</strong> — ~$10-15/year</p></li><li><p><strong>Cloudflare (DNS + Pages)</strong> — Free</p></li><li><p><strong>Google Workspace (India)</strong> — ~$1.50/month ($18/year)</p></li><li><p><strong>Crisp</strong> — Free</p></li><li><p><strong>BetterStack</strong> — Free</p></li><li><p><strong>Sentry</strong> — Free</p></li><li><p><strong>Supabase</strong> — Free</p></li></ul><p><strong>Total: ~$1.50/month, $30-42/year</strong></p><p>That’s DNS, hosting, professional email, live chat, uptime monitoring, error tracking, and a database for less than a single month of most “startup” tools.</p><h2 id="the-rules"><strong>The Rules</strong></h2><p><strong>Don’t upgrade until you have paying customers.</strong> Free tiers exist for validation. Use them.</p><p><strong>Keep the setup identical across bets.</strong> Same tools, same patterns, same DNS records. You should be able to launch a new bet’s infrastructure in an afternoon, not a weekend.</p><p><strong>Resist the urge to self-host.</strong> Yes, you <em>can</em> run your own mail server. You can also perform your own dental surgery. Neither is advisable.</p><h2 id="when-to-actually-upgrade"><strong>When To Actually Upgrade</strong></h2><ul><li><p><strong>Google Workspace</strong> — You need &gt;30GB storage → $7/mo</p></li><li><p><strong>Crisp</strong> — You need chatbots or &gt;2 team members → $25/mo</p></li><li><p><strong>BetterStack</strong> — You’re pushing &gt;1GB logs/month → $24/mo</p></li><li><p><strong>Sentry</strong> — You’re hitting 5K errors/month → $26/mo</p></li><li><p><strong>Supabase</strong> — You need &gt;2 projects or more storage → $25/mo (or self-host)</p></li></ul><p>Notice a pattern? These are all “you have real traction” problems. 
Good problems to have.</p><h2 id="whats-not-covered-yet"><strong>What’s Not Covered (Yet)</strong></h2><p>This is the skeleton — the basic infrastructure every small bet needs from day one.</p><p>I’ll cover these in separate posts:</p><ul><li><p><strong>Tech stack choices</strong> (frameworks, languages, deployment)</p></li><li><p><strong>Payment processing</strong> (Stripe, Lemon Squeezy, regional considerations)</p></li><li><p><strong>CI/CD pipelines</strong> (GitHub Actions, deployment automation)</p></li><li><p><strong>Landing page patterns</strong> (what actually converts)</p></li></ul><p>One thing at a time.</p><h2 id="the-point"><strong>The Point</strong></h2><p>Infrastructure should be invisible. It should cost almost nothing while you’re validating. It should scale up only when you have revenue to pay for it.</p><p>$30/year per bet means you can run 10 small bets for less than most people pay for a single Notion subscription.</p><p>Stop building infrastructure. Start shipping products.</p><hr><p><em>This is part of my “Deploy” series — simple infrastructure patterns for solo operators who’d rather build products than manage servers.</em></p>
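<p>For the curious, the “verify domain, add MX records, configure SPF/DKIM/DMARC” step boils down to four DNS records. A hedged sketch — the hostnames and policy values here are illustrative placeholders; copy the exact values Google’s admin console generates for your domain:</p>

```
; Hypothetical zone entries for Google Workspace mail.
; Use the real values from the Google admin console before publishing.
@                 IN MX  1 smtp.google.com.
@                 IN TXT   "v=spf1 include:_spf.google.com ~all"
google._domainkey IN TXT   "v=DKIM1; k=rsa; p=<public key from admin console>"
_dmarc            IN TXT   "v=DMARC1; p=quarantine; rua=mailto:you@example.com"
```

<p>SPF says who may send mail for the domain, DKIM signs outgoing messages, and DMARC tells receivers what to do when the first two fail — the trio that keeps you out of the spam folder.</p>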
]]></content:encoded></item><item><title>90% of Programming Skills Just Got Commoditized. The Other 10% Is Worth 1000X More.</title><link>https://lakshminp.com/2026/01/programming-skills-commoditized/</link><pubDate>Thu, 15 Jan 2026 00:00:00 +0000</pubDate><author>Lakshmi Narasimhan</author><guid isPermaLink="true">https://lakshminp.com/2026/01/programming-skills-commoditized/</guid><category>essays</category><category>ai-coding</category><category>craft</category><description>Andrej Karpathy recently wrote something ↗ that’s been rattling around my head:
“I’ve never felt this much behind as a programmer. The profession is being dramatically refactored as the bits contributed by the programmer are increasingly sparse and between. I have a sense that I could be 10X more powerful if I just properly string together what has become available over the last year and a failure to claim the boost feels decidedly like skill issue.”</description><content:encoded>&lt;![CDATA[<p>Andrej Karpathy recently <a href="https://x.com/karpathy/status/2004607146781278521" rel="external nofollow noopener" class="lnp-link">wrote something<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a> that’s been rattling around my head:</p><blockquote><p>“I’ve never felt this much behind as a programmer. The profession is being dramatically refactored as the bits contributed by the programmer are increasingly sparse and between. I have a sense that I could be 10X more powerful if I just properly string together what has become available over the last year and a failure to claim the boost feels decidedly like skill issue.”</p></blockquote><p>He then listed what this new layer looks like: agents, subagents, prompts, contexts, memory, modes, permissions, tools, plugins, skills, hooks, MCP, LSP, slash commands, workflows, IDE integrations.</p><p>His conclusion: “Clearly some powerful alien tool was handed around except it comes with no manual and everyone has to figure out how to hold it and operate it, while the resulting magnitude 9 earthquake is rocking the profession.”</p><p>I felt this in my bones.</p><h2 id="the-old-stack-vs-the-new-stack"><strong>The Old Stack vs The New Stack</strong></h2><p>The old programming stack was hard enough:</p><p>Hardware -&gt; OS -&gt; Language -&gt; Frameworks -&gt; Your Code</p><p>Years of learning. Layers of abstraction. But at least it was <em>deterministic</em>. At least there were manuals. 
At least Stack Overflow had answers.</p><p>The new stack adds a layer on top:</p><p>You -&gt; Prompts/Agents/Context/Memory/Tools/Modes -&gt; Code</p><p>This layer is fundamentally different. It’s stochastic. It’s fallible. It’s unintelligible. And it changes every few weeks.</p><p>There’s no certification. There’s no textbook. There’s no “Effective AI Orchestration” by Joshua Bloch. Just a bunch of people figuring it out in Discord servers and sharing CLAUDE.md files and best practices like trading cards.</p><h2 id="the-divide-is-already-here"><strong>The Divide Is Already Here</strong></h2><p>Someone on Reddit <a href="https://reddit.com/r/ClaudeAI/comments/1lquetd/the_claude_code_divide_those_who_know_vs_those/" rel="external nofollow noopener" class="lnp-link">described the pattern<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a> they’re seeing on their team:</p><blockquote><p>“Two developers with similar experience working on similar tasks, but one consistently ships features in hours while the other is still debugging. At first I thought it was just luck or skill differences. 
Then I realized what was actually happening — it’s their instruction library.”</p></blockquote><p>They’re watching an underground collection of power users share workflows like secrets:</p><ul><li><p>Commands that automatically debug entire codebases</p></li><li><p>CLAUDE.md files that turn Claude into domain experts</p></li><li><p>Slash commands that turn 45-minute processes into 2-minute ones</p></li></ul><p>Meanwhile, most people are still typing “help me fix this bug” and wondering why their results suck.</p><p>As one developer put it: “The differences between someone who opens up CC for the first time and someone with tuned md files is beyond night and day.”</p><h2 id="the-skill-issue-is-real-but-not-the-one-you-think"><strong>The Skill Issue Is Real (But Not The One You Think)</strong></h2><p>Here’s what hit me about Karpathy’s framing: he called it a “skill issue.”</p><p>Not a tools issue. Not an access issue. Not a funding issue.</p><p>A <em>skill</em> issue.</p><p>The 10X boost exists. The leverage is real. But claiming it requires mastering something that didn’t exist two years ago and has no curriculum.</p><p>Someone in that same thread nailed the uncomfortable truth: “90% of traditional programming skills are becoming commoditized while the remaining 10% becomes worth 1000x more. That 10% isn’t coding — it’s knowing how to architect AI workflows.”</p><p>The irony is brutal. We spent years mastering syntax, frameworks, design patterns. Now an AI can generate all of that in seconds. What it <em>can’t</em> do is orchestrate itself effectively. That’s your job now.</p><h2 id="what-the-new-layer-actually-looks-like"><strong>What The New Layer Actually Looks Like</strong></h2><p>Let me make this concrete. Here’s what I’ve had to learn in the past year that wasn’t part of any CS curriculum:</p><p><strong>CLAUDE.md Architecture</strong></p><p>Your instructions file isn’t documentation. It’s programming. 
The structure, the phrasing, what you include vs exclude — these decisions compound across every interaction. A well-architected CLAUDE.md is worth more than a well-architected codebase.</p><p><strong>Context Management</strong></p><p>Every token matters. MCP servers eat context. Long conversations drift. You need to think about what Claude knows, what it’s forgotten, when to compact, when to start fresh. It’s memory management, but for a mind that isn’t yours.</p><p><strong>Prompt Design</strong></p><p>Not “prompt engineering” in the LinkedIn-influencer sense. Actual design. When do you give examples? When do you constrain? When do you let it explore? How do you phrase things so it doesn’t hallucinate? How do you trigger deeper thinking? These are learnable skills with massive payoff differences.</p><p><strong>Tool Orchestration</strong></p><p>MCP, skills, hooks, slash commands. Which tool for which job? When does an MCP server make sense vs a bash script vs a skill file? How do you chain them? How do you debug when the chain breaks?</p><p><strong>Mode Awareness</strong></p><p>Plan mode vs implement mode. When to let Claude explore vs when to constrain. When to use subagents. When to go linear. The <em>meta</em> of working with AI — knowing when to switch approaches — is itself a skill.</p><p><strong>Verification Choreography</strong></p><p>AI generates fast. Verification is the bottleneck. How do you structure your workflow so you’re not just rubber-stamping garbage? How do you catch the 8 production bombs before they ship? 
(Yes, <a href="https://lakshminp.substack.com/p/what-claude-cant-do-for-you" rel="external nofollow noopener" class="lnp-link">I wrote about this<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a>.)</p><h2 id="the-manual-that-doesnt-exist"><strong>The Manual That Doesn’t Exist</strong></h2><p>An older developer on Reddit <a href="https://reddit.com/r/ClaudeAI/comments/1lquetd/the_claude_code_divide_those_who_know_vs_those/" rel="external nofollow noopener" class="lnp-link">captured the frustration<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a>:</p><blockquote><p>“I started using AI about 2 years ago. I thought I was doing good, but then I started seeing all this stuff about MCP servers, md files etc and I am kind of lost. I want to learn more and I want to improve my AI skills but it’s difficult for me.”</p></blockquote><p>This is someone with decades of experience, feeling lost because the new layer has no onramp.</p><p>The manual doesn’t exist because the platform keeps shifting. Claude Code ships updates weekly. New features appear. Old patterns stop working. The MCP ecosystem is exploding. Skills just launched. Hooks changed. The ground won’t stop moving.</p><p>You can’t study for an earthquake. You can only practice surfing.</p><h2 id="how-im-learning-imperfectly"><strong>How I’m Learning (Imperfectly)</strong></h2><p>I don’t have this figured out. Nobody does. But here’s what’s working:</p><p><strong>Steal shamelessly</strong>. Find people who are clearly more productive and reverse-engineer their setup. Their CLAUDE.md files, their slash commands, their workflows. GitHub repos, Discord servers, Reddit threads. The good stuff is scattered but findable.</p><p><strong>Treat your setup as code</strong>. Version control your CLAUDE.md. Iterate on your slash commands. When something works, document why. When something fails, autopsy it. Your instruction library is a codebase now.</p><p><strong>Invest in meta-skills</strong>. 
The specific tools will change. MCP might get replaced. Claude Code might get competition. But the meta-skills — context management, prompt design, verification choreography — those transfer.</p><p><strong>Actually use the new features</strong>. Hooks exist. Subagents exist. Skills exist. Most people ignore them because they’re “advanced.” They’re not advanced. They’re just new. The learning curve is the moat.</p><p><strong>Teach to learn</strong>. Writing about this forces me to understand it. Explaining my setup to others reveals the gaps. The best way to master the new layer is to articulate it.</p><h2 id="the-uncomfortable-conclusion"><strong>The Uncomfortable Conclusion</strong></h2><p>Karpathy is right. There’s a 10X boost available. Failing to claim it is a skill issue.</p><p>But it’s a <em>new</em> skill. One that didn’t exist before. One that has no manual, no certification, no clear path.</p><p>The people figuring it out are building compound advantages. Every custom command, every refined CLAUDE.md pattern, every workflow optimization — it all stacks. The gap between those who master the new layer and those who don’t is widening fast.</p><p>The earthquake is still happening. The alien tool is still being figured out. The manual is being written in real-time by the people using it. And the rules are being changed as we speak/code/write.</p><p>Roll up your sleeves.</p><hr><p><em>This is a companion to my previous essay on <a href="https://lakshminp.substack.com/p/claude-code-is-incredible-it-also" rel="external nofollow noopener" class="lnp-link">what Claude can’t do for you<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a>. That one covered the old skills that still matter. This one covers the new skills you need to add.</em></p>
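<p>To make “treat your setup as code” concrete, here is a minimal, entirely hypothetical CLAUDE.md skeleton — the section names and rules are illustrative, not a standard; the point is that instructions are versionable, reviewable artifacts:</p>

```markdown
# CLAUDE.md — project instructions (hypothetical example)

## Context
- Stack: FastAPI + Postgres, deployed as three replicas behind a load balancer.
- Anything stateful (rate limits, caches, sessions) lives in Redis or Postgres,
  never in process memory.

## Workflow
1. Plan first: list the files you intend to touch, and why, before writing code.
2. After every change, run the test suite and paste the output.
3. Never commit directly; stop and show a diff instead.

## Known footguns
- Public API response shapes are frozen; breaking them requires a version bump.
```

<p>Check it into git next to the code. When a rule fails you, amend it in a commit whose message explains why — that history is the “autopsy” described above.</p>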
]]></content:encoded></item><item><title>Claude Code Is Incredible. It Also Almost Shipped 8 Production Bombs Last Week.</title><link>https://lakshminp.com/2026/01/claude-code-production-bombs/</link><pubDate>Mon, 12 Jan 2026 00:00:00 +0000</pubDate><author>Lakshmi Narasimhan</author><guid isPermaLink="true">https://lakshminp.com/2026/01/claude-code-production-bombs/</guid><category>essays</category><category>claude-code</category><description>What 15 years of production scars are still good for Last week I caught eight bugs across three projects. Not typos. Not missing semicolons. Real, ship-breaking, production-melting problems that would have sailed right past code review and into the waiting arms of actual users.
Claude Code wrote the code. Claude Code passed the tests. Claude Code would have happily deployed it.
And Claude Code had no idea anything was wrong.</description><content:encoded>&lt;![CDATA[<h2 id="what-15-years-of-production-scars-are-still-good-for"><strong>What 15 years of production scars are still good for</strong></h2><p>Last week I caught eight bugs across three projects. Not typos. Not missing semicolons. Real, ship-breaking, production-melting problems that would have sailed right past code review and into the waiting arms of actual users.</p><p>Claude Code wrote the code. Claude Code passed the tests. Claude Code would have happily deployed it.</p><p>And Claude Code had no idea anything was wrong.</p><p>I’ve written about <a href="https://lakshminp.substack.com/p/the-invisible-tax-you-pay-when-you" rel="external nofollow noopener" class="lnp-link">comprehension debt<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a> and argued that <a href="https://lakshminp.substack.com/p/clean-code-is-dead-long-live-clean" rel="external nofollow noopener" class="lnp-link">clean code is dead<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a>. But here’s the uncomfortable third act: even if you understand your specs perfectly and verify your outcomes ruthlessly, there’s a whole category of knowledge that AI simply doesn’t have.</p><p>Call it production intuition. Call it battle scars. 
Call it “I’ve been burned by this exact thing before.”</p><p>Whatever you call it, you can’t prompt your way into it.</p><h3 id="the-localhost-delusion"><strong>The Localhost Delusion</strong></h3><p>Here’s what AI is optimized for: making code that works on your machine, right now, with your current data, under ideal conditions.</p><p>Here’s what AI is catastrophically bad at: imagining your code running on three replicas behind a load balancer at 3am when the database is under pressure and someone’s running a batch job that nobody documented.</p><p>A Fortune 100 developer on Reddit <a href="https://reddit.com/r/ExperiencedDevs/comments/1mg2r6y/the_era_of_ai_slop_cleanup_has_begun/" rel="external nofollow noopener" class="lnp-link">put it bluntly<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a>: “There’s a lot of vibe coded slop that works well for MVP but will absolutely fall apart under stress and within production environments once they scale to more users. It doesn’t really reveal itself until later when it’s much more difficult to fix.”</p><p>Later. When it’s difficult. The horror.</p><p>Let me walk you through what “later” looked like for me this week.</p><h3 id="pattern-1-production-blindness"><strong>Pattern 1: Production Blindness</strong></h3><p><strong>The Concurrency Landmine</strong></p><p>I’m building a tool that routes MCP calls. Claude wrote it in Python — clean, well-structured, exactly what I asked for. Worked beautifully in testing. One request, one response, everybody’s happy.</p><p>Then I imagined what happens when three Claude Code sessions hit it simultaneously.</p><p>Oh.</p><p><em>Oh no.</em></p><p>Python’s threading model and the phrase “parallel requests” get along about as well as cats and bathtubs. Claude’s solution? More Python. Refactor this, optimize that. Very confident. Would’ve worked for lower concurrency.</p><p>But I knew this thing needed to handle dozens of parallel sessions. I prompted it to rewrite in Go. 
Claude nailed the port — goroutines, channels, the works. Problem solved in an afternoon.</p><p>The code was excellent. The language selection wasn’t. Claude optimizes brilliantly within the box you give it. It just won’t question whether you’re in the right box.</p><p><strong>The Replicas Problem</strong></p><p>Auth rate limiting. Claude implemented it in-memory with a clean sliding window algorithm. Textbook correct. Tests pass. Ship it.</p><p>One replica: perfect.<br>
Two replicas: every user gets double the rate limit.<br>
Three replicas: chaos.</p><p>This is distributed systems 101. The kind of thing you learn after you’ve been paged at 2am because someone figured out they could hit your API from three different IPs and get 3x the rate limit.</p><p>When I pointed this out, Claude immediately suggested Redis-backed rate limiting with proper distributed locking. Great solution. But it didn’t think about replicas until I did. Claude builds for the deployment model you describe. If you don’t describe it, localhost is the default.</p><p><strong>The Sync Task Landmine</strong></p><p>An API endpoint runs a long-running task. Claude implemented it synchronously — straightforward, easy to understand, does exactly what the tests verify.</p><p>Deploy it. First user clicks the button. 30-second timeout. 504 Gateway Timeout.</p><p>When I explained the problem, Claude refactored it to Celery with proper task queuing, retry logic, and status polling. Solid implementation. Took maybe 20 minutes.</p><p>But here’s the thing: I only caught it because I tested the actual user flow, not just the unit tests. Claude implemented what I asked for. I didn’t ask for “an endpoint that won’t timeout in production.” That’s on me. But it’s also the kind of thing I’ve learned to check after watching synchronous endpoints die approximately 47 times.</p><h3 id="pattern-2-ecosystem-amnesia"><strong>Pattern 2: Ecosystem Amnesia</strong></h3><p><strong>The Deprecation You Won’t Find on Stack Overflow</strong></p><p>Supabase deprecated their API keys. Not in a big announcement. Not in the docs you’d naturally read. In a <a href="https://github.com/orgs/supabase/discussions/29260" rel="external nofollow noopener" class="lnp-link">GitHub discussion<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a> with 43 comments and a lot of confused developers.</p><p>I found out because I read the discussion. I then had to fix three separate projects.</p><p>Claude doesn’t read GitHub discussions. 
Claude doesn’t know what the community is grumbling about. Claude’s knowledge is frozen in time, and the ecosystem keeps moving.</p><p><strong>The “Works But Wrong” Framework Choice</strong></p><p>Claude encrypted secrets at rest using Fernet keys. Technically correct. Cryptographically sound. Tests pass. Secure enough.</p><p>But I’m using Supabase. Supabase has a vault feature built specifically for this. When I mentioned it, Claude migrated everything over cleanly — proper RLS policies, the works.</p><p>The Fernet implementation wasn’t <em>wrong</em>. It just wasn’t the <em>right</em> choice for this stack. Supabase vault means one less thing I manage, one less key rotation I handle, one less piece of infrastructure to think about.</p><p>Claude doesn’t know the zeitgeist of “what’s the idiomatic way to do this in Supabase.” That knowledge lives in community forums, Discord servers, and the muscle memory of people who’ve shipped Supabase apps before.</p><p><strong>The Build System That Wasn’t</strong></p><p>I made UI mockups with Tailwind CSS (using Claude, to be fair). Told Claude to use them. Claude happily served Tailwind from a CDN.</p><p>In development? Fine.<br>
In production? Every page load fetches the entire Tailwind library. Uncompiled. Unoptimized. Approximately 300KB of CSS you don’t need.</p><p>Claude knows what Tailwind is. Claude doesn’t know that real projects compile it. That’s the kind of tribal knowledge you pick up by shipping things and watching your Lighthouse scores crater.</p><h3 id="pattern-3-verification-vacuum"><strong>Pattern 3: Verification Vacuum</strong></h3><p><strong>The Cache That Wasn’t</strong></p><p>Database queries were getting slow. I asked Claude to add a caching layer. Claude wrote a beautiful Redis caching module — proper TTLs, cache invalidation on writes, the works. Tests for the module passed. I shipped it, watched the deployment go green, felt the warm glow of productivity.</p><p>The cache wasn’t being hit.</p><p>Claude built an excellent caching module. Claude did not check whether the actual query functions were <em>calling</em> the cache. The module worked perfectly. Nothing was using it. Every request still hit the database directly.</p><p>I discovered this by checking Redis after a few hundred requests. Empty. Revolutionary debugging technique, I know.</p><p>Could tests have caught this? Sure. Integration tests that verify “when I make this API call, Redis gets a cache entry.” But I’d need to know to write that test. Claude wrote unit tests for the caching module. I should have asked for end-to-end verification. The knowledge that “modules can exist without being properly wired up” is experience. Pattern recognition. The scar tissue from shipping features that weren’t actually features before.</p><h3 id="pattern-4-architectural-judgment"><strong>Pattern 4: Architectural Judgment</strong></h3><p><strong>The Multitenancy Time Bomb</strong></p><p>A project using Qdrant vector database. Users store embeddings. Multiple users. 
Shared infrastructure.</p><p>The question that should wake you up at night: Can User A see User B’s data?</p><p>Claude’s implementation used collection-level isolation with proper tenant IDs in the filter queries. Reasonable approach. Worked in my tests.</p><p>But a thorough multitenancy review? The kind where you trace every query path, every edge case, every possible way data could leak between tenants? Where you think about what happens if someone forgets to pass the tenant filter? Where you consider whether the default behavior is secure-by-default or insecure-by-default?</p><p>That review took me two hours. Claude can help <em>execute</em> the fixes I identify, but it won’t spontaneously think “hey, multitenancy is a critical architectural decision that deserves paranoid scrutiny.”</p><p>Get multitenancy wrong and you’re on the front page of Hacker News, and not in the good way. Claude builds features. You build threat models.</p><h3 id="what-the-discourse-gets-wrong"><strong>What The Discourse Gets Wrong</strong></h3><p>Here’s what bothers me about the AI discourse: both sides are missing the point.</p><p>The AI skeptics say “AI code is garbage, don’t use it.” That’s wrong. Claude Code is incredibly useful. I ship faster. I handle complexity I couldn’t handle alone. The leverage is real.</p><p>The AI evangelists say “AI will replace developers, just describe what you want.” That’s also wrong. Describing what you want is the <em>easy</em> part. Knowing what you <em>should</em> want — that’s where the experience lives.</p><p>Someone on r/ExperiencedDevs <a href="https://reddit.com/r/ExperiencedDevs/comments/1on84au/ai_wont_make_coding_obsolete_coding_isnt_the_hard/" rel="external nofollow noopener" class="lnp-link">nailed it<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a>: “Coding is the boring/easy part. Typing is just transcribing decisions into a machine. 
The real work is upstream: understanding what’s needed, resolving ambiguity, negotiating tradeoffs, and designing coherent systems.”</p><p>The developers who thrive aren’t the ones who write the most code. They’re the ones who catch the multitenancy bug before it ships. Who know that in-memory rate limiting won’t scale. Who’ve been burned by synchronous endpoints and CDN-served CSS and proxies that aren’t wired up.</p><p>Experience isn’t knowing the syntax. Experience is knowing the failure modes.</p><h3 id="whats-actually-worth-learning"><strong>What’s Actually Worth Learning</strong></h3><p>So what’s worth learning when AI can write the code?</p><p><strong>Production Thinking</strong></p><p>This is the big one. Every example above comes back to it: Claude builds for localhost, you build for production.</p><p>Concretely, this means developing instincts for questions like:</p><ul><li><p>“What happens when there are multiple replicas?” (Rate limiting, session state, caching, file storage — anything in-memory becomes a distributed systems problem)</p></li><li><p>“What happens under load?” (Synchronous operations become timeouts. N+1 queries become database meltdowns. That “fast enough” endpoint becomes a bottleneck)</p></li><li><p>“What happens when dependencies fail?” (Database is slow. External API is down. Redis is unreachable. Do you degrade gracefully or explode?)</p></li><li><p>“What happens at 3am when nobody’s watching?” (Background jobs. Retry logic. Dead letter queues. The things that fail silently)</p></li></ul><p>How do you learn this? You can’t shortcut it. You deploy things. You watch them break. You read post-mortems. You get paged. You develop a paranoid imagination for failure modes.</p><p>But you can accelerate it: before you ship, spend 10 minutes imagining the deployment. Draw the boxes. How many instances? What’s in front of them? Where’s the state? What’s shared? 
This exercise catches 80% of the issues I described above.</p><p><strong>Ecosystem Intuition</strong></p><p>This is knowing the <em>zeitgeist</em> of your stack — not just what’s possible, but what’s idiomatic. What the community actually uses. What’s deprecated but still in the docs. What’s new but not proven.</p><p>Concretely:</p><ul><li><p>Read the GitHub discussions, not just the docs. That’s where deprecations get announced, migration paths get debated, and footguns get documented.</p></li><li><p>Follow the maintainers on Twitter/X or Bluesky. They’ll tell you about breaking changes before the docs catch up.</p></li><li><p>Lurk in Discord servers. The “how should I do X” discussions reveal what’s considered best practice.</p></li><li><p>Actually ship with the stack. The difference between “I’ve read about Supabase” and “I’ve shipped three apps with Supabase” is enormous.</p></li></ul><p>The goal: when Claude suggests an approach, you can immediately sense whether it’s the “right” way or just “a” way. Fernet encryption vs Supabase vault. CDN Tailwind vs compiled Tailwind. Redis rate limiting vs in-memory rate limiting. These aren’t in the documentation. They’re in the collective experience.</p><p><strong>Architectural Paranoia</strong></p><p>Some decisions are easy to change later. Some aren’t. Knowing the difference is half of senior engineering.</p><p>The ones that are hard to reverse:</p><ul><li><p><strong>Multitenancy model</strong>: Shared database with tenant IDs? Separate schemas? Separate databases? Choose wrong and you’re rewriting everything.</p></li><li><p><strong>Auth architecture</strong>: Where do tokens live? How do sessions work? What’s the refresh flow? Changing this later breaks every client.</p></li><li><p><strong>Data model fundamentals</strong>: Relational vs document. Normalized vs denormalized. Adding a column is easy. 
Restructuring your entire data model is not.</p></li><li><p><strong>API contract design</strong>: Once clients depend on your response shape, changing it is a versioning nightmare.</p></li></ul><p>For each of these, Claude will happily implement whatever you ask. It won’t stop and say “are you sure about this? This is hard to change later.” That paranoia is your job.</p><p>My rule: for any architectural decision I can’t easily reverse, I spend at least an hour thinking about alternatives before I let Claude write the first line.</p><p><strong>Verification Instincts</strong></p><p>What should you test for? What’s easy to get wrong? What <em>looks</em> done but isn’t actually wired up?</p><p>This is pattern recognition from past failures. The cache that wasn’t being hit. The feature flag that was never checked. The error handler that swallowed exceptions silently.</p><p>Concretely:</p><ul><li><p><strong>Test the user flow, not just the units</strong>. My caching module passed all its unit tests. The integration was broken. If I’d tested “make this API call and verify Redis has an entry,” I’d have caught it immediately.</p></li><li><p><strong>Verify your assumptions</strong>. Claude wrote the code, but did the code actually get <em>used</em>? Add a log line. Check the network tab. Confirm reality matches intention.</p></li><li><p><strong>Break it on purpose</strong>. What happens when you pass invalid input? What happens when the database is slow? What happens when the auth token is expired? Claude tests the happy path. You test the sad path.</p></li></ul><p>The underlying skill: developing a checklist of “things that can look done but aren’t” for your specific domain. Every time you get burned, add it to the list. 
Eventually, you check these instinctively.</p><h3 id="the-uncomfortable-conclusion"><strong>The Uncomfortable Conclusion</strong></h3><p>A freelance developer with 8 years of experience <a href="https://reddit.com/r/ExperiencedDevs/comments/1mg2r6y/the_era_of_ai_slop_cleanup_has_begun/" rel="external nofollow noopener" class="lnp-link">described a pattern<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a> he’s seeing across multiple clients: companies paying good money for internal software that barely works. Same symptoms every time. AI-generated comments. Algorithms that make no sense. Inconsistent patterns.</p><p>“Yes it mostly works,” he wrote, “but does so terribly to the point where it needs to be fixed.”</p><p>The era of AI slop cleanup has begun. And the people doing the cleanup are the ones who know what production actually looks like.</p><p>Claude builds for localhost. You build for production.</p><p>That gap is where your fifteen years live. And it’s not getting smaller.</p><hr><p><em>This is the third essay in an accidental trilogy. First: <a href="https://lakshminp.substack.com/p/the-invisible-tax-you-pay-when-you" rel="external nofollow noopener" class="lnp-link">comprehension debt is real<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a>. Second: <a href="https://lakshminp.substack.com/p/clean-code-is-dead-long-live-clean" rel="external nofollow noopener" class="lnp-link">clean code is dead<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a>. This one: the skills that matter more now, not less.</em></p>
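<p>Appendix: the bug from “The Replicas Problem” is easy to demonstrate. Below is a minimal sketch — not the project’s actual code — of an in-memory sliding-window limiter. It is correct on one process; run N replicas behind a load balancer and each process keeps its own counters, so a client quietly gets N times the limit:</p>

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """In-memory sliding-window rate limiter.

    Correct on a single process. On N replicas, each process holds its
    own `hits` dict, so the effective limit becomes N * limit -- the
    failure mode described in the essay. A shared store gives every
    replica the same window.
    """

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable for testing
        self.hits = {}      # key -> deque of request timestamps

    def allow(self, key):
        now = self.clock()
        q = self.hits.setdefault(key, deque())
        # Drop timestamps that have aged out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True
```

<p>The fix the essay lands on — moving the window into Redis — keeps one shared counter that every replica consults, typically via <code>INCR</code> plus <code>EXPIRE</code>, or a small Lua script so the check-and-increment is atomic.</p>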
]]></content:encoded></item><item><title>Clean Code Is Dead. Long Live Clean Specs.</title><link>https://lakshminp.com/2026/01/clean-code-dead-clean-specs/</link><pubDate>Fri, 09 Jan 2026 00:00:00 +0000</pubDate><author>Lakshmi Narasimhan</author><guid isPermaLink="true">https://lakshminp.com/2026/01/clean-code-dead-clean-specs/</guid><category>essays</category><category>ai-coding</category><category>craft</category><description>Steve Yegge shipped 225,000 lines of Go code he’s never read.
Let that sink in.
Beads ↗ — his coding agent memory system — is used by tens of thousands of developers daily. It’s 100% vibe coded. Yegge has never looked at a single line. Same with his new project, Gastown ↗. Three weeks old, 100% vibe coded, never seen the code, never plans to.</description><content:encoded>&lt;![CDATA[<p>Steve Yegge shipped 225,000 lines of Go code he’s never read.</p><p>Let that sink in.</p><p><a href="https://steve-yegge.medium.com/introducing-beads-a-coding-agent-memory-system-637d7d92514a" rel="external nofollow noopener" class="lnp-link">Beads<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a> — his coding agent memory system — is used by tens of thousands of developers daily. It’s 100% vibe coded. Yegge has never looked at a single line. Same with his new project, <a href="https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16dd04" rel="external nofollow noopener" class="lnp-link">Gastown<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a>. Three weeks old, 100% vibe coded, never seen the code, never plans to.</p><p>His reaction to anyone uncomfortable with this? “Get out now.”</p><h2 id="the-heresy"><strong>The Heresy</strong></h2><p>For two decades, we’ve been taught that code is literature. Uncle Bob’s Clean Code. Martin Fowler’s Refactoring. Elegant variable names. Single responsibility. Code should read like prose.</p><p>We optimized for human comprehension because humans had to maintain it.</p><p>But what if that’s no longer true?</p><p><a href="https://www.simonhoiberg.com/" rel="external nofollow noopener" class="lnp-link">Simon Hoiberg<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a> put it bluntly: “Half my code is now written by AI, and the other half is read by AI to fix bugs. Optimizing for human readability is becoming pointless.”</p><p>The audience for your code has changed. 
And it’s not you anymore.</p><h2 id="the-other-day-i-wrote-about-comprehension-debt"><strong>The Other Day I Wrote About Comprehension Debt</strong></h2><p>I argued that vibe coding creates legacy code from day one. That velocity without comprehension isn’t velocity — it’s procrastination with extra steps.</p><p>I still believe that. Mostly.</p><p>But here’s the uncomfortable follow-up question: What if comprehension debt only matters when <em>you</em> have to pay it?</p><p>If AI writes the code and AI debugs the code and AI refactors the code&hellip; who exactly needs to understand it?</p><h2 id="the-new-contract"><strong>The New Contract</strong></h2><p>The old contract: Write clean code so humans can read it.</p><p>The new contract: Write code that produces correct outcomes, verified by tests that humans can understand.</p><p>This is a crucial shift. The code becomes disposable infrastructure. The tests become the spec. The behavior becomes the product.</p><p>Steve Yegge doesn’t need to understand 225,000 lines of Go. He needs to understand what Beads should do. The tests verify that it does it. The code is just&hellip; implementation detail. An artifact. A byproduct.</p><h2 id="clean-specs--clean-code"><strong>Clean Specs &gt; Clean Code</strong></h2><p>Here’s the heretical thought experiment:</p><p>What if “clean code” principles should now apply to your specifications instead of your source code?</p><p>Think about it:</p><ul><li><p><strong>Readable intent</strong>: Your specs should be crystal clear. “Users can checkout with valid payment. Invalid cards show an error. Empty carts can’t checkout.”</p></li><li><p><strong>Single responsibility</strong>: Each spec describes one behavior. Not implementation — behavior.</p></li><li><p><strong>Self-documenting</strong>: Specs are the documentation that gets executed. 
They describe what the system should do, and you verify it actually does.</p></li><li><p><strong>Easy to modify</strong>: When requirements change, you update the spec first. AI updates everything else.</p></li></ul><p>The source code can be a tangled mess of AI-generated spaghetti. Who cares? If you can clearly specify what you want and verify you got it, the implementation is just a detail.</p><h2 id="the-yegge-paradox"><strong>The Yegge Paradox</strong></h2><p>Here’s what’s wild. In the Vibe Coding book Yegge co-authored with Gene Kim, “Steve” is described as reviewing 10,000 lines of code a day, throwing away 10 lines for every line kept.</p><p>Wait. He reviews code? I thought he never looks at it?</p><p>The answer, I think, is this: He reviews <em>outcomes</em>. He reviews test results. He reviews whether the thing works. He’s not reading code for elegance or comprehension. He’s running it, breaking it, verifying it.</p><p>The code review has become a behavior review.</p><h2 id="what-this-means-for-you"><strong>What This Means for You</strong></h2><p>I’m not saying burn your Clean Code book. (Okay, maybe I am. That thing is 400 pages of what could’ve been a blog post.)</p><p>But consider this workflow:</p><ol><li><p><strong>Specify the behavior</strong> — in plain language. “Users can checkout with valid payment. Invalid cards show an error. Empty carts can’t checkout.”</p></li><li><p><strong>Let AI write the tests</strong> — it turns your specs into executable verification</p></li><li><p><strong>Let AI write the implementation</strong> — who cares if it’s ugly</p></li><li><p><strong>Verify the outcomes</strong> — does it do what you specified? Try to break it. Edge cases covered?</p></li><li><p><strong>Ship it</strong> — the code is a means to an end</p></li></ol><p>If something breaks, you don’t debug the code. You describe the broken behavior. AI writes a failing test. AI fixes the implementation. You verify the outcome. 
You never had to understand the implementation. You just had to understand what you wanted.</p><h2 id="the-catch"><strong>The Catch</strong></h2><p>There’s always a catch.</p><p>This only works if your specifications are actually good. If your specs are vague, incomplete, missing edge cases — you’re in the worst of both worlds. Incomprehensible code that doesn’t even do what you need.</p><p>That’s not vibe coding. That’s vibes-all-the-way-down coding. And that’s how you get 18 out of 20 CTOs reporting production disasters.</p><p>The discipline has to go somewhere. If you’re not putting it into clean code, you damn well better be putting it into clear specifications and ruthless outcome verification.</p><h2 id="the-real-skill-shift"><strong>The Real Skill Shift</strong></h2><p>Old skill: Writing elegant, maintainable code that other humans can understand.</p><p>New skill: Specifying behavior precisely and verifying outcomes ruthlessly.</p><p>The developers who thrive won’t be the ones who write the cleanest code. They’ll be the ones who can articulate exactly what they want. Who can break their own systems. Who can look at a feature and immediately think of ten ways it could fail.</p><p>Code literacy is becoming specification literacy. The new “clean code” is clear intent.</p><h2 id="the-uncomfortable-conclusion"><strong>The Uncomfortable Conclusion</strong></h2><p>We spent twenty years optimizing for human readers who are increasingly being replaced by AI readers.</p><p>Maybe Steve Yegge is right. Maybe the code doesn’t matter. Maybe it never really mattered — we just didn’t have anything better.</p><p>What matters is: Does it work? Can you prove it? Can you verify it still works after changes?</p><p>Clean code was a proxy for those questions. A good heuristic when humans had to debug.</p><p>Clean specs answer those questions directly.</p><p>The code is dead. 
Long live the specs.</p><hr><p><em>This is a follow-up to my <a href="https://lakshminp.substack.com/p/the-invisible-tax-you-pay-when-you" rel="external nofollow noopener" class="lnp-link">recent essay<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a> on comprehension debt. The tension is real: you need to understand the problem deeply enough to specify it clearly, but maybe not the implementation at all. Where that line is&hellip; I’m still figuring out.</em></p>
]]></content:encoded></item><item><title>I Found a Cryptominer in My Client's Production Cluster. Claude Code Found the Attacker.</title><link>https://lakshminp.com/2026/01/cryptominer-production-cluster/</link><pubDate>Sat, 03 Jan 2026 00:00:00 +0000</pubDate><author>Lakshmi Narasimhan</author><guid isPermaLink="true">https://lakshminp.com/2026/01/cryptominer-production-cluster/</guid><category>essays</category><category>kubernetes</category><category>claude-code</category><description>New Year’s Day. Coffee in hand. Ready to ease back into work.
Then I saw the logs.
2026-01-02T06:34:27 GET xmrig-6.24.0-linux-static-x64.tar.gz 2026-01-02T06:34:30 GET http://37.32.6.33:7979/m 2026-01-02T06:34:30 spawn /opt/systemf/m ENOENT xmrig. In production. Someone was mining Monero on my client’s Kubernetes cluster.
The horror.
The Investigation I had a few hundred megabytes of JSON logs and approximately zero patience for manually correlating timestamps. So I did what any reasonable person would do: I asked Claude Code to analyze the logs and figure out what triggered the miner download.</description><content:encoded>&lt;![CDATA[<p>New Year’s Day. Coffee in hand. Ready to ease back into work.</p><p>Then I saw the logs.</p><pre><code>2026-01-02T06:34:27 GET xmrig-6.24.0-linux-static-x64.tar.gz
2026-01-02T06:34:30 GET http://37.32.6.33:7979/m
2026-01-02T06:34:30 spawn /opt/systemf/m ENOENT</code></pre><p>xmrig. In production. Someone was mining Monero on my client’s Kubernetes cluster.</p><p>The horror.</p><h2 id="the-investigation"><strong>The Investigation</strong></h2><p>I had a few hundred megabytes of JSON logs and approximately zero patience for manually correlating timestamps. So I did what any reasonable person would do: I asked Claude Code to analyze the logs and figure out what triggered the miner download.</p><p>Within seconds, it built a timeline:</p><table><thead><tr><th>Time</th><th>Event</th></tr></thead><tbody><tr><td>06:34:26</td><td>Normal request to /onboarding</td></tr><tr><td>06:34:27</td><td>xmrig downloaded from GitHub</td></tr><tr><td>06:34:30</td><td>Secondary payload from sketchy IP</td></tr><tr><td>06:34:57</td><td>Container OOMKilled</td></tr></tbody></table><p>The cryptominer was so resource-hungry it consumed 2GB of memory in 30 seconds and crashed the container. Ironic. The attacker’s greed saved us from a prolonged compromise.</p><p>But how did they get in?</p><h2 id="chasing-red-herrings"><strong>Chasing Red Herrings</strong></h2><p>Claude Code’s first suspect: a low-version npm package called <code>device-unique-keygen</code>. Added by a developer whose email matched the package maintainer. Classic supply chain attack pattern.</p><p>I got excited. Maybe too excited.</p><p>Claude Code fetched the GitHub repo, analyzed the source code, checked for postinstall scripts, looked for obfuscated code, searched for eval() calls.</p><p>Nothing. The package was clean. Just a browser fingerprinting library. Boring. Legitimate.</p><p>We moved on.</p><p>No malicious init containers. No sidecars. No .ashrc shenanigans. The Dockerfile was clean. The pod spec was clean.</p><p>Everything was clean except someone was definitely mining crypto on our infrastructure.</p><h2 id="the-actual-answer"><strong>The Actual Answer</strong></h2><p>Claude Code ran <code>npm audit</code> on the codebase.</p><pre><code>critical │ Next.js is vulnerable to RCE in React flight protocol
Package │ next
Patched │ &gt;=15.3.6
Your ver │ 15.3.4
CVSS │ 10.0</code></pre><figure><a href="https://substackcdn.com/image/fetch/$s_!FrM8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c5bcacd-085e-465e-85f7-03955737439e_1247x765.png" class="image-link image2 is-viewable-img" target="_blank" data-component-name="Image2ToDOM"/><img src="https://substack-post-media.s3.amazonaws.com/public/images/9c5bcacd-085e-465e-85f7-03955737439e_1247x765.png" class="sizing-normal" loading="lazy" width="1247" height="765"/></figure><p>CVSS 10. The maximum possible score. The “your house is actively on fire” of security ratings.</p><p>The app was running Next.js 15.3.4. A publicly disclosed RCE vulnerability. No authentication required. An attacker could run arbitrary commands on the server by sending a crafted request.</p><p>That’s exactly what happened. 
They sent a request, ran wget twice, downloaded the miner, and started extracting crypto value from compute cycles they weren’t paying for.</p><p>The container’s memory limit stopped them. A $20/month Kubernetes resource limit prevented what could have been ongoing theft.</p><h2 id="what-claude-code-actually-did"><strong>What Claude Code Actually Did</strong></h2><p>I want to be clear about what happened here. I didn’t single-handedly unravel a sophisticated attack. I didn’t manually correlate log timestamps or reverse-engineer obfuscated npm packages.</p><p>I said “check these logs” and Claude Code:</p><ul><li><p>Built a timeline from JSON log entries</p></li><li><p>Identified the malware artifacts and C2 server</p></li><li><p>Traced git blame to find who added suspicious packages</p></li><li><p>Fetched and analyzed source code from GitHub</p></li><li><p>Ruled out attack vectors one by one</p></li><li><p>Found the actual vulnerability via npm audit</p></li><li><p>Correlated the OOMKill timing with the attack</p></li><li><p>Suggested remediation and forensic preservation steps</p></li></ul><p>The entire investigation took under an hour. Not because I’m fast. Because Claude Code is.</p><h2 id="the-fix"><strong>The Fix</strong></h2><pre><code>pnpm update next@^15.3.6</code></pre><p>One command. 
That’s the remediation for a CVSS 10.0 vulnerability.</p><p>We also orphaned the compromised pods for forensic analysis, rotated secrets, and added proper security contexts to prevent future wget adventures.</p><figure><a href="https://substackcdn.com/image/fetch/$s_!bCvw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F654c9829-43d6-46a6-8405-abfd7b305b80_1344x772.png" class="image-link image2 is-viewable-img" target="_blank" data-component-name="Image2ToDOM"/><img src="https://substack-post-media.s3.amazonaws.com/public/images/654c9829-43d6-46a6-8405-abfd7b305b80_1344x772.png" class="sizing-normal" loading="lazy" width="1344" height="772"/></figure><h2 id="the-lesson"><strong>The Lesson</strong></h2><p>Two things saved us:</p><ol><li><p>Centralized logging (couldn’t investigate without the logs)</p></li><li><p>Memory limits (the attacker’s miner killed 
itself)</p></li></ol><p>One thing would have prevented this entirely: running <code>npm audit</code> before deployment.</p><p>The attacker exploited a vulnerability that was publicly disclosed and patched. We just hadn’t updated yet.</p><p>Godspeed with your own dependency updates.</p><hr><p>My Medium friends can read this <a href="https://medium.com/@lakshminp/i-found-a-cryptominer-in-my-clients-production-cluster-claude-code-found-the-attacker-ae6148ec0514" rel="external nofollow noopener" class="lnp-link">over there<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a> as well.</p>
]]></content:encoded></item><item><title>I Spent Weeks Confused About Claude Code's 5 Concepts. Here's the Mental Model That Finally Clicked.</title><link>https://lakshminp.com/2025/12/claude-code-mental-model/</link><pubDate>Thu, 11 Dec 2025 00:00:00 +0000</pubDate><author>Lakshmi Narasimhan</author><guid isPermaLink="true">https://lakshminp.com/2025/12/claude-code-mental-model/</guid><category>essays</category><category>claude-code</category><description>Slash commands, skills, agents, MCP servers, plugins. Five concepts. Five different jobs. One very confused developer (me) trying to figure out which one to use when.
You&amp;rsquo;re using Claude Code. You&amp;rsquo;ve seen these terms thrown around like confetti at a developer conference. Slash commands! Skills! Agents! MCP servers! Plugins! Each one sounds important. Each one sounds slightly different from the others. Each one makes you wonder if you&amp;rsquo;re using Claude Code wrong.</description><content:encoded>&lt;![CDATA[<p>Slash commands, skills, agents, MCP servers, plugins. Five concepts. Five different jobs. One very confused developer (me) trying to figure out which one to use when.</p><p>You&rsquo;re using Claude Code. You&rsquo;ve seen these terms thrown around like confetti at a developer conference. Slash commands! Skills! Agents! MCP servers! Plugins! Each one sounds important. Each one sounds slightly different from the others. Each one makes you wonder if you&rsquo;re using Claude Code wrong.</p><p>Spoiler: you probably are. But so is everyone else, so don&rsquo;t feel too special about it.</p><p>Here&rsquo;s the mental model that finally made sense to me after weeks of confusion and several existential crises about whether I understood my own tools.</p><h2 id="the-one-sentence-version">The one-sentence version</h2><p><strong>Slash commands</strong> = shortcuts you trigger manually (like a civilized person)</p><p><strong>Skills</strong> = instructions Claude<em>might</em> trigger automatically (emphasis on &ldquo;might&rdquo;)</p><p><strong>Agents</strong> = autonomous workers with their own context (little worker bees you send off to do your bidding)</p><p><strong>MCP servers</strong> = external capabilities like browsers, databases, APIs (the things that let Claude actually<em>do</em> stuff in the real world)</p><p><strong>Plugins</strong> = packaging that bundles any combination of the above (a zip file for your AI workflow, basically)</p><p>That&rsquo;s it. Five concepts. Five different jobs. The confusion happens because they overlap like a Venn diagram designed by someone who hates clarity. 
A slash command can spawn an agent. An agent can use MCP servers. A plugin can contain all of the above. It&rsquo;s turtles all the way down.</p><p>(I&rsquo;m not covering <strong>hooks</strong> here — automations that fire on events like file saves. That&rsquo;s a whole other therapy session.)</p><h2 id="the-key-distinction-most-people-miss">The key distinction most people miss</h2><p>Here&rsquo;s what nobody tells you upfront: <strong>who decides when something runs?</strong></p><ul><li>Slash commands: <strong>You</strong> trigger them. Like pressing a button. Revolutionary concept.</li><li>Skills: <strong>Claude</strong> triggers them. In theory. When it feels like it. Maybe.</li><li>Agents: <strong>You</strong> spawn them, then they run autonomously until they&rsquo;re done or your tokens are.</li><li>MCP servers: <strong>Claude</strong> calls them when it needs to reach outside your codebase.</li><li>Plugins: <strong>You</strong> install them. They&rsquo;re just containers.</li></ul><p>This matters more than all the technical mumbo-jumbo. Want control? Slash commands. Want Claude to figure it out? Skills. Want to let something loose and hope for the best? Agents. Want Claude to actually interact with the real world? MCP servers.</p><h2 id="the-decision-tree-for-people-who-dont-want-to-think-about-this-anymore">The decision tree (for people who don&rsquo;t want to think about this anymore)</h2><p>When you&rsquo;re about to ask Claude to do something, run through this:</p><p><strong>Is it a repeatable task with fixed steps?</strong> → Slash command. Done. Move on with your life.</p><p><strong>Does it need to access external systems?</strong> → MCP server. Claude can&rsquo;t browse the web or query databases with pure thought. Yet.</p><p><strong>Does it require exploration and figuring things out?</strong> → Agent. Let it wander. It&rsquo;s smarter than you think. 
Sometimes.</p><p><strong>Is it domain-specific instructions that don&rsquo;t always apply?</strong> → Skill. Good luck getting Claude to actually use it without being asked.</p><p><strong>Do you want to share your setup with others?</strong> → Package it as a plugin. Make it someone else&rsquo;s problem.</p><h2 id="the-overlap-problem-or-why-everyone-builds-three-things-for-the-same-task">The overlap problem (or: why everyone builds three things for the same task)</h2><p>Here&rsquo;s what happens in the wild: developers build a slash command, a skill, AND an agent for the same task. I&rsquo;ve done it. You&rsquo;ve probably done it. We&rsquo;ve all sinned.</p><p>Pick one primary approach:</p><ul><li>Need <strong>control over when it runs</strong> → slash command</li><li>Need <strong>Claude to decide when it&rsquo;s relevant</strong> → skill (and a prayer)</li><li>Need <strong>autonomous multi-step execution</strong> → agent</li><li>Need <strong>external system access</strong> → MCP server</li><li>Need <strong>to share your setup</strong> → plugin</li></ul><p>Use the others to support, not duplicate. Your future self will thank you when you&rsquo;re not debugging three different implementations of the same thing at 2 AM.</p><h2 id="how-they-layer-the-actually-useful-part">How they layer (the actually useful part)</h2><p>Think of it as a stack:</p><ul><li><strong>Skills</strong> = instructions (how to do things)</li><li><strong>Slash commands</strong> = triggers (entry points you control)</li><li><strong>Agents</strong> = workers (autonomous task executors)</li><li><strong>MCP servers</strong> = capabilities (external system access)</li><li><strong>Plugins</strong> = packaging (bundles of all the above)</li></ul><p>A slash command can spawn an agent. An agent can use MCP servers. A skill can teach Claude how to use an MCP server efficiently. A plugin can package all of this into something you can share on GitHub and pretend makes you a thought leader.</p><p>They compose. 
They don&rsquo;t compete. Unless you make them compete, in which case, godspeed.</p><hr><p>This is post 1 of 4. Next up: slash commands vs skills — and why skills don&rsquo;t work the way the documentation promises they will.</p><p>I help technical founders develop, deploy, and market their SaaS using Claude Code. This is the kind of workflow clarity I wish someone had given me three months ago.</p>
]]></content:encoded></item><item><title>I spent years on Kubernetes. Now I'm betting against it.</title><link>https://lakshminp.com/2025/12/kubernetes-indie-dev-alternative/</link><pubDate>Thu, 04 Dec 2025 00:00:00 +0000</pubDate><author>Lakshmi Narasimhan</author><guid isPermaLink="true">https://lakshminp.com/2025/12/kubernetes-indie-dev-alternative/</guid><category>essays</category><category>kubernetes</category><description>I’ve spent years in the Kubernetes ecosystem. I wrote about K3s. I ran production clusters. I know my way around kubectl, Helm charts, and the CNCF landscape.
And I’m building a deployment tool that doesn’t use any of it.
Here’s why.
Kubernetes solves problems you don’t have K8s is incredible engineering. It solves real problems:
Multi-team deployments without stepping on each other
Automatic failover across dozens of nodes
Fine-grained resource allocation at massive scale</description><content:encoded>&lt;![CDATA[<p>I’ve spent years in the Kubernetes ecosystem. I wrote about K3s. I ran production clusters. I know my way around kubectl, Helm charts, and the CNCF landscape.</p><p>And I’m building a deployment tool that doesn’t use any of it.</p><p>Here’s why.</p><h2 id="kubernetes-solves-problems-you-dont-have"><strong>Kubernetes solves problems you don’t have</strong></h2><p>K8s is incredible engineering. It solves real problems:</p><ul><li><p>Multi-team deployments without stepping on each other</p></li><li><p>Automatic failover across dozens of nodes</p></li><li><p>Fine-grained resource allocation at massive scale</p></li><li><p>Rolling updates for services with thousands of instances</p></li></ul><p>If you’re Spotify, you need this. If you’re running a 50-person engineering org, you need this.</p><p>If you’re a solo dev with one FastAPI app and a Celery worker? You don’t.</p><p>As one dev put it: “Do you want to build a product, or do you want to build an infrastructure team? Kubernetes makes sense for the latter, but it’s often overkill for the former.”</p><p>You need:</p><ul><li><p>git push → app is live</p></li><li><p>Rollback when you break something</p></li><li><p>Logs you can actually read</p></li><li><p>Alerts when the site goes down</p></li></ul><p>That’s it. Everything else is ceremony.</p><h2 id="the-hidden-cost-isnt-the-cluster"><strong>The hidden cost isn’t the cluster</strong></h2><p>“But K3s is lightweight! You can run it on a $6 VPS!”</p><p>True. I’ve done it. 
Here’s what they don’t tell you:</p><p>A solo dev <a href="https://www.reddit.com/r/kubernetes/comments/1p2k9xd/solo_dev_tired_of_k8s_churn_what_are_my_options/" rel="external nofollow noopener" class="lnp-link">recently posted<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a> on r/kubernetes with a title that said it all: “Solo dev tired of K8s churn&hellip; What are my options?”</p><p>His pain point wasn’t learning Kubernetes. It was the maintenance:</p><blockquote><p>“I don’t mind learning the topics and writing the config, I do mind having to deal with a lot of work out of nowhere just because the underlying tools are beyond my control and requiring breaking updates.”</p></blockquote><p>He’d been burned by Bitnami charts pulling the rug, NGINX ingress breaking changes. Things that worked stopped working — not because he changed anything, but because the ecosystem did.</p><blockquote><p>“It all felt very straightforward, and it worked so well for a bit, but it starts to crumble even when I haven’t changed anything on my side.”</p></blockquote><p>This is the hidden cost. Not the setup — the churn.</p><p><strong>The YAML tax</strong>: Every change requires editing manifests. Add an env var? YAML. Change a port? YAML. Want a cron job? That’s a whole new CronJob resource. One team had a production outage caused by an improperly indented YAML line. A single space broke prod.</p><p><strong>The debugging tax</strong>: Something’s wrong. Is it the pod? The service? The ingress? The network policy? The PVC? Hope you remember how to read <code>kubectl describe</code>.</p><p><strong>The upgrade tax</strong>: K3s made this easier, but you’re still running a distributed system. A 2024 report found over 77% of Kubernetes practitioners still have issues running their clusters — up from 66% in 2022. 
It’s getting harder, not easier.</p><p><strong>The cognitive tax</strong>: Part of your brain is always allocated to “how does Kubernetes work” instead of “how do I ship features.”</p><p>As one commenter put it: “Choose your churn.” There’s always something.</p><p>The Reddit OP’s conclusion? He gave up on K8s entirely. Settled on plain NixOS on a single Hetzner VPS. Accepted that 99.9% uptime from one server is good enough. Skipped the redundancy he thought he needed.</p><blockquote><p>“I am trying to write my software, I just want a reliable thing to host it with the freedom and reliability that one would expect from a system that stays out of your way.”</p></blockquote><p>That’s the real ask. A system that stays out of your way.</p><p>For teams, the Kubernetes tax is worth paying. You split it across people, you build expertise, you amortize the cost.</p><p>Solo? You pay it all yourself, every time.</p><h2 id="what-actually-works-for-solo-devs"><strong>What actually works for solo devs</strong></h2><p>So if not Kubernetes, what?</p><p>The same Reddit OP nailed the PaaS problem too:</p><blockquote><p>“These ‘managed-docker’ services charge per container/pod and force the user to over-provision. Your pod doesn’t run on 250mb RAM? Ok pay for 1GB even though you only need 500mb.”</p></blockquote><p>I’ve tried everything:</p><ul><li><p>Heroku (great until the bill hits)</p></li><li><p>Railway/Render (same story, nicer UX — $50-100/mo for what costs $5 on a VPS)</p></li><li><p>Dokku (solid, but showing its age)</p></li><li><p>Coolify (powerful, but now you’re babysitting another server)</p></li><li><p>K3s (overkill for most solo projects)</p></li><li><p>Raw Docker + nginx (works but tedious)</p></li></ul><p>The best setup I’ve found: <strong>Kamal</strong>.</p><p>It’s from 37signals. They run Basecamp and HEY on it. It’s just Docker + SSH. No cluster, no orchestrator, no YAML manifests.</p><pre><code>kamal deploy</code></pre><p>That’s it.
It SSHs into your server, pulls your container, does a zero-downtime swap. Rollback is one command. Logs are one command.</p><p>It’s boring. It works.</p><h2 id="my-bet-ai-interface--dashboards--cli--yaml"><strong>My bet: AI interface &gt; dashboards &gt; CLI &gt; YAML</strong></h2><p>Here’s where it gets interesting.</p><p>Kamal solved the “deploy” problem. But ops is more than deploy:</p><ul><li><p>Why is the app slow right now?</p></li><li><p>What happened at 3am?</p></li><li><p>Should I upgrade my VM or optimize my code?</p></li><li><p>Show me the errors from the last hour</p></li></ul><p>These questions require jumping between tools. SSH into the box, grep the logs, check Grafana, cross-reference with your deploy history.</p><p>My bet: you shouldn’t need to do any of that.</p><p>You should just ask.</p><p>“Why is memory usage spiking?” → Here’s what’s using RAM, and here’s the trend over the last week.</p><p>“Roll back to yesterday’s deploy” → Done. Here’s what changed.</p><p>“Show me errors from the /api/checkout endpoint” → Found 47 errors, here’s the pattern.</p><p>This isn’t science fiction. LLMs are good at this now. The interface just doesn’t exist yet.</p><h2 id="what-im-building"><strong>What I’m building</strong></h2><p>VMKit is my attempt at this interface.</p><ul><li><p>Bring your own VPS (Hetzner, DigitalOcean, whatever)</p></li><li><p>It handles Kamal, Traefik, SSL, monitoring</p></li><li><p>The interface is conversation — web chat or MCP server in Claude Code</p></li></ul><p>No Kubernetes. No YAML manifests. No 47-screen dashboards.</p><p>Just say what you want.</p><p>I might be wrong. Maybe solo devs actually love clicking through Render’s UI. Maybe the Kubernetes complexity is worth it for everyone.</p><p>But I don’t think so. 
I think the right answer for one person running one to three apps is radically simpler than what we have today.</p><p><a href="https://vmkit.dev/" rel="external nofollow noopener" class="lnp-link">vmkit.dev<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a> if you want to follow along.</p><h2 id="the-uncomfortable-truth"><strong>The uncomfortable truth</strong></h2><p>I’m not anti-Kubernetes. I’m anti-complexity-for-its-own-sake.</p><p>K8s is a tool. An incredibly powerful one. But tools have contexts where they make sense and contexts where they don’t.</p><p>Solo dev shipping a SaaS? You don’t need pod autoscaling. You need deploys that work and a way to debug when they don’t.</p><p>That’s the bet.</p>
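<p>For the curious: the entire deployment spec Kamal needs is one small file. Here is a sketch of a minimal <code>config/deploy.yml</code> for one app on one VPS. The service name, image, and IP are placeholders, and exact keys vary by Kamal version, so treat this as indicative rather than copy-paste:</p><pre><code># config/deploy.yml — hypothetical minimal Kamal setup
service: myapp              # placeholder service name
image: you/myapp            # your container image
servers:
  web:
    - 203.0.113.10          # your VPS IP (placeholder)
registry:
  username: you
  password:
    - KAMAL_REGISTRY_PASSWORD   # read from the environment, never committed
env:
  secret:
    - DATABASE_URL</code></pre><p>Everything else (the SSH, the pull, the swap) is one command: <code>kamal deploy</code>.</p>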
]]></content:encoded></item><item><title>Why Your AI Wakes Up Every Morning With No Memory (And how to fix it)</title><link>https://lakshminp.com/2025/11/ai-agent-memory-persistence/</link><pubDate>Tue, 11 Nov 2025 00:00:00 +0000</pubDate><author>Lakshmi Narasimhan</author><guid isPermaLink="true">https://lakshminp.com/2025/11/ai-agent-memory-persistence/</guid><category>essays</category><category>claude-code</category><description>I was two weeks into a gnarly refactor when it happened.
Claude and I had been pair programming on an authentication system—tracking down race conditions, filing away “fix this later” issues, building up this rich context about why we made certain decisions. RS256 instead of HS256 for key rotation. Session middleware patterns. The whole architecture was in our shared understanding.
Then I hit compaction.
I came back the next day, opened a new Claude session, and asked: “Where did we leave off?”</description><content:encoded>&lt;![CDATA[<p>I was two weeks into a gnarly refactor when it happened.</p><p>Claude and I had been pair programming on an authentication system—tracking down race conditions, filing away “fix this later” issues, building up this rich context about why we made certain decisions. RS256 instead of HS256 for key rotation. Session middleware patterns. The whole architecture was in our shared understanding.</p><p>Then I hit compaction.</p><p>I came back the next day, opened a new Claude session, and asked: “Where did we leave off?”</p><p>Claude: “I don’t have information about previous sessions in my context.”</p><p><strong>All of it. Gone.</strong></p><p>The discovered bugs. The architectural decisions. The “by the way, we should fix this” notes. Everything we’d built up over dozens of hours—evaporated.</p><p>I spent 30 minutes re-explaining what we’d been working on. And even then, I couldn’t remember all the issues Claude had surfaced. How many edge cases had we found? Which ones were critical? What was blocking what?</p><p>This is what I call the <strong>amnesia problem</strong>. And it’s not just annoying—it’s a fundamental limitation of how we work with AI agents.</p><h2 id="the"><strong>The <code>TODO.md</code> Trap</strong></h2><p>So I did what everyone does: I created a <code>TODO.md</code> file.</p><pre><code>## TODO
- [ ] Add rate limiting to login endpoint
- [ ] Improve password hashing
- [ ] Fix email validation
- [ ] Build dashboard (depends on auth being done)</code></pre><p>Seemed reasonable. Every project has a TODO list, right?</p><p><strong>Three days later, it was already a graveyard.</strong></p><p>Half the items were done but still unchecked. A quarter were outdated. New issues Claude discovered during implementation? Lost in chat history. Dependencies? I had “(depends on auth being done)” in a parenthetical. Good luck having Claude parse that reliably after compaction.</p><p>Steve Yegge calls these “swamps of rotten half-implemented plans.” He’s right.</p><p>Here’s why markdown TODOs fail with AI agents:</p><p><strong>They become stale instantly</strong> - You finish a task, forget to update the markdown. The agent reads it, doesn’t know what’s actually done.</p><p><strong>No dependency tracking</strong> - Can I start the dashboard? Is auth done? The agent has to guess.</p><p><strong>Context evaporates</strong> - “Fix email validation” tells you nothing. Which email? Where? What’s broken? Why does it matter? After compaction, this line is worthless.</p><p><strong>Agents can’t use them reliably</strong> - Claude reads the whole list, can’t tell what’s ready to work on, and often just&hellip; ignores it.</p><p>That <code>TODO.md</code> file? After compaction, it’s all you have. And it’s not enough.</p><h2 id="enter-beads-a-memory-system-built-for-agents"><strong>Enter Beads: A Memory System Built for Agents</strong></h2><p>That’s when I found <a href="https://github.com/steveyegge/beads" rel="external nofollow noopener" class="lnp-link">beads<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a>.</p><p>Steve Yegge built it specifically to solve the amnesia problem. It’s an issue tracker, but not like Jira or Linear. It’s <strong>built for AI agents, not humans.</strong></p><p>Here’s the breakthrough: <strong>You don’t manage beads.
Claude does.</strong></p><p>I ran <code>bd init</code> in my project. That’s it. One command.</p><p>Then I just&hellip; kept working with Claude. And something wild happened.</p><p>Claude started filing issues automatically. Not when I asked—just when it discovered something worth remembering. While implementing the auth system, Claude would say:</p><p>“I notice there’s a potential race condition in the session handler. Filing it:</p><p><code>bd create ‘Fix race condition in user session’ --deps discovered-from:myapp-a4f2 -p 0</code>”</p><p>It wasn’t asking permission. It wasn’t waiting for me to maintain a TODO list. <strong>It was building its own memory.</strong></p><p>At the end of each session, Claude would update the notes on active issues:</p><pre><code>bd update myapp-a4f2 --notes “COMPLETED: JWT login with RS256
KEY DECISION: Chose RS256 over HS256 for key rotation capability
IN PROGRESS: Session middleware
BLOCKERS: Need decision on token expiry (1hr vs 24hr trade-off)
NEXT: Implement refresh endpoint after expiry decision”</code></pre><p>This wasn’t for me. This was <strong>for future-Claude.</strong></p><h2 id="the-compaction-survival-test"><strong>The Compaction Survival Test</strong></h2><p>The next day, I hit compaction again. New session, empty context.</p><p>Me: “What should we work on?”</p><p>Claude (without me asking) automatically ran these commands:</p><pre><code>bd ready --json
bd list --status in_progress --json
bd show myapp-a4f2</code></pre><p>Then replied:</p><p>“I can see we’re working on authentication (<code>myapp-a4f2</code>). Last session we completed the JWT login flow, decided to use RS256 for key rotation capability, and got blocked waiting for your decision on token expiry time.</p><p>The trade-off is: 1-hour tokens (better security, worse UX) vs 24-hour tokens (better UX, but wider attack window if tokens are compromised). What’s your preference?</p><p>Also, I’ve filed 3 issues we discovered along the way:</p><ul><li><p><strong>Rate limiting for login endpoint</strong> (P0)</p></li><li><p><strong>Bcrypt rounds for password hashing</strong> (P1)</p></li><li><p><strong>Race condition in session handler</strong> (P0)</p></li></ul><p>Should we resolve the token expiry question first, or tackle one of the P0 issues?”</p><p><strong>The conversation history was gone. But Claude had full context.</strong></p><p>Every decision. Every discovered bug. Every blocker. Every trade-off. Right there.</p><p>No re-explaining. No “wait, what were we doing?” No hunting through old conversations.</p><p>This is what beads does.</p><h2 id="todowrite-vs-beads-two-memory-systems"><strong>TodoWrite vs Beads: Two Memory Systems</strong></h2><p>Here’s where people get confused. Claude actually has <em>two</em> memory systems, and they serve different purposes.</p><h3 id="todowrite-working-memory-this-hour"><strong>TodoWrite: Working Memory (This Hour)</strong></h3><p>TodoWrite is Claude’s <strong>scratch pad for the current session</strong>:</p><pre><code>✓ [completed] Implement login endpoint
→ [in_progress] Add password hashing
[pending] Create session middleware</code></pre><p>It shows you real-time progress. Gets marked complete as work happens. <strong>Disappears when the session ends.</strong></p><p>Perfect for: “What’s Claude doing right now?”</p><h3 id="beads-long-term-memory-this-weekmonth"><strong>Beads: Long-Term Memory (This Week/Month)</strong></h3><p>Beads is Claude’s <strong>episodic memory across sessions</strong>:</p><pre><code>bd show myapp-a4f2
Notes: “COMPLETED: Login with bcrypt (12 rounds)
KEY DECISION: JWT (not sessions) for stateless auth
IN PROGRESS: Session middleware
NEXT: Need input on token expiry (1hr vs 24hr)”</code></pre><p>Survives compaction. Captures meaning, not just tasks. <strong>Persists across all sessions.</strong></p><p>Perfect for: “What happened last week? What decisions were made?”</p><h3 id="the-handoff-pattern"><strong>The Handoff Pattern</strong></h3><ol><li><p><strong>Session start</strong>: Claude reads bead notes → creates TodoWrite items for immediate work</p></li><li><p><strong>During work</strong>: TodoWrite gets marked complete</p></li><li><p><strong>Reach milestone</strong>: Claude updates bead notes with outcomes + context</p></li><li><p><strong>Session end</strong>: TodoWrite disappears, bead survives with enriched notes</p></li></ol><p><strong>After compaction</strong>: TodoWrite is gone forever. Bead notes reconstruct everything.</p><h2 id="the-magic-dependencies-that-prevent-mistakes"><strong>The Magic: Dependencies That Prevent Mistakes</strong></h2><p>This is where beads gets brilliant. It supports four relationship types:</p><h3 id="1-blocks---hard-blocker"><strong>1.</strong> <code>blocks</code> <strong>- Hard Blocker</strong></h3><pre><code>bd create “Build user dashboard” -p 1
# Created myapp-e3f7
bd create “Implement authentication” -p 0
# Created myapp-g2h9
bd dep add myapp-e3f7 myapp-g2h9
# → “myapp-g2h9 blocks myapp-e3f7”</code></pre><p>Now the dashboard won’t show in <code>bd ready</code> until auth is closed. Claude <strong>can’t accidentally start building the dashboard before auth exists.</strong></p><h3 id="2-discovered-from---the-audit-trail"><strong>2.</strong> <code>discovered-from</code> <strong>- The Audit Trail</strong></h3><p>This is the agent’s secret weapon:</p><pre><code># Claude finds bug B while implementing feature A
bd create “Fix memory leak in session handler” \
--deps discovered-from:myapp-a4f2 -p 0</code></pre><p>Creates an audit trail of how work was found. Those “oh by the way” issues Claude mentions? They now get filed permanently, linked to context.</p><p>After a week of work, you have an <strong>automatically maintained discovery backlog</strong>. Prioritized. Linked. Ready to tackle.</p><h3 id="3-parent-child---hierarchy"><strong>3.</strong> <code>parent-child</code> <strong>- Hierarchy</strong></h3><pre><code>bd create “Epic: Authentication system” -t epic
# Created myapp-j4k2
bd create “Add OAuth” --parent myapp-j4k2
# Created myapp-l8m1 (auto-linked)</code></pre><p>Good for breaking down large features.</p><h3 id="4-related---soft-connection"><strong>4.</strong> <code>related</code> <strong>- Soft Connection</strong></h3><pre><code>bd dep add myapp-b7c3 myapp-d1e8 -t related
# “These touch the same code but don’t block each other”</code></pre><h2 id="what-you-actually-do-almost-nothing"><strong>What You Actually Do (Almost Nothing)</strong></h2><p><strong>Your workflow:</strong></p><p><strong>One-time setup:</strong></p><pre><code>cd your-project
bd init</code></pre><p>Done. That’s it.</p><p><strong>Work with Claude normally:</strong></p><p>“Let’s build user authentication”</p><p>Claude automatically:</p><ul><li><p>Creates issues as work emerges</p></li><li><p>Tracks dependencies</p></li><li><p>Updates notes at milestones</p></li><li><p>Files discovered work with proper links</p></li><li><p>Checks ready work at session start</p></li></ul><p><strong>You just work.</strong> The memory management happens in the background.</p><p><strong>When you DO interact with beads</strong> (rarely):</p><pre><code># Weekly review
bd stats
# Check what’s blocked
bd blocked
# Context restore after time away
bd show myapp-a4f2</code></pre><p>The agent does the rest.</p><h2 id="why-claude-loves-it"><strong>Why Claude Loves It</strong></h2><p>The most interesting thing about beads isn’t the technology. It’s <strong>how Claude uses it.</strong></p><p>Claude’s behavior changes:</p><p><strong>1. Proactive filing</strong>: Claude files issues without being asked. “I notice X could be improved. Filing: <code>bd create...</code>”</p><p><strong>2. Better planning</strong>: Claude uses dependencies to think through work order before starting.</p><p><strong>3. Context awareness</strong>: Claude references past decisions from bead notes. “Last session we decided to use RS256 because&hellip;”</p><p><strong>4. Discovery tracking</strong>: Claude treats discovered work as first-class, not throwaways.</p><p><strong>Why?</strong> Because beads is built for how Claude actually works:</p><ul><li><p>Structured data (JSON)</p></li><li><p>Clear state (open/in_progress/closed)</p></li><li><p>Explicit relationships (dependencies)</p></li><li><p>Queryable memory (show me what’s ready)</p></li></ul><p>It’s not forcing Claude into a human workflow. It’s giving Claude the database it naturally wants.</p><h2 id="when-beads-is-overkill"><strong>When Beads is Overkill</strong></h2><p>Not every task needs beads.
Use this test:</p><h3 id="use-beads-when"><strong>Use Beads when:</strong></h3><ul><li><p>Work spans multiple sessions</p></li><li><p>You might hit compaction before finishing</p></li><li><p>There are dependencies or blockers</p></li><li><p>You’re discovering related work along the way</p></li><li><p>You need to resume after time away</p></li></ul><p><strong>Example</strong>: “Build authentication system” (multi-day, many parts)</p><h3 id="use-todowrite-when"><strong>Use TodoWrite when:</strong></h3><ul><li><p>Work completes in this session</p></li><li><p>It’s a simple linear checklist</p></li><li><p>All context is in the conversation</p></li><li><p>No dependencies or discovery</p></li></ul><p><strong>Example</strong>: “Refactor this 200-line file” (done in an hour)</p><p><strong>The test</strong>: “Will I need this context in 2 weeks?”</p><ul><li><p><strong>Yes</strong> → Beads</p></li><li><p><strong>No</strong> → TodoWrite</p></li></ul><h2 id="the-git-sync-how-it-works-across-machines"><strong>The Git Sync: How It Works Across Machines</strong></h2><p>Beads stores everything in two places:</p><ol><li><p><code>.beads/beads.db</code> - Local SQLite (fast queries)</p></li><li><p><code>.beads/issues.jsonl</code> - Git-versioned JSONL (syncs across machines)</p></li></ol><p><strong>On your desktop:</strong></p><pre><code>bd create “New issue”
# → SQLite write (instant)
# → After 5 seconds, exports to JSONL
# → Git commit with your code changes</code></pre><p><strong>On your laptop:</strong></p><pre><code>git pull
# → JSONL updates
# → bd auto-imports (newer than local DB)
# → SQLite now has the issue</code></pre><p>You get:</p><ul><li><p>Fast local operations (SQLite, &lt;100ms)</p></li><li><p>Git versioning (full audit trail)</p></li><li><p>Multi-machine sync (JSONL)</p></li><li><p>Offline support (no server)</p></li></ul><p>It’s a distributed database&hellip; that’s just files in git.</p><h2 id="memory-as-infrastructure"><strong>Memory as Infrastructure</strong></h2><p>We’re at this weird moment where AI coding agents are incredibly capable but also incredibly forgetful.</p><p>We expect them to remember complex multi-week projects, track dozens of discovered issues, maintain perfect context across compaction—but we give them&hellip; markdown files.</p><p><strong>Beads doesn’t make agents smarter. It makes them less forgetful.</strong></p><p>And honestly? That might be more important.</p><p>Because the hardest part of any project isn’t writing code. It’s <strong>not losing track of what needs to be written.</strong></p><p>Beads gives your agent:</p><ul><li><p>Memory that survives compaction</p></li><li><p>A discovery backlog that doesn’t evaporate</p></li><li><p>A dependency graph that prevents mistakes</p></li></ul><p><strong>And you barely have to do anything.</strong> Install it, initialize it, let Claude manage it.</p><p>The agent handles the rest.</p><p><strong>Get started:</strong></p><ul><li><p>GitHub: <a href="https://github.com/steveyegge/beads" rel="external nofollow noopener" class="lnp-link">steveyegge/beads<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a></p></li><li><p>Quick start: <code>bd init</code> in your project</p></li><li><p>Let Claude do the rest</p></li></ul><p><strong>Key commands</strong> (mostly for reference—Claude uses these automatically):</p><pre><code>bd init # One-time setup
bd ready # What’s ready? (Claude checks this)
bd show &lt;id&gt; # Issue details (Claude reads notes)
bd stats # Weekly review (you use this)
bd blocked # What’s stuck?</code></pre><p>Give your agent a memory. See what happens.</p>
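<p>If the git sync model sounds magical, it isn’t. The merge rule is tiny, and you can sketch it in a few lines of Python, with the caveat that this is my illustration of the idea, not beads’ actual schema or merge code; the field names are invented:</p><pre><code>import json
import sqlite3

# Sketch of a JSONL -> SQLite import with a "newest updated_at wins" rule.
# (Illustrative only: beads' real schema and merge logic are richer.)
def import_jsonl(db, jsonl_lines):
    db.execute("""CREATE TABLE IF NOT EXISTS issues (
        id TEXT PRIMARY KEY, title TEXT, status TEXT, updated_at INTEGER)""")
    for line in jsonl_lines:
        rec = json.loads(line)
        row = db.execute("SELECT updated_at FROM issues WHERE id = ?",
                         (rec["id"],)).fetchone()
        # Keep the record from the pull only if it is newer than the local row.
        if row is None or rec["updated_at"] > row[0]:
            db.execute("INSERT OR REPLACE INTO issues VALUES (?, ?, ?, ?)",
                       (rec["id"], rec["title"], rec["status"], rec["updated_at"]))
    db.commit()

db = sqlite3.connect(":memory:")
# First pull: the issue arrives as open.
import_jsonl(db, ['{"id": "myapp-a4f2", "title": "Auth", "status": "open", "updated_at": 1}'])
# Later pull: the same issue, closed on another machine. Newer wins.
import_jsonl(db, ['{"id": "myapp-a4f2", "title": "Auth", "status": "closed", "updated_at": 2}'])
print(db.execute("SELECT status FROM issues WHERE id = ?", ("myapp-a4f2",)).fetchone()[0])</code></pre><p>The final status comes out <code>closed</code>: last writer wins, keyed on a timestamp, with SQLite as the fast local cache and JSONL as the git-friendly source of truth.</p>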
]]></content:encoded></item><item><title>I Watched AI Generate a Perfect Todo App in 3 Minutes. Then I Spent 3 Days Fixing It.</title><link>https://lakshminp.com/2025/11/ai-todo-app-3-minutes-3-days/</link><pubDate>Fri, 07 Nov 2025 00:00:00 +0000</pubDate><author>Lakshmi Narasimhan</author><guid isPermaLink="true">https://lakshminp.com/2025/11/ai-todo-app-3-minutes-3-days/</guid><category>essays</category><category>ai-coding</category><description>Every AI coding tool demo starts the same way.
“Build me a todo app.”
Four words. Maybe ten seconds of typing. Then you sit back and watch the magic: files appear, databases materialize, endpoints generate themselves. The AI spins up authentication, adds a sleek frontend, writes tests. Three minutes later, you have a working application.
It’s impressive. It’s seductive. And for production software you’ll maintain for years, it’s a starting point at best—not a solution.</description><content:encoded>&lt;![CDATA[<p>Every AI coding tool demo starts the same way.</p><p>“Build me a todo app.”</p><p>Four words. Maybe ten seconds of typing. Then you sit back and watch the magic: files appear, databases materialize, endpoints generate themselves. The AI spins up authentication, adds a sleek frontend, writes tests. Three minutes later, you have a working application.</p><p>It’s impressive. It’s seductive. And for production software you’ll maintain for years, it’s a starting point at best—not a solution.</p><h2 id="the-demo-that-sells-vs-the-code-you-ship"><strong>The Demo That Sells vs. The Code You Ship</strong></h2><p>I’ve spent five months deep in AI coding tools—Claude Code, claude-flow, and everything in between. I’ve watched hundreds of demos. I’ve read the marketing. And I’ve built actual production SaaS applications.</p><p>Here’s what the demos won’t tell you: that three-minute todo app works because it makes a thousand architectural decisions you never specified. And the moment your requirements diverge from those invisible assumptions, the whole thing falls apart.</p><p>Let me show you what I mean.</p><h2 id="the-eight-decisions-that-actually-matter"><strong>The Eight Decisions That Actually Matter</strong></h2><p>When you say “build me a todo app,” you think you’re giving clear instructions. But try building real production software and you’ll immediately hit these questions:</p><p><strong>1. JWT Claims Structure</strong></p><ul><li><p>What exact fields go in your JWT payload?</p></li><li><p>Do you store roles as an array or a single string?</p></li><li><p>Where do permissions live? In the token? In the database?</p></li><li><p>Do you include user metadata or just an ID?</p></li></ul><p>The demo picks one. It might not be the one you need. And changing it later? That’s not a refactor. 
That’s rearchitecting your entire auth system.</p><p><strong>2. Token Rotation</strong></p><ul><li><p>15-minute access tokens with 7-day refresh tokens?</p></li><li><p>Refresh token rotation on every use?</p></li><li><p>Where do you store refresh tokens—database, Redis, or in-memory?</p></li><li><p>httpOnly cookies or localStorage?</p></li></ul><p>The demo makes a choice. You won’t know what it chose until you’re debugging your third session timeout bug in production.</p><p><strong>3. UI Library</strong></p><ul><li><p>shadcn/ui? Material-UI? Chakra? Ant Design? Headless UI?</p></li><li><p>Tailwind CSS or CSS-in-JS?</p></li><li><p>Which component patterns?</p></li></ul><p>“Use a modern UI library” means nothing. I needed shadcn/ui specifically because it works with my design system, ships minimal JavaScript, and uses Tailwind. The demo gave me Material-UI. That’s not a theme change—that’s rebuilding the entire frontend.</p><p><strong>4. Stripe Integration</strong></p><ul><li><p>Checkout flow or Payment Intents?</p></li><li><p>Subscription model or one-time payments?</p></li><li><p>Customer portal or custom UI?</p></li><li><p>Which webhooks do you handle?</p></li></ul><p>The difference isn’t cosmetic. Checkout and Payment Intents are architecturally different. Choosing wrong means rewriting your entire billing integration.</p><p><strong>5. Email Provider</strong></p><ul><li><p>SendGrid? Resend? Postmark? AWS SES?</p></li><li><p>Template system?</p></li><li><p>Transactional vs. marketing?</p></li></ul><p>Each provider has different APIs, rate limits, pricing models, and deliverability characteristics. “Add email notifications” doesn’t specify any of this.</p><p><strong>6. Database ORM</strong></p><ul><li><p>Prisma? Drizzle? TypeORM? Kysely?</p></li><li><p>Type generation approach?</p></li><li><p>Migration strategy?</p></li></ul><p>Your ORM choice affects type safety, migration workflows, query performance, and deployment strategy. It’s not swappable. 
It’s foundational.</p><p><strong>7. Testing Framework</strong></p><ul><li><p>Vitest? Jest? Mocha?</p></li><li><p>Supertest for integration tests?</p></li><li><p>What coverage target?</p></li></ul><p>The testing framework dictates how you structure tests, handle mocks, and integrate with CI/CD. Changing it later means rewriting every test.</p><p><strong>8. Deployment Target</strong></p><ul><li><p>Vercel? AWS? Docker compose? Railway?</p></li><li><p>What Vercel-specific features do you need?</p></li><li><p>Environment variable strategy?</p></li><li><p>Database hosting (Neon? Supabase? RDS?)?</p></li></ul><p>Deployment isn’t the last step. It shapes your entire architecture—serverless vs. long-running, filesystem access, background jobs, caching strategies.</p><h2 id="the-just-refactor-it-myth"><strong>The “Just Refactor It” Myth</strong></h2><p>When I point this out, the response is always: “Just refactor what the AI generated.”</p><p>Have you actually tried this?</p><p>Swapping Prisma for Drizzle isn’t a find-and-replace operation. It means:</p><ul><li><p>Rewriting your schema in a different DSL</p></li><li><p>Changing how you handle migrations</p></li><li><p>Updating every database query</p></li><li><p>Modifying your type generation</p></li><li><p>Adjusting your seeding scripts</p></li><li><p>Updating your testing setup</p></li></ul><p>We’re not talking about an afternoon. We’re talking about days of work. And that’s for ONE of these eight decisions.</p><p>Change the ORM, the UI library, and the auth token structure? You’re not refactoring. 
You’re rebuilding.</p><h2 id="what-build-me-an-app-actually-produces"><strong>What “Build Me an App” Actually Produces</strong></h2><p>Here’s the brutal truth: autonomous AI tools generate generic boilerplate that matches their training data’s most common patterns.</p><p>They give you:</p><ul><li><p>Whatever stack is most popular on GitHub</p></li><li><p>Whatever patterns appear most in tutorials</p></li><li><p>Whatever architecture is easiest to generate</p></li></ul><p>They don’t give you:</p><ul><li><p>Your company’s conventions</p></li><li><p>Your infrastructure constraints</p></li><li><p>Your team’s expertise</p></li><li><p>Your product’s specific requirements</p></li></ul><p>The demo works because demos don’t have requirements. Real projects die in the gap between “an app” and “our app.”</p><h2 id="why-this-matters-for-production-code"><strong>Why This Matters for Production Code</strong></h2><p>If you’re at a big company with a team of 20 engineers, maybe you can absorb the rebuild cost. You have engineering hours to burn. You have people to maintain legacy code while others refactor.</p><p>Most of us don’t have that luxury.</p><p>Whether you’re building solo, on a small team, or shipping client work, you’re living with every architectural decision for years. You can’t afford to spend three days ripping out Material-UI because an autonomous tool decided that’s what “modern UI library” meant. You can’t rebuild your auth system because the JWT structure doesn’t match your API contracts. You can’t rewrite billing integration because the tool guessed Checkout when you needed Payment Intents.</p><p>Wrong architectural decisions compound. 
When you’re responsible for maintaining the code—whether that’s yourself, a small team, or a client relationship—you need to understand and own those decisions.</p><p>That’s why production code requires control, not autonomy.</p><h2 id="the-interactive-alternative"><strong>The Interactive Alternative</strong></h2><p>Compare that to working with Claude Code:</p><p><strong>Me:</strong> “Add authentication to this project.”</p><p><strong>Claude Code:</strong> “I can help with that. A few questions:</p><ul><li><p>JWT or session-based auth?</p></li><li><p>If JWT, what should the token payload include?</p></li><li><p>Where should refresh tokens be stored?</p></li><li><p>What’s your refresh token rotation strategy?”</p></li></ul><p><strong>Me:</strong> “JWT. Payload should have userId, email, roles as an array, and permissions as a nested object. Refresh tokens in database with rotation on every use. 15-minute access, 7-day refresh. httpOnly cookies.”</p><p><strong>Claude Code:</strong> “Got it. I’ll implement that exactly.”</p><p>The specification happened through dialogue. I clarified the architectural decisions before any code was written. The AI generated exactly what I specified, not what it guessed I might want.</p><p>When the auth system is running in production six months later and I need to debug a token issue, I understand every decision because I made every decision. I’m not reverse-engineering someone else’s assumptions. I’m working with my own architecture.</p><h2 id="when-autonomy-actually-works"><strong>When Autonomy Actually Works</strong></h2><p>Autonomy isn’t wrong—it’s just context-dependent. 
There are places where “just handle it” is absolutely the right answer:</p><ul><li><p><strong>README generation:</strong> Standard markdown structure is fine</p></li><li><p><strong>ESLint configuration:</strong> Default configs work for most cases</p></li><li><p><strong>.gitignore files:</strong> Use the templates</p></li><li><p><strong>Boilerplate CRUD endpoints:</strong> If they follow established patterns exactly</p></li><li><p><strong>Prototypes you’ll throw away:</strong> Exploration where decisions don’t matter yet</p></li></ul><p>These are low-stakes decisions with high standardization. Getting them “wrong” doesn’t cascade. You can change them later without rebuilding your application. Single-prompt generation shines here.</p><p>But authentication? Database schema? Tech stack? These are high-stakes, foundational decisions with cascading effects. This is where precision matters and guesswork fails.</p><h2 id="the-autonomy-illusion"><strong>The Autonomy Illusion</strong></h2><p>Here’s what the AI tool marketing doesn’t tell you:</p><p>More agents doesn’t mean better code. It means less control.</p><p>Sophisticated orchestration doesn’t mean better results. It means more complexity hiding the same specification problem.</p><p>“Just describe what you want” doesn’t work when architectural decisions require precision that natural language can’t provide.</p><p>I tested claude-flow—a sophisticated multi-agent system with 10+ agent templates, health monitoring, auto-scaling, 3-tier memory, and 60+ task types. Impressive infrastructure. But it still runs on string-based specifications. When I asked for shadcn/ui, there was no type safety, no validation, no guarantee the agent would interpret “shadcn/ui” as “shadcn/ui and absolutely nothing else.”</p><p>The specification layer is still natural language. 
And natural language is ambiguous.</p><h2 id="the-real-question"><strong>The Real Question</strong></h2><p>The question isn’t “Can AI build an app from a single prompt?”</p><p>The answer to that is yes. Absolutely. The demos prove it.</p><p>The real question is: “Can AI build YOUR app—with YOUR architecture, YOUR conventions, YOUR constraints—from a single prompt?”</p><p>The answer to that is no.</p><p>Not because the AI isn’t capable of generating code. It’s excellent at that.</p><p>But because “build me an app” leaves a thousand architectural decisions unspecified. And every one of those decisions matters when you’re shipping production software you’ll maintain for years.</p><h2 id="what-works-instead"><strong>What Works Instead</strong></h2><p>After five months of research, building real projects, and testing multiple tools, here’s what actually works:</p><p><strong>Start with control:</strong></p><ul><li><p>Make architectural decisions consciously</p></li><li><p>Specify tech stack, libraries, patterns explicitly</p></li><li><p>Use interactive tools that let you clarify requirements</p></li><li><p>Review and understand what’s being generated</p></li></ul><p><strong>Move to autonomy for execution:</strong></p><ul><li><p>Once patterns are established, autonomous tools can replicate them</p></li><li><p>Use autonomy for boilerplate that follows decided patterns</p></li><li><p>Let AI handle repetition, not decision-making</p></li></ul><p><strong>Return to control for integration:</strong></p><ul><li><p>Debugging requires understanding</p></li><li><p>Maintenance requires ownership</p></li><li><p>Evolution requires knowing why decisions were made</p></li></ul><p>The cycle is: design with control, execute with autonomy, integrate with control.</p><p>Not: autonomous generation followed by days of “just refactor it.”</p><h2 id="the-real-power-of-ai-coding"><strong>The Real Power of AI Coding</strong></h2><p>The promise of AI coding tools isn’t “describe an app in four 
words and get perfect code.”</p><p>The promise is: “Make architectural decisions at the speed of thought, then have those decisions implemented flawlessly.”</p><p>Interactive AI tools let you think at the architecture level while the AI handles the implementation level. You make decisions. The AI writes code. You maintain control and understanding. The AI handles the tedious translation from intent to syntax.</p><p>That’s the real 10x improvement.</p><p>Not “build me an app” magic that produces generic boilerplate you’ll spend days rebuilding.</p><p>But the ability to say “JWT with these exact claims, refresh rotation with this lifecycle, stored in httpOnly cookies” and get exactly that. First try. No guessing. No rebuilding.</p><h2 id="the-bottom-line"><strong>The Bottom Line</strong></h2><p>If you’re building serious software—production SaaS, client projects, anything you’ll maintain beyond next week—you need to understand what you’re building.</p><p>Autonomous tools that guess at your architecture don’t save time if you spend days fixing wrong assumptions.</p><p>Code you don’t understand becomes a liability the moment something breaks.</p><p>Decisions you never made can’t evolve with your requirements.</p><p>Control isn’t about micromanaging the AI. It’s about owning the architecture of software you’re responsible for maintaining.</p><p>The demos are impressive. The marketing is seductive. The promise of “just describe it” is tempting—and genuinely useful for the right contexts.</p><p>But for production software with real requirements, real constraints, and real consequences? Interactive tools that let you specify precisely what you need will outperform autonomous guesswork every time.</p><hr><p><strong>Building production SaaS as a solo technical founder?</strong> I write about AI tools, architectural decisions, and shipping solo. Subscribe to get the next essay.</p><p>Be skeptical of demos. Demand control. Ship code you understand.</p>
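<p>To make that last example concrete, here is a sketch in plain JavaScript of the token contract specified in the auth dialogue earlier. The names (<code>buildAccessTokenPayload</code>, the TTL constants) are mine, not from any library, and a real implementation would sign the payload with a vetted JWT library rather than hand-rolling anything.</p>

```javascript
// Sketch only: the shape of the access token specified in the dialogue.
// Names like buildAccessTokenPayload are illustrative, not a standard API.
const ACCESS_TOKEN_TTL_SECONDS = 15 * 60;        // 15-minute access token
const REFRESH_TOKEN_TTL_SECONDS = 7 * 24 * 3600; // 7-day refresh token

function buildAccessTokenPayload(user, nowSeconds) {
  return {
    userId: user.id,
    email: user.email,
    roles: [...user.roles],               // roles as an array
    permissions: { ...user.permissions }, // permissions as a nested object
    iat: nowSeconds,
    exp: nowSeconds + ACCESS_TOKEN_TTL_SECONDS,
  };
}

// The signed token would then be delivered in an httpOnly cookie:
const accessCookieOptions = {
  httpOnly: true, // invisible to client-side JavaScript
  secure: true,
  sameSite: 'strict',
  maxAge: ACCESS_TOKEN_TTL_SECONDS * 1000,
};

const payload = buildAccessTokenPayload(
  { id: 'u1', email: 'a@example.com', roles: ['admin'], permissions: { posts: { write: true } } },
  1700000000
);
console.log(payload.exp - payload.iat); // 900
```

<p>Every value in that payload answers one of the questions from the dialogue, which is the point: none of them could have been guessed from “add authentication.”</p>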
]]></content:encoded></item><item><title>The Junior Dev Paradox: We’re Speed-Running Past the Tutorial</title><link>https://lakshminp.com/2025/11/junior-dev-ai-paradox/</link><pubDate>Sat, 01 Nov 2025 00:00:00 +0000</pubDate><author>Lakshmi Narasimhan</author><guid isPermaLink="true">https://lakshminp.com/2025/11/junior-dev-ai-paradox/</guid><category>essays</category><category>craft</category><description>So here’s a fun thought experiment: What happens when an entire generation of developers learns to code by never actually learning to code?
I don’t mean that in the gatekeepy “back in my day we walked uphill both ways in assembly language” sense. I mean it literally. Right now, today, someone is getting their first junior dev job having built an impressive portfolio of projects they couldn’t debug if their life depended on it.</description><content:encoded>&lt;![CDATA[<p>So here’s a fun thought experiment: What happens when an entire generation of developers learns to code by never actually learning to code?</p><p>I don’t mean that in the gatekeepy “back in my day we walked uphill both ways in assembly language” sense. I mean it literally. Right now, today, someone is getting their first junior dev job having built an impressive portfolio of projects they couldn’t debug if their life depended on it.</p><p>And honestly? I’m not sure if that’s a problem or just&hellip; different.</p><h2 id="the-thing-nobody-wants-to-say-out-loud">The Thing Nobody Wants to Say Out Loud</h2><p>We—the developers who learned pre-AI—spent an ungodly amount of time doing things that, in retrospect, might have been pointless. Memorizing syntax. Reading documentation cover to cover because Stack Overflow didn’t have the answer. Spending three hours debugging only to find a missing semicolon. Writing the same boilerplate for the thousandth time because that’s just how you learned patterns.</p><p>That grind built something, though. Call it intuition. Call it muscle memory. Call it the ability to look at a stack trace and just <em>know</em> where the problem is because you’ve seen that exact error forty times before. We developed pattern recognition through sheer repetitive exposure, like some kind of coding Stockholm syndrome.</p><p>Junior devs today can skip all of that. They can describe what they want and watch Claude or Copilot generate it. They can ship features on day one that would’ve taken us weeks to build as juniors. 
They can contribute to complex codebases without understanding half of what’s happening under the hood.</p><p>Which is either the most amazing democratization of technical skills in history, or we’re a generation of developers who are one AI outage away from complete helplessness.</p><p>Probably both.</p><h2 id="what-we-might-be-losing">What We Might Be Losing</h2><p>Here’s what I wonder about:</p><p>Can you develop debugging intuition if AI catches most of your bugs?</p><p>Can you build system design sense if you’ve never had to architect something from scratch?</p><p>Can you really understand <em>why</em> something works if you’ve only ever described <em>what</em> you want it to do?</p><p>The old way of learning had a built-in forcing function. You <em>had</em> to understand data structures because you couldn’t implement anything without them. You <em>had</em> to read error messages carefully because that was your only clue. You <em>had</em> to develop mental models of how systems work because there was no AI to abstract it away.</p><p>It was inefficient as hell. It was also weirdly effective.</p><p>Now we’ve got junior devs who can ship impressive features but might struggle to explain what a hash table is or why their O(n^2) solution is melting production. They know how to make things work; they just don’t always know <em>why</em> they work or <em>how</em> to fix them when they don’t.</p><p>And before someone shows up in the comments with “well actually, they can just ask AI to debug it”—sure, until they can’t. Until the AI doesn’t understand the problem. Until the codebase is too complex or too weird or too legacy. 
Until, I don’t know, <a href="https://www.lakshminp.com/p/when-claude-code-goes-down-a-meditation" rel="external nofollow noopener" class="lnp-link">Claude Code goes down for five hours and suddenly you’re naked without your safety net<span class="lnp-link-ext" aria-hidden="true"> ↗</span></a>.</p><h2 id="what-we-might-be-gaining">What We Might Be Gaining</h2><p>But here’s the flip side: maybe we’re romanticizing the struggle.</p><p>Junior devs today are learning different skills. They’re getting good at prompt engineering, at articulating problems clearly, at evaluating AI-generated solutions. They’re exposed to more patterns, more codebases, more architectural approaches in their first year than we saw in five.</p><p>They’re also spending less time on tedious nonsense. Nobody needs to memorize the exact syntax for array methods or spend a week setting up a development environment. That time gets redirected to actually building things, to experimenting, to shipping.</p><p>And maybe—<em>maybe</em>—the fundamentals that matter are changing. Maybe understanding how to architect a system is more valuable than knowing how to implement every piece of it. Maybe code review skills and the ability to verify solutions matter more than the ability to generate them from scratch.</p><p>Maybe the fact that they can be productive on day one is a feature, not a bug.</p><h2 id="the-real-problem-the-copy-paste-generation">The Real Problem: The Copy-Paste Generation</h2><p>The actual risk isn’t that junior devs are using AI. It’s that some of them are using it as a crutch instead of a catalyst.</p><p>There’s a difference between “I don’t understand this, let me ask AI to explain it” and “I don’t understand this, so I’ll just copy-paste whatever AI gives me and hope it works.” One is learning accelerated by AI. 
The other is&hellip; well, it’s not learning at all.</p><p>We’re going to end up with a split: junior devs who use AI to move faster while still building understanding, and junior devs who are entirely dependent on AI to function. The <strong>first group will be terrifyingly productive</strong>. The second group is going to hit a wall the moment they encounter a problem AI can’t solve.</p><p>And here’s the uncomfortable part: it’s getting harder to tell them apart during hiring. Both can build impressive portfolios. Both can ship features. The difference only shows up when things break, when requirements get weird, when they need to dig into a gnarly legacy codebase that AI doesn’t understand.</p><h2 id="some-half-baked-solutions">Some Half-Baked Solutions</h2><p>So what do we do about this? I don’t have perfect answers, but here are some thoughts:</p><p><strong>For junior devs:</strong> Choose the harder path sometimes. Deliberately code without AI for practice. Build a project from scratch where you have to figure everything out manually. Read source code, not just documentation. When AI generates something, understand <em>why</em> it works before moving on. Treat AI as a tutor who’s always available, not a replacement for thinking.</p><p><strong>For seniors and mentors:</strong> Stop assuming junior devs have the same foundation you did. Be explicit about the “why” behind decisions. Create space for questions that might sound basic. Do code reviews that focus on understanding, not just functionality. Maybe assign “AI-free” tasks occasionally, not as hazing, but as skill-building.</p><p><strong>For companies:</strong> Normalize “I don’t know, let me learn this properly” instead of “ship at all costs.” Allocate time for learning, not just velocity. Celebrate understanding, not just output. 
Maybe reconsider how you evaluate technical skills in interviews—you’re not just testing if someone can code, you’re testing if they can think.</p><p><strong>For education:</strong> Stop pretending AI doesn’t exist. Teach people how to use it effectively, not how to avoid it. But also teach debugging, system design, and foundational concepts. The goal isn’t to reject AI; it’s to use it wisely while building real understanding.</p><h2 id="the-uncomfortable-non-conclusion">The Uncomfortable Non-Conclusion</h2><p>Here’s the truth: We’re all figuring this out in real-time. Every generation of developers has had this conversation in some form—about IDEs, about Stack Overflow, about frameworks that abstract away complexity. The old guard always worries the new guard doesn’t know “the fundamentals.”</p><p>Sometimes they’re right. Sometimes they’re just old. Also, “the fundamentals” is an ever-shifting goalpost.</p><p>I don’t know which this is yet. Ask me in five years when we see how this generation of AI-native developers performs at scale. Ask me when we see if they hit a ceiling or if they just built their skills differently.</p><p>What I do know is this: AI-assisted coding isn’t going away. The barrier to building software has collapsed. Junior devs can be productive faster than ever. And somewhere in there, we need to figure out how to preserve the understanding that makes you not just productive, but genuinely good at this job.</p><p>Because the best developers aren’t the ones who can generate code the fastest. They’re the ones who can look at a complex system, understand how it works, figure out why it’s broken, and know how to fix it. Whether you learned that through years of painful debugging or through AI-accelerated practice doesn’t really matter.</p><p>As long as you actually learned it.</p>
]]></content:encoded></item><item><title>How to Secure Your Vibe-Coded Project (Before It Secures You)</title><link>https://lakshminp.com/2025/10/how-to-secure-your-vibe-coded-project/</link><pubDate>Thu, 30 Oct 2025 00:00:00 +0000</pubDate><author>Lakshmi Narasimhan</author><guid isPermaLink="true">https://lakshminp.com/2025/10/how-to-secure-your-vibe-coded-project/</guid><category>essays</category><category>security</category><description>Most developers ship AI-generated code without security audits. Here&amp;rsquo;s how to catch vulnerabilities before they become breaches—without hiring a security team.
You&amp;rsquo;re moving fast. AI is writing code. You&amp;rsquo;re shipping features daily. But speed creates blind spots—and security vulnerabilities love blind spots.
I&amp;rsquo;ve watched developers ship vibe-coded projects that worked but had SQL injection holes, exposed API keys, and broken authentication. Not because they were careless—because they were solo and didn&amp;rsquo;t have time to review every line AI generated.</description><content:encoded>&lt;![CDATA[<p>Most developers ship AI-generated code without security audits. Here&rsquo;s how to catch vulnerabilities before
they become breaches—without hiring a security team.</p><p>You&rsquo;re moving fast. AI is writing code. You&rsquo;re shipping features daily. But speed creates blind spots—and
security vulnerabilities love blind spots.</p><p>I&rsquo;ve watched developers ship vibe-coded projects that <em>worked</em> but had SQL injection holes, exposed
API keys, and broken authentication. Not because they were careless—because they were solo and didn&rsquo;t have
time to review every line AI generated.</p><p><strong>Here&rsquo;s the truth: you can&rsquo;t manually audit everything. But you can automate the audit.</strong></p><h2 id="why-incremental-reviews-arent-enough">Why Incremental Reviews Aren&rsquo;t Enough</h2><p>Tools like <code>/security-review</code> catch issues in pull requests. That&rsquo;s great for new code. But what
about legacy code? Configurations that haven&rsquo;t been touched in months? Dependencies with known CVEs?</p><p>Incremental reviews are daily vitamins. Full audits are annual physicals. You need both.</p><h2 id="what-a-security-audit-should-cover">What a Security Audit Should Cover</h2><p>A comprehensive security audit doesn&rsquo;t just scan for SQL injection. It evaluates your entire attack surface
against industry-standard frameworks:</p><ul><li><strong>OWASP Top 10 2021</strong> — Broken access control, cryptographic failures, injection attacks</li><li><strong>OWASP API Security Top 10 2023</strong> — Broken object-level authorization, mass assignment, security misconfigurations</li><li><strong>Cloud &amp; Infrastructure Security</strong> — Misconfigured S3 buckets, exposed environment variables, weak IAM policies</li><li><strong>Supply Chain Security</strong> — Vulnerable dependencies, outdated packages, insecure third-party integrations</li></ul><h2 id="the-four-layers-of-a-proper-audit">The Four Layers of a Proper Audit</h2><h3 id="1-reconnaissance-understanding-your-stack">1. Reconnaissance: Understanding Your Stack</h3><p>Before auditing, the tool needs to know what you&rsquo;re running: Node.js? Python? Docker? Postgres or MongoDB?
Framework: Express, FastAPI, Next.js?</p><p>This determines which vulnerability patterns to look for. SQL injection matters in Postgres apps. NoSQL injection
matters in MongoDB apps. Different stacks, different attack vectors.</p><h3 id="2-code-analysis-finding-hidden-vulnerabilities">2. Code Analysis: Finding Hidden Vulnerabilities</h3><p>This is where the audit scans every file—not just recent changes—looking for patterns that indicate security issues:</p><ul><li>User input flowing directly into database queries (SQL/NoSQL injection risk)</li><li>Hardcoded secrets or credentials in code</li><li>Missing authentication checks on sensitive endpoints</li><li>Weak cryptography (MD5, SHA1) for passwords or tokens</li><li>Overly permissive CORS policies</li><li>Exposed debug endpoints in production</li></ul><div class="lnp-codeblock lnp-code-quiet"><div class="lnp-codeblock-head"><span class="lnp-lang">javascript</span><button type="button" class="lnp-codeblock-copy" data-copy="" aria-label="Copy code">copy</button></div><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-javascript" data-lang="javascript"><span class="line"><span class="cl"><span class="c1">// BAD: SQL injection vulnerability</span></span></span><span class="line"><span class="cl"><span class="nx">app</span><span class="p">.</span><span class="nx">get</span><span class="p">(</span><span class="s1">'/user'</span><span class="p">,</span> <span class="p">(</span><span class="nx">req</span><span class="p">,</span> <span class="nx">res</span><span class="p">)</span> <span class="p">=&gt;</span> <span class="p">{</span></span></span><span class="line"><span class="cl">  <span class="kr">const</span> <span class="nx">userId</span> <span class="o">=</span> <span class="nx">req</span><span class="p">.</span><span class="nx">query</span><span class="p">.</span><span class="nx">id</span><span class="p">;</span></span></span><span class="line"><span class="cl">  <span class="nx">db</span><span class="p">.</span><span class="nx">query</span><span class="p">(</span><span class="sb">`SELECT * FROM users WHERE id = </span><span class="si">${</span><span
class="nx">userId</span><span class="si">}</span><span class="sb">`</span><span class="p">);</span> <span class="c1">// Dangerous!</span></span></span><span class="line"><span class="cl"><span class="p">});</span></span></span><span class="line"><span class="cl"/></span><span class="line"><span class="cl"><span class="c1">// GOOD: Parameterized query</span></span></span><span class="line"><span class="cl"><span class="nx">app</span><span class="p">.</span><span class="nx">get</span><span class="p">(</span><span class="s1">'/user'</span><span class="p">,</span> <span class="p">(</span><span class="nx">req</span><span class="p">,</span> <span class="nx">res</span><span class="p">)</span> <span class="p">=&gt;</span> <span class="p">{</span></span></span><span class="line"><span class="cl">  <span class="kr">const</span> <span class="nx">userId</span> <span class="o">=</span> <span class="nx">req</span><span class="p">.</span><span class="nx">query</span><span class="p">.</span><span class="nx">id</span><span class="p">;</span></span></span><span class="line"><span class="cl">  <span class="nx">db</span><span class="p">.</span><span class="nx">query</span><span class="p">(</span><span class="s1">'SELECT * FROM users WHERE id = ?'</span><span class="p">,</span> <span class="p">[</span><span class="nx">userId</span><span class="p">]);</span></span></span><span class="line"><span class="cl"><span class="p">});</span></span></span></code></pre></div></div><h3 id="3-configuration-review-infrastructure-security">3. Configuration Review: Infrastructure Security</h3><p>Code vulnerabilities are obvious. Configuration vulnerabilities are subtle:</p><ul><li>Are environment variables properly isolated?</li><li>Do Docker containers run as root? (They shouldn&rsquo;t.)</li><li>Are cloud storage buckets publicly accessible?</li><li>Is TLS enforced on all endpoints?</li><li>Are rate limits configured to prevent abuse?</li></ul><p>These issues don&rsquo;t show up in code reviews. 
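</p><p>Some of them can still be caught mechanically. As a hedged sketch (the variable names and the length threshold are mine, not a standard), here is a fail-fast startup check that refuses to boot when required secrets are missing from the environment:</p>

```javascript
// Illustrative fail-fast check: verify required configuration exists in the
// environment before the app starts serving traffic. Names are examples only.
const REQUIRED_ENV_VARS = ['DATABASE_URL', 'JWT_SECRET', 'STRIPE_API_KEY'];

function checkRequiredEnv(env) {
  const problems = [];
  for (const name of REQUIRED_ENV_VARS) {
    const value = env[name];
    if (!value) {
      problems.push(`${name} is not set`);
    } else if (name.endsWith('_SECRET') && value.length < 16) {
      problems.push(`${name} looks too short to be a real secret`);
    }
  }
  return problems;
}

// A deliberately broken environment, for illustration:
const problems = checkRequiredEnv({
  DATABASE_URL: 'postgres://localhost/app',
  JWT_SECRET: 'dev',
});
console.log(problems.length); // 2: weak JWT_SECRET, missing STRIPE_API_KEY
```

<p>Wiring a check like this into startup costs a few lines and catches the “it worked on my machine because my .env was different” class of incident before any traffic arrives.</p><p>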
They live in config files, environment variables, and infrastructure settings.</p><h3 id="4-dependency-scanning-supply-chain-vulnerabilities">4. Dependency Scanning: Supply Chain Vulnerabilities</h3><p>Your code might be secure, but your dependencies might not be. Audit tools scan <code>package.json</code>, <code>requirements.txt</code>, and <code>go.mod</code> against databases of known CVEs (Common Vulnerabilities and Exposures).</p><p>If a package has a critical security flaw, you&rsquo;ll know—and you&rsquo;ll get specific remediation guidance (upgrade to version X.Y.Z).</p><h2 id="how-to-run-an-effective-security-audit">How to Run an Effective Security Audit</h2><p>If you&rsquo;re using Claude Code, you can install a <code>/security-audit</code> slash command that automates this entire process.
It performs reconnaissance, analyzes every file, reviews configurations, and scans dependencies—generating an actionable report.</p><p>The key difference from incremental reviews: <strong>it catches everything</strong>—even vulnerabilities in code you wrote
months ago and haven&rsquo;t touched since.</p><h3 id="installation">Installation</h3><p>Global installation (available across all projects):</p><div class="lnp-codeblock lnp-code-quiet"><div class="lnp-codeblock-head"><span class="lnp-lang">bash</span><button type="button" class="lnp-codeblock-copy" data-copy="" aria-label="Copy code">copy</button></div><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">mkdir -p ~/.claude/commands</span></span><span class="line"><span class="cl">curl -o ~/.claude/commands/security-audit.md https://example.com/security-audit.md</span></span></code></pre></div></div><p>Project-specific installation (for team collaboration):</p><div class="lnp-codeblock lnp-code-quiet"><div class="lnp-codeblock-head"><span class="lnp-lang">bash</span><button type="button" class="lnp-codeblock-copy" data-copy="" aria-label="Copy code">copy</button></div><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">mkdir -p .claude/commands</span></span><span class="line"><span class="cl">curl -o .claude/commands/security-audit.md https://example.com/security-audit.md</span></span></code></pre></div></div><h3 id="running-the-audit">Running the Audit</h3><p>From Claude Code, type:</p><div class="lnp-codeblock lnp-code-quiet"><div class="lnp-codeblock-head"><span class="lnp-lang">bash</span><button type="button" class="lnp-codeblock-copy" data-copy="" aria-label="Copy code">copy</button></div><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">/security-audit</span></span></code></pre></div></div><p>The tool will:</p><ol><li>Identify your tech stack</li><li>Scan every file for vulnerability patterns</li><li>Review configurations (Docker, env files, cloud settings)</li><li>Check dependencies against CVE
databases</li><li>Generate a prioritized report with specific remediation steps</li></ol><h2 id="when-to-audit">When to Audit</h2><p><strong>Use <code>/security-review</code> during daily development</strong> for fast feedback on pull requests.</p><p><strong>Use <code>/security-audit</code> monthly or quarterly</strong> for comprehensive assessment—especially before major releases
or when adding new features that touch sensitive data.</p><p>Think of them as complementary: one catches new issues, the other catches everything.</p><h2 id="the-20-that-prevents-80-of-breaches">The 20% That Prevents 80% of Breaches</h2><p>Most security incidents aren&rsquo;t sophisticated zero-days. They&rsquo;re basic misconfigurations: exposed API keys, missing authentication,
unpatched dependencies.</p><p>A security audit catches these low-hanging issues—the 20% of configs that prevent 80% of breaches. You&rsquo;re not trying to build
Fort Knox. You&rsquo;re trying to avoid being the easy target.</p><h2 id="security-is-a-discipline-not-a-feature">Security is a Discipline, Not a Feature</h2><p>When you&rsquo;re running solo, security feels like something you&rsquo;ll &ldquo;get to later.&rdquo; But later turns into never—until something breaks.</p><p><strong>Automate the audit. Run it regularly. Fix what it finds.</strong></p><p>You don&rsquo;t need a security team. You need visibility into what&rsquo;s broken and specific guidance on how to fix it. That&rsquo;s what a
proper security audit provides.</p><p>Because the best time to find vulnerabilities is before attackers do.</p>
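<p>If you want a feel for what the “scan every file for vulnerability patterns” step actually does, here is a deliberately tiny sketch. The rules below are illustrative, not exhaustive (real scanners ship hundreds); the first regex matches the well-known <code>AKIA</code> prefix of AWS access key IDs.</p>

```javascript
// Toy version of a hardcoded-secret scan. Patterns are illustrative and far
// from exhaustive; they exist to show the mechanism, not to be relied on.
const SECRET_PATTERNS = [
  { name: 'AWS access key ID', regex: /AKIA[0-9A-Z]{16}/ },
  { name: 'Generic API key assignment', regex: /api[_-]?key\s*[:=]\s*['"][A-Za-z0-9]{20,}['"]/i },
  { name: 'Private key block', regex: /-----BEGIN (RSA |EC )?PRIVATE KEY-----/ },
];

function scanSource(source) {
  return SECRET_PATTERNS
    .filter(({ regex }) => regex.test(source))
    .map(({ name }) => name);
}

const findings = scanSource(`
  const apiKey = "abcdefghij1234567890abcd";
  const s3 = new S3({ accessKeyId: "AKIAIOSFODNN7EXAMPLE" });
`);
console.log(findings); // ['AWS access key ID', 'Generic API key assignment']
```

<p>Run over a whole repository, even a handful of rules like these catches the embarrassing stuff before it lands in a public Git history.</p>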
]]></content:encoded></item><item><title>If I Were Starting a New SaaS Today, I'd Do This</title><link>https://lakshminp.com/2025/10/if-i-were-starting-a-saas-today/</link><pubDate>Wed, 15 Oct 2025 00:00:00 +0000</pubDate><author>Lakshmi Narasimhan</author><guid isPermaLink="true">https://lakshminp.com/2025/10/if-i-were-starting-a-saas-today/</guid><category>essays</category><category>saas</category><description>Most SaaS projects fail because founders spend weeks building scaffolding instead of features. Here&amp;rsquo;s how to skip the boilerplate and ship fast.
I&amp;rsquo;ve built half a dozen SaaS products. Some succeeded. Most failed. But the failures taught me something critical: ideas don&amp;rsquo;t die from competition—they die from delayed launches.
You lose weeks setting up databases, authentication, APIs, file uploads, and admin panels before writing a single line of actual product code. By the time you&amp;rsquo;re ready to ship, momentum is gone.</description><content:encoded>&lt;![CDATA[<p>Most SaaS projects fail because founders spend weeks building scaffolding instead of features. Here&rsquo;s how to skip the boilerplate and ship fast.</p><p>I&rsquo;ve built half a dozen SaaS products. Some succeeded. Most failed. But the failures taught me something critical:<strong>ideas don&rsquo;t die from competition—they die from delayed launches</strong>.</p><p>You lose weeks setting up databases, authentication, APIs, file uploads, and admin panels before writing a single
line of actual product code. By the time you&rsquo;re ready to ship, momentum is gone.</p><p>If I were starting today, I&rsquo;d skip all that. I&rsquo;d use Supabase—and I&rsquo;d ship an MVP in days, not months.</p><h2 id="the-problem-with-building-from-scratch">The Problem with &ldquo;Building from Scratch&rdquo;</h2><p>Building foundations feels productive. You&rsquo;re writing code, making decisions, setting up infrastructure. But you&rsquo;re
not building anything users can touch.</p><p>Auth alone consumes days: password hashing, session management, password resets, email verification. Then you need
database migrations, API routes, input validation, error handling. Before you know it, you&rsquo;ve burned two weeks on
scaffolding.</p><p><strong>That&rsquo;s two weeks you could&rsquo;ve spent validating whether anyone actually wants what you&rsquo;re building.</strong></p><h2 id="why-supabase-changes-everything">Why Supabase Changes Everything</h2><p>Supabase isn&rsquo;t just a database. It&rsquo;s a complete backend—authentication, storage, real-time updates, edge functions—packaged
as a single platform. And unlike Firebase, it&rsquo;s built on PostgreSQL, so you&rsquo;re not locked into proprietary tech.</p><h3 id="postgresql-foundation">PostgreSQL Foundation</h3><p>Every table you create automatically gets REST and GraphQL APIs. No backend needed. Query directly from your frontend
with row-level security enforcing permissions at the database layer.</p><div class="lnp-codeblock lnp-code-quiet"><div class="lnp-codeblock-head"><span class="lnp-lang">javascript</span><button type="button" class="lnp-codeblock-copy" data-copy="" aria-label="Copy code">copy</button></div><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-javascript" data-lang="javascript"><span class="line"><span class="cl"><span class="c1">// Fetch user's tasks directly from the frontend</span></span></span><span class="line"><span class="cl"><span class="kr">const</span> <span class="p">{</span> <span class="nx">data</span><span class="p">,</span> <span class="nx">error</span> <span class="p">}</span> <span class="o">=</span> <span class="kr">await</span> <span class="nx">supabase</span></span></span><span class="line"><span class="cl">  <span class="p">.</span><span class="nx">from</span><span class="p">(</span><span class="s1">'tasks'</span><span class="p">)</span></span></span><span class="line"><span class="cl">  <span class="p">.</span><span class="nx">select</span><span class="p">(</span><span class="s1">'*'</span><span class="p">)</span></span></span><span class="line"><span class="cl">  <span class="p">.</span><span class="nx">eq</span><span class="p">(</span><span class="s1">'user_id'</span><span class="p">,</span> <span class="nx">userId</span><span class="p">);</span></span></span></code></pre></div></div><p>You still get full PostgreSQL power: triggers, extensions, stored procedures, joins, indexes. It&rsquo;s not a toy database—it&rsquo;s
enterprise-grade Postgres with a developer experience that doesn&rsquo;t suck.</p><h3 id="authentication-that-just-works">Authentication That Just Works</h3><p>Built-in support for email/password, magic links, OTPs, OAuth (Google, GitHub, etc.), and custom SSO. User records live
in your Postgres schema. Add custom fields. Create relationships. No vendor lock-in.</p><div class="lnp-codeblock lnp-code-quiet"><div class="lnp-codeblock-head"><span class="lnp-lang">javascript</span><button type="button" class="lnp-codeblock-copy" data-copy="" aria-label="Copy code">copy</button></div><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-javascript" data-lang="javascript"><span class="line"><span class="cl"><span class="c1">// Sign up with email/password</span></span></span><span class="line"><span class="cl"><span class="kr">const</span> <span class="p">{</span> <span class="nx">user</span><span class="p">,</span> <span class="nx">error</span> <span class="p">}</span> <span class="o">=</span> <span class="kr">await</span> <span class="nx">supabase</span><span class="p">.</span><span class="nx">auth</span><span class="p">.</span><span class="nx">signUp</span><span class="p">({</span></span></span><span class="line"><span class="cl">  <span class="nx">email</span><span class="o">:</span> <span class="s1">'user@example.com'</span><span class="p">,</span></span></span><span class="line"><span class="cl">  <span class="nx">password</span><span class="o">:</span> <span class="s1">'secure-password'</span></span></span><span class="line"><span class="cl"><span class="p">});</span></span></span><span class="line"><span class="cl"/></span><span class="line"><span class="cl"><span class="c1">// Magic link (passwordless)</span></span></span><span class="line"><span class="cl"><span class="kr">await</span> <span class="nx">supabase</span><span class="p">.</span><span class="nx">auth</span><span class="p">.</span><span class="nx">signInWithOtp</span><span class="p">({</span></span></span><span class="line"><span class="cl">  <span class="nx">email</span><span class="o">:</span> <span class="s1">'user@example.com'</span></span></span><span class="line"><span class="cl"><span class="p">});</span></span></span></code></pre></div></div><p>No JWT libraries. 
No session stores. No password reset flows. It&rsquo;s handled. You write product code.</p><h3 id="real-time-updates-without-redis">Real-Time Updates Without Redis</h3><p>WebSocket-based subscriptions give you instant updates on table changes. No message brokers. No Kafka. No Redis pub/sub.</p><div class="lnp-codeblock lnp-code-quiet"><div class="lnp-codeblock-head"><span class="lnp-lang">javascript</span><button type="button" class="lnp-codeblock-copy" data-copy="" aria-label="Copy code">copy</button></div><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-javascript" data-lang="javascript">// Subscribe to new messages
supabase
  .channel('messages')
  .on('postgres_changes', {
    event: 'INSERT',
    schema: 'public',
    table: 'messages'
  }, (payload) =&gt; {
    console.log('New message:', payload.new);
  })
  .subscribe();</code></pre></div></div><p>Insert a row in your messages table? All connected clients receive it instantly. Build chat, notifications, live dashboards—without
standing up infrastructure.</p><h3 id="row-level-security">Row-Level Security</h3><p>Database-level authorization policies replace entire backend authorization layers. One policy line defines who can access what.</p><div class="lnp-codeblock lnp-code-quiet"><div class="lnp-codeblock-head"><span class="lnp-lang">sql</span><button type="button" class="lnp-codeblock-copy" data-copy="" aria-label="Copy code">copy</button></div><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-sql" data-lang="sql">-- Users can only see their own tasks
CREATE POLICY "Users see own tasks"
  ON tasks FOR SELECT
  USING (auth.uid() = user_id);</code></pre></div></div><p>Policies compose. Multi-tenant? Add tenant_id checks. Admin override? Add role conditions. Security moves from scattered
backend checks to centralized, auditable rules.</p><h3 id="storage--edge-functions">Storage &amp; Edge Functions</h3><p>Native file upload handling with access rules compatible with row-level security. TypeScript-based edge functions deploy
in seconds for webhooks, scheduled jobs, or integrations.</p><div class="lnp-codeblock lnp-code-quiet"><div class="lnp-codeblock-head"><span class="lnp-lang">javascript</span><button type="button" class="lnp-codeblock-copy" data-copy="" aria-label="Copy code">copy</button></div><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-javascript" data-lang="javascript">// Upload file with automatic access control
const { data, error } = await supabase.storage
  .from('avatars')
  .upload(`${userId}/avatar.png`, file);

// Edge function for webhook processing
import { serve } from 'https://deno.land/std/http/server.ts'

serve(async (req) =&gt; {
  const payload = await req.json();
  // Process webhook
  return new Response('OK', { status: 200 });
});</code></pre></div></div><h2 id="the-developer-experience-you-deserve">The Developer Experience You Deserve</h2><p>The CLI, dashboard, SQL editor, and APIs feel cohesive. You&rsquo;re not juggling five different tools with five different
authentication methods. Everything integrates.</p><p>Need to see your database? Open the dashboard. Want to test a query? Use the SQL editor. Ready to deploy a function? <code>supabase functions deploy</code>. It just works.</p><h2 id="open-source--portability">Open Source &amp; Portability</h2><p>Unlike Firebase, Supabase can be self-hosted via Docker Compose. Start on their hosted platform. Move to self-hosted
if you outgrow it. Same codebase. Same developer experience.</p><p>You&rsquo;re not locked in. Your data is Postgres. Your auth is Postgres. Your files are S3-compatible storage. If Supabase
disappears tomorrow, you can migrate. Try doing that with Firebase.</p><h2 id="ship-fast-own-your-stack-avoid-unnecessary-complexity">Ship Fast, Own Your Stack, Avoid Unnecessary Complexity</h2><p>This is the indie developer playbook: start small, ship fast, scale naturally. Supabase embodies that philosophy.</p><p>You&rsquo;re not choosing between a custom backend and a proprietary platform. Supabase sits in the middle—powerful enough
for serious applications, simple enough to start with one table and an auth flow.</p><p><strong>If I were starting a SaaS today, I&rsquo;d skip the scaffolding. I&rsquo;d use Supabase. And I&rsquo;d ship in days—not weeks.</strong></p><p>Because the best way to validate an idea isn&rsquo;t to build perfect infrastructure. It&rsquo;s to put something in front of users
and learn whether they care.</p><p>Supabase gets you there faster. And when you&rsquo;re running solo, speed is everything.</p>
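<p>A footnote for the skeptics: if row-level security still feels abstract, the mental model is simple. A policy is just a predicate the database evaluates against every row before the client ever sees it. Here&rsquo;s a toy sketch in plain JavaScript&mdash;no Supabase client involved; the table and rows are invented purely for illustration:</p>

```javascript
// Toy model of row-level security. In Postgres, the policy
//   CREATE POLICY ... USING (auth.uid() = user_id)
// is a per-row predicate; we mimic it here with a filter.
const tasks = [
  { id: 1, user_id: 'alice', title: 'Ship MVP' },
  { id: 2, user_id: 'bob', title: 'Write docs' },
  { id: 3, user_id: 'alice', title: 'Invoice client' },
];

// The policy as a predicate: auth.uid() = user_id
const policy = (uid) => (row) => row.user_id === uid;

// What `select * from tasks` returns for a given user under the policy
const visibleTo = (uid) => tasks.filter(policy(uid));

console.log(visibleTo('alice').map((t) => t.id)); // alice sees tasks 1 and 3
console.log(visibleTo('bob').map((t) => t.id)); // bob sees only task 2
```

<p>The real thing runs inside Postgres, so rows a user isn&rsquo;t allowed to see never leave the database, no matter what the client asks for.</p>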
]]></content:encoded></item><item><title>AWS Is Overrated</title><link>https://lakshminp.com/2025/10/aws-is-overrated/</link><pubDate>Sat, 11 Oct 2025 00:00:00 +0000</pubDate><author>Lakshmi Narasimhan</author><guid isPermaLink="true">https://lakshminp.com/2025/10/aws-is-overrated/</guid><category>essays</category><category>kubernetes</category><description>If you’re an indie dev building your first SaaS, AWS is not your friend.
It’s a maze of services, dashboards, and acronyms pretending to make you productive while quietly billing you for curiosity.
Sure, it’s “the industry standard.” But here’s the thing: you’re not Netflix. You’re not Stripe. You don’t need fifteen managed services to ship an MVP. You just need one working prototype in front of users.
When I started shipping my own SaaS projects, I defaulted to AWS too. Everyone said it was the “serious” choice. I spun up EC2s, tinkered with VPCs, IAM roles, and CloudWatch dashboards.</description><content:encoded>&lt;![CDATA[<p>If you’re an indie dev building your first SaaS, AWS is not your friend.</p><p>It’s a maze of services, dashboards, and acronyms pretending to make you productive while quietly billing you for curiosity.</p><p>Sure, it’s “the industry standard.” But here’s the thing: you’re not Netflix. You’re not Stripe. You don’t need fifteen managed services to ship an MVP. You just need one working prototype in front of users.</p><p>When I started shipping my own SaaS projects, I defaulted to AWS too. Everyone said it was the “serious” choice. I spun up EC2s, tinkered with VPCs, IAM roles, and CloudWatch dashboards.</p><p>Two weeks later, my app still wasn’t live. But my bill was.</p><p>That’s when it clicked. AWS is optimized for <em>scale,</em> not <em>speed.</em> It’s designed for teams with DevOps pipelines, budgets, and compliance officers.
Indie devs have none of those.</p><p><strong>Here’s the real problem:</strong></p><blockquote><p>AWS makes you <em>feel</em> productive because it has a service for everything.</p></blockquote><p>But it slows you down because you end up <em>assembling</em> infrastructure instead of shipping software.</p><p>You’re busy wiring VPCs while your users are waiting for a login page.</p><p>If you’re building your first SaaS, you’re better off with:</p><ul><li><p><strong>Render</strong> or <strong>Fly.io</strong> for fast deploys.</p></li><li><p><strong>Railway</strong> or <strong>Supabase</strong> if you love simplicity.</p></li><li><p><strong>DigitalOcean App Platform</strong> for a managed middle ground.</p></li><li><p>Or even your own <strong>K3s</strong> box on a $30 DigitalOcean droplet if you like to tinker. (More on this in future posts.)</p></li></ul><p>You’ll have full control, predictable costs, and a deploy story you can explain in a single sentence.</p><p>That’s what matters at your stage — not five-nines availability across three regions.</p><p>AWS will always have its place. It’s incredible at running serious workloads, regulated systems, and multi-tenant platforms at scale.</p><p>But for indie devs trying to launch, learn, and iterate fast — it’s <em>overkill.</em></p><p>Use the simplest stack that lets you ship.</p><p>Add complexity only when success forces you to.</p><p>Because nothing kills momentum faster than debugging IAM policies instead of building features.</p><h2 id="tldr">TL;DR</h2><p>If you’re a solo founder or small team, your advantage isn’t scale — it’s speed.</p><p>Don’t trade that away for a cloud that was never built for you.</p><p>I share one short post daily-ish for productive indie developers — how to ship faster, cheaper, and saner. Subscribe if that’s your vibe.</p>
]]></content:encoded></item><item><title>AI Can Build Anything—Except Product Taste</title><link>https://lakshminp.com/2025/10/ai-product-taste/</link><pubDate>Sat, 04 Oct 2025 00:00:00 +0000</pubDate><author>Lakshmi Narasimhan</author><guid isPermaLink="true">https://lakshminp.com/2025/10/ai-product-taste/</guid><category>essays</category><category>craft</category><description>Everyone says AI makes you “10x more productive.” I’m not sure about that. What it actually made me is… more deliberate.
When execution is basically free, the bottleneck shifts. It’s no longer can I build this? It’s should I build this?
That sounds obvious, but most of us (me included) are terrible at it. We confuse motion for progress. And AI just cranks the treadmill speed up to 11.
Here’s what I mean.</description><content:encoded>&lt;![CDATA[<p>Everyone says AI makes you “10x more productive.” I’m not sure about that. What it actually made me is… more deliberate.</p><p>When execution is basically free, the bottleneck shifts. It’s no longer <em>can I build this?</em> It’s <em>should I build this?</em></p><p>That sounds obvious, but most of us (me included) are terrible at it. We confuse motion for progress. And AI just cranks the treadmill speed up to 11.</p><p>Here’s what I mean.</p><p>The other weekend, I let myself play around with the idea of “AI-generated developer dashboards.” Normally, something like that would eat a week of evenings. This time, I had three versions running before breakfast: a React prototype, a Python backend spitting out metrics, and a half-decent mock landing page.</p><p>Impressive? Maybe. Useful? Not really. By Sunday night I realized I’d basically built three beautifully useless toys. Execution had been trivial. The problem was never execution—it was me chasing shiny objects.</p><p>That’s the AI paradox. It lowers the cost of building so much that the real scarcity becomes <em>taste</em>. Judgment. The ability to say no.</p><p>Because here’s the dark side: the opportunity cost of distraction just went up. Before, if I burned a week tinkering on something dumb, at least I learned a few low-level tricks. Now I can burn a week and end up with a full microservice, a CI/CD pipeline, and a Terraform config… for an idea that didn’t deserve any of it. Congratulations, I’ve industrialized my dead ends.</p><p>I’ve caught myself doing this with infrastructure experiments, too. AI will happily generate Kubernetes manifests, Helm charts, and CI workflows for whatever harebrained service I throw at it. The code even looks plausible at first glance. Then I deploy it, watch it explode, and realize the whole thing never needed to exist in the first place.
It’s the most polished waste of time imaginable.</p><p>And this is why restraint has suddenly become a superpower. The real work isn’t generating more; it’s filtering harder. AI will give you 50 rabbit holes before lunch. If you’re not ruthless about which one you go down, you’re just automating your own distraction.</p><p>The old mantra was “ship fast and break things.” AI makes that easier than ever. But there’s a hidden multiplier effect: fast execution with bad strategy doesn’t just fail—it fails <em>louder</em>. You don’t just waste time, you waste time at scale. Meanwhile, the teams with clear strategy and discipline can use the exact same tools to compound wins. Same technology, wildly different outcomes.</p><p>This is why I think “thinking” has quietly become underrated. Tinkering used to be the path to learning. Now tinkering is dangerous. You can dig a perfect hole in the wrong place faster than ever. Spending more time deciding where to dig—that’s the skill worth leveling up.</p><p>Developers don’t usually like to hear that. We want to build. But in an AI-first world, the rarest and most valuable act might be… <em>not</em> building. Closing the tab. Saying no to the prototype. Choosing boredom over the dopamine hit of “look what I got running.”</p><p>So no, AI didn’t make me more productive. It made me picky. It forced me to care about what I was building in the first place.</p><p>And that’s the paradox: AI made execution trivial, so the premium is now on taste, judgment, and focus.</p><p>If you can’t decide what matters, AI will happily help you drown in what doesn’t.</p>
]]></content:encoded></item></channel></rss>