Last summer, I wrote a post about the hidden costs of AI-assisted development. My core argument was that heavy reliance on AI code generation creates a "meta-level leaky abstraction" - you end up with code you don't deeply understand, and when the AI is wrong or unavailable, you have nothing to fall back on. I still believe that. But my relationship with these tools has changed meaningfully since then, and I think it's worth documenting where I've landed.
What Changed
The tools got better. Not in a hand-wavy, hype-cycle way - in a practical, day-to-day way. The models are more reliable, the context windows are larger, and the ecosystem of agentic tooling around them has matured. That matters because the failure modes I was worried about in August 2025 haven't disappeared, but they've become more manageable. The gap between "what the AI produces" and "what I would have written" has narrowed enough that the cost-benefit math shifted.
The Agent
At work, we implemented an automated agent pipeline that picks up Jira tickets and turns them into pull requests with minimal human involvement in the implementation step. The flow looks like this:
```mermaid
flowchart TD
    A["Jira Ticket (assigned to agent)"] --> B["OMP + Claude (implements changes)"]
    B --> C["Branch + PR (pushed automatically)"]
    C --> D["Human Review (code review)"]
    D -->|Approved| E["Merge (manual)"]
    D -->|Changes Needed| F["Revise (human or agent)"]
    F --> C
```
The agent polls and picks up any assigned Jira tasks, uses OMP (Oh My Pi) with Claude to implement the changes, pushes up a branch, and opens a PR. A human reviews the code and manually merges it. Right now we're scoping it to trivial tickets - straightforward bug fixes, small feature additions, copy changes, things like that.
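The polling-and-pickup step can be sketched in a few lines. This is a minimal illustration, not our actual implementation: the `AGENT_USER` account name, the `"trivial"` label, and the ticket-dict shape are all assumptions, and the real pipeline talks to the Jira API and shells out to OMP where the comment indicates.

```python
import re

AGENT_USER = "agent-bot"  # hypothetical Jira account the agent polls for

def branch_name(ticket_key: str, summary: str) -> str:
    """Derive a git branch name like 'agent/PROJ-12-fix-login-typo'."""
    slug = re.sub(r"[^a-z0-9]+", "-", summary.lower()).strip("-")[:40]
    return f"agent/{ticket_key}-{slug}"

def eligible(ticket: dict) -> bool:
    """Only pick up trivial tickets explicitly assigned to the agent."""
    return (
        ticket.get("assignee") == AGENT_USER
        and ticket.get("status") == "To Do"
        and "trivial" in ticket.get("labels", [])
    )

def process(tickets: list[dict]) -> list[str]:
    """Return the branch names for tickets the agent would pick up."""
    work = []
    for t in tickets:
        if eligible(t):
            work.append(branch_name(t["key"], t["summary"]))
            # In the real pipeline, this is where OMP + Claude implement
            # the change, push the branch, and open the PR.
    return work
```

The important design choice is the explicit eligibility gate: the agent never self-selects work, it only acts on tickets a human has deliberately routed to it.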
The Rule That Makes It Work
We have one rule that I think is doing most of the heavy lifting:
Whoever merges the AI-generated code owns it.
That means you are responsible for understanding what it does, how it fits into the broader system, and what happens if it breaks - exactly as if you had written it yourself. This isn't a suggestion or a cultural norm. It's an explicit expectation.
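One way to make that expectation mechanical rather than purely cultural is a merge gate. The sketch below is hypothetical, not something we run today: it assumes the agent labels its PRs `ai-generated` and that the would-be merger must explicitly claim ownership in a review comment before the merge is allowed.

```python
AI_LABEL = "ai-generated"          # hypothetical label the agent adds to its PRs
OWNERSHIP_PHRASE = "I own this change"  # hypothetical acknowledgement string

def merge_allowed(pr: dict, merger: str) -> bool:
    """Block merging an AI-labelled PR unless the merger has explicitly
    claimed ownership in a comment. Human-written PRs are unaffected."""
    if AI_LABEL not in pr.get("labels", []):
        return True  # normal review process applies
    return any(
        comment["author"] == merger and OWNERSHIP_PHRASE in comment["body"]
        for comment in pr.get("comments", [])
    )
```

A check like this could run as a required CI status; the point is that it turns "you merge it, you own it" from a norm into a recorded, auditable act.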
Our hope is that this rule addresses the core concern I raised in my original post. It should force the reviewer to deeply engage with the code. You can't rubber-stamp an AI-generated PR the way you might lazily approve a colleague's. In theory, the ownership model means the "leaky abstraction" problem doesn't disappear, but gets contained - because someone on the team is always building and maintaining understanding of what went in.
It's too early to say whether this actually holds up in practice. We'll be watching closely. If it turns out that ownership-on-merge isn't enough to maintain the depth of understanding we need, we'll have to find other ways to address it. But it's the best starting point we've found so far.
The Question I Can't Stop Thinking About
Here's the thing, though: our team is lean and top-heavy. We're experienced developers working in a domain we know deeply. The ownership rule works because we already have the institutional knowledge to evaluate what the AI produces. We can spot when something is subtly wrong because we've been in this codebase for years. We know the patterns, the edge cases, the parts of the system that look simple but aren't.
What happens when we add someone newer to the team?
Historically, junior developers build that deep understanding by doing the work - writing code, making mistakes, debugging their way through problems, and slowly building a mental model of the system. If the agent is handling the trivial tickets - the ones that used to be the on-ramp for newer developers - where does that learning come from?
You could argue the learning shifts to code review. Instead of writing the code, you're reviewing it, which still requires understanding. But I'm not sure that's equivalent. There's something about the struggle of implementation - the false starts, the refactoring, the "oh, that's why it's done this way" moments - that I don't think you get from reading a clean diff.
I don't have a good answer for this yet. Maybe the answer is that junior developers shouldn't use the agent at all until they've built sufficient context. Maybe the answer is that the nature of "junior" work changes entirely. But it's the thing I keep coming back to, and I think any team adopting these workflows seriously needs to think about it.
Where This Goes
We're planning to expand the agent to more complex tickets. I suspect we'll hit a ceiling - a complexity threshold where the ownership cost of reviewing AI-generated code exceeds the cost of just writing it yourself. Finding that line is going to be the interesting part.
More interesting still, the pattern is starting to spread beyond engineering. Our designers have asked about building a similar workflow using Figma's MCP integration - the idea being that an agent could pick up design tasks and produce initial mockups or component updates for a designer to review and refine. Same ownership model, different medium.
On the other end of the pipeline, we're exploring an agent that monitors Sentry, analyzes error patterns and stack traces, and automatically creates Jira tickets from its findings. If that works, you can start to see the shape of a much larger loop: errors surface automatically, get triaged into tickets, and some of those tickets get picked up and resolved by the coding agent - with humans reviewing and approving at each stage, but not necessarily initiating any of it.
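The triage end of that loop could start as simply as grouping error events and drafting tickets above a recurrence threshold. This is a sketch of the idea, not a working integration: the event shape, the `THRESHOLD` value, and the label names are assumptions, and a real version would consume Sentry's API and create the tickets via Jira's.

```python
from collections import Counter

THRESHOLD = 10  # hypothetical: only file a ticket for recurring error patterns

def draft_tickets(events: list[dict]) -> list[dict]:
    """Group error events by fingerprint and draft a Jira ticket for any
    pattern that crosses the threshold."""
    counts = Counter(e["fingerprint"] for e in events)
    drafts = []
    for fingerprint, n in counts.items():
        if n >= THRESHOLD:
            sample = next(e for e in events if e["fingerprint"] == fingerprint)
            drafts.append({
                "summary": f"[sentry] {sample['message']} ({n} events)",
                "labels": ["trivial", "auto-triaged"],  # routes into the agent's queue
            })
    return drafts
```

Labelling the drafted tickets the same way as human-filed trivial work is what closes the loop: the coding agent can then pick them up with no new machinery.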
That's exciting and a little unnerving in equal measure. Each individual step is fairly contained and reviewable. But the more you chain together, the more you need to trust the system as a whole - and the harder it becomes for any one person to hold the full picture in their head. Which, if you think about it, is just the leaky abstraction problem at a higher altitude.
I also think the ownership rule will need to evolve. Right now it's binary - you merge it, you own it. But as AI agents touch more of the workflow - from error detection to ticket creation to implementation - we might need more structured approaches to maintaining collective understanding. Maybe that looks like mandatory architecture docs for AI-generated features, or periodic "walk the codebase" sessions, or something else entirely.
My original concerns about leaky abstractions haven't gone away. If anything, I'm more convinced that the risk is real. But I've become more optimistic that it's a manageable risk - if you're intentional about it. The tools are good enough now that refusing to use them is leaving real productivity on the table. The trick is building processes that capture the productivity without sacrificing the understanding.
The abstraction still leaks. We just got better at knowing where to put the buckets.