Six People Shipping Like Thirty: Engineering at AI-Native Speed

In the month between late January and late February 2026, Glover Labs merged over 120 pull requests. Our team at the time: three founders and three engineers.
That number isn't a vanity metric. It's a structural consequence of how we build — and we think the pattern is more interesting than the number itself, because it's a pattern that any small engineering team can adopt.
The Feedback Loop, Not the Tool
Every article about AI-assisted development focuses on the tool. Which model. Which IDE integration. Which code completion engine. This framing misses the point.
The productivity gain at Glover doesn't come from any single AI tool. It comes from the feedback loops between tools — each one's output becoming the next one's input, with minimal human translation between steps.
Here's the actual loop: Claude generates code. GitHub Actions summarize what shipped. Slack distributes context across the team. Linear tracks what's next. Gemini transcribes meeting decisions into searchable notes. Claude consumes all of this to generate the next artifact. The cycle is continuous and self-reinforcing — every tool in the chain both produces and consumes context.
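The shape of that loop can be sketched as a pipeline where every stage's output is the next stage's input. This is an illustrative sketch only: the function names and strings are stand-ins, not Glover's actual integration code.

```python
# Hypothetical sketch of the tool-to-tool feedback loop: each stage
# consumes the previous stage's output as context. Names are illustrative.

def generate_code(context: str) -> str:
    """The model turns accumulated context into a code artifact."""
    return f"artifact from [{context}]"

def summarize_shipped(artifact: str) -> str:
    """A CI job condenses what shipped into a short summary."""
    return f"summary of {artifact}"

def distribute(summary: str) -> str:
    """Chat tooling broadcasts the summary as shared team context."""
    return f"team context: {summary}"

def plan_next(team_context: str) -> str:
    """The tracker turns shared context into the next work item."""
    return f"next ticket from {team_context}"

def run_loop(seed: str, cycles: int) -> str:
    """Run the cycle; context compounds instead of resetting each step."""
    context = seed
    for _ in range(cycles):
        artifact = generate_code(context)
        summary = summarize_shipped(artifact)
        shared = distribute(summary)
        context = plan_next(shared)
    return context
```

The point of the sketch is structural: no stage requires a human to re-explain what the previous stage did.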
Most teams use AI tools in isolation. A developer opens Copilot, gets a suggestion, accepts or rejects it, and goes back to working the way they always have. The AI is a faster autocomplete. It doesn't change the workflow. It just accelerates one step inside the existing workflow.
We reorganized the workflow itself around AI. That's the difference between a 10% productivity gain and a 3-5x one.
What AI-Native Actually Looks Like Day to Day
A few specifics, because abstractions are cheap.
Tiered model strategy. We run Claude across three tiers: Haiku for high-volume, simpler tasks like per-page code analysis (roughly 60% of our token load), Sonnet for medium-complexity work, and Opus for complex reasoning. This wasn't a theoretical decision — we were watching Anthropic charges climb past $1,100 and needed to baseline spend. Routing the right tasks to the right model tier cut costs without cutting capability.
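A tiered router can be as simple as a lookup from task type to model tier, with a cost estimate per call. The task labels, tier mapping, and prices below are illustrative placeholders, not Glover's actual configuration or Anthropic's current price list.

```python
# Minimal sketch of tiered model routing. Mappings and prices are
# illustrative assumptions, not real configuration or current pricing.

TIER_FOR_TASK = {
    "page_analysis": "haiku",    # high-volume, simple: cheapest tier
    "refactor": "sonnet",        # medium-complexity work
    "migration_plan": "opus",    # complex multi-step reasoning
}

# Placeholder per-million-input-token prices, for relative comparison only.
PRICE_PER_MTOK = {"haiku": 1.0, "sonnet": 3.0, "opus": 15.0}

def route(task_type: str) -> str:
    """Route known task types to their tier; default unknowns to the middle."""
    return TIER_FOR_TASK.get(task_type, "sonnet")

def estimated_cost(task_type: str, tokens: int) -> float:
    """Rough spend estimate for one task, used to baseline monthly cost."""
    return tokens / 1_000_000 * PRICE_PER_MTOK[route(task_type)]
```

With roughly 60% of token load on the cheapest tier, the spend curve is dominated by the cheap path rather than the expensive one.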
CLAUDE.md as institutional memory. Every engineer maintains CLAUDE.md files — persistent context documents that shape how Claude Code behaves in their specific workflow. One of our engineers discovered that structuring Linear tickets as end-to-end vertical slices (rather than layer-by-layer) dramatically improved Claude's ability to generate testable code. That insight got codified into CLAUDE.md so the pattern persists across every session, for every engineer, regardless of who originally figured it out. These files accumulate the team's hard-won knowledge about how to work with AI effectively — the kind of tacit knowledge that normally lives in someone's head and walks out the door when they leave. They're also how we onboard new engineers into AI-native workflows: instead of a two-week ramp-up where someone shadows a senior dev, the CLAUDE.md files encode the senior dev's patterns directly into the tooling.
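To make this concrete, here is a hypothetical excerpt of what codifying the vertical-slice insight into a CLAUDE.md might look like. The wording is invented for illustration; it is not a real Glover file.

```markdown
## Ticket structure (from Linear)

- Treat each ticket as an end-to-end vertical slice: one behavior,
  wired through UI, API, and storage, with its own test.
- Do not split work layer-by-layer; layer tickets lack the context
  needed to generate code that is testable on its own.
- Name the test that proves the slice works before writing code.
```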
Multi-agent code generation. Our code generation engine — we call it the Artificer — uses separated action, verification, and healing agents. The action agent writes code. The verification agent checks it against a programmatic test suite. If something fails, the healing agent fixes it. This isn't a single monolithic agent session running for 200 turns and accumulating errors. It's scoped 30-50 turn sessions per work unit with test gates between them — a structural fix for the cascading failure problem that hits every team running long agent sessions.
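The action/verification/healing split is straightforward to sketch. Agent internals are stubbed here and all names are illustrative; this is the shape of the control flow, not the Artificer's actual interfaces.

```python
# Sketch of separated action / verification / healing agents with a test
# gate per work unit. Agents are stubs; names are illustrative.

from dataclasses import dataclass

MAX_TURNS_PER_SESSION = 50  # scoped sessions, not one 200-turn run

@dataclass
class Result:
    code: str
    passed: bool

def action_agent(work_unit: str) -> str:
    """Writes a candidate implementation for one scoped work unit."""
    return f"code for {work_unit}"

def verification_agent(code: str) -> bool:
    """Stand-in for running the programmatic test suite on the candidate."""
    return "FIXME" not in code

def healing_agent(code: str) -> str:
    """Repairs a failing candidate in a fresh, scoped session."""
    return code.replace("FIXME", "fixed")

def process_unit(work_unit: str, max_heals: int = 2) -> Result:
    """One work unit: act, gate on tests, heal a bounded number of times."""
    code = action_agent(work_unit)
    for _ in range(max_heals + 1):
        if verification_agent(code):
            return Result(code, True)   # test gate passed: unit done
        code = healing_agent(code)
    return Result(code, False)          # bounded retries: escalate to a human
```

The key design choice is the bounded retry: errors cannot cascade past the test gate, because a unit either passes or gets handed to a person.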
Automated context distribution. A GitHub Action powered by Claude Haiku generates daily PR summaries and posts them to Slack. Nobody writes standup updates. Nobody reads standup updates. The team gets a daily digest of what actually shipped — auto-generated, zero overhead. We killed the standup bot that requested bulleted updates from each person. It was a ritual that produced no value once the automated summaries existed.
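The digest step itself is small. In this sketch the PR list would come from the GitHub API and the message goes to a Slack incoming webhook; the field names and formatting are assumptions, not the actual Action.

```python
# Sketch of the daily-digest step: condense merged PRs into one Slack
# message. PR fetching and the model call are out of scope; field names
# and formatting are illustrative.

import json

def build_digest(prs: list[dict]) -> str:
    """Turn yesterday's merged PRs into a one-message digest."""
    lines = [f"Shipped yesterday ({len(prs)} PRs):"]
    for pr in prs:
        lines.append(f"- #{pr['number']} {pr['title']}")
    return "\n".join(lines)

def slack_payload(digest: str) -> str:
    """JSON body for a Slack incoming-webhook POST."""
    return json.dumps({"text": digest})
```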
AI-generated intelligence briefing. A Slack bot we call Moriarty delivers automated daily intelligence to the team — scanning for industry news across agentic coding, competition, and legacy modernization. Six people can't afford to have anyone doing manual competitive research. So nobody does. The machine reads the internet and tells us what matters.
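The filtering side of a bot like this can start as simple keyword scoring against the beats the team cares about. The topics, weights, and threshold below are invented for illustration, not Moriarty's actual configuration.

```python
# Illustrative relevance filter for a daily intel bot: score scraped
# headlines against tracked beats. Topics and weights are assumptions.

TOPIC_WEIGHTS = {
    "agentic coding": 3,
    "legacy modernization": 3,
    "competitor": 2,
    "llm pricing": 1,
}

def score(headline: str) -> int:
    """Sum the weights of every tracked topic the headline mentions."""
    text = headline.lower()
    return sum(w for topic, w in TOPIC_WEIGHTS.items() if topic in text)

def daily_brief(headlines: list[str], threshold: int = 2) -> list[str]:
    """Keep only items above the attention threshold, highest-signal first."""
    kept = [(score(h), h) for h in headlines if score(h) >= threshold]
    return [h for _, h in sorted(kept, reverse=True)]
```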
The Uncomfortable Lessons
Not everything worked. A few things we tried and killed:
We ran a JIT compilation experiment that was supposed to catch issues early by running on every PR. In practice, it fired on every PR update, burned tokens at scale, and produced feedback that wasn't actionable. We killed it and moved to a sub-agent approach with a review prompt focused only on critical bugs — not style, not nits, just things that would break in production.
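A review prompt scoped this way mostly comes down to saying what the reviewer must ignore. The wording below is an illustrative sketch, not the actual prompt.

```python
# Sketch of a review prompt scoped to critical bugs only. The wording
# and the surrounding plumbing are illustrative assumptions.

REVIEW_PROMPT = """Review this diff for CRITICAL issues only:
- bugs that would break in production
- data loss or corruption
- security vulnerabilities
Do NOT comment on style, naming, formatting, or nitpicks.
If nothing critical is found, reply exactly: LGTM.

Diff:
{diff}
"""

def build_review_request(diff: str) -> str:
    """Fill the scoped review prompt with one PR's diff."""
    return REVIEW_PROMPT.format(diff=diff)
```

The explicit "reply exactly: LGTM" escape hatch matters: without it, models tend to manufacture comments to fill the silence.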
We initially ran full codebase migrations as our test suite. Slow and expensive. Switching to smaller reference repositories (eShop, BlogEngine.NET) cut test cycle cost and time dramatically without losing signal on the things that mattered.
Engineers were submitting complete PRs for prototype work, which triggered full review cycles for code that might get thrown away. We shifted to selective draft PRs for prototypes — saving review bandwidth for production code.
These aren't interesting as individual optimizations. They're interesting as a pattern: the AI-native workflow isn't something you design once. It's something you tune continuously, killing what doesn't work and doubling down on what does. The team that ships 120 PRs in month six bears almost no resemblance to the team that shipped 40 in month one — not because the tools got better, but because the feedback loops got tighter.
Why This Matters Beyond Glover
Research is starting to catch up to what small AI-native teams are experiencing on the ground. Faros AI's productivity research found that AI-assisted engineers create 98% more PRs per person than non-AI counterparts. Google's DORA data shows AI adoption increases throughput but also increases review burden — PR review times up 91%. OpenAI shipped the Sora Android app from prototype to global launch in 28 days with a four-person team.
The pattern across all of these: small, senior teams that treat AI as a workflow redesign problem — not a tool adoption problem — are shipping at velocities that were previously impossible. The productivity multiplier isn't in the model. It's in the loop.
But the DORA data also carries a warning. That 91% increase in review time isn't a bug — it's a predictable consequence of generating more code without rethinking how review works. Teams that bolt AI onto existing workflows get more output and more overhead simultaneously. The teams pulling ahead are the ones restructuring review itself: automated summarization, sub-agent review scoped to critical bugs, draft PRs for prototype work. The throughput gain only sticks if the downstream processes can absorb it.
For Glover specifically, this matters because we're building modernization infrastructure for enterprises that have been stuck for decades. We can't afford to be slow. Our customers are banks, insurers, and government agencies running systems that process trillions of dollars. They need a vendor that ships fast, iterates in the open, and treats engineering velocity as a core product capability — not a nice-to-have.
Six people, 120+ PRs, one month. The AI isn't replacing anyone on the team. It's the reason a team this small can build something this ambitious.
