This is Part 2 of the Hermes Kanban series. Part 1, Setting Up Your Agent Crew, covers the profiles, the kanban board, and how dispatch works. Here’s what happens when you actually run it.
Every tutorial on AI code review sells the same fantasy: one pass, zero friction, instant green. That’s not how it works — and more importantly, that’s not how it should work. We ran 7 review rounds across 96 files. Round 1 found 11 issues. Round 7 found zero. The decay curve isn’t failure. It’s the signature of a system that actually checks.
Seven rounds is not a bug
If your first reaction is “seven rounds? That’s broken,” I understand. Mine was too. But the pattern isn’t perpetual failure. It’s decay: roughly exponential, with one bump at round 5.
The multi-agent review loop is the feature, not the bug. Each round starts fresh because the reviewer has no memory of what you already fixed. It re-checks everything. That’s by design — no shared hallucination surface, no stale context bleeding between runs. The cost is time. The payoff is that round 7 actually means something.
The decay curve (11 → 7 → 3 → 2 → 5 → 1 → 0) is how you know it’s working.
The decay curve is the whole point
Here’s what the reviewer found, round by round:
| Run | What the reviewer found | What we fixed |
|---|---|---|
| 1 | Stale naming conventions + broken links | 11 source files patched |
| 2 | Outdated references + broken doc links | 7 documentation files scrubbed |
| 3 | Remaining cross-reference + compliance links | 3 key files fixed |
| 4 | Stale architecture references in diagrams | 2 diagram-heavy files rewritten |
| 5 | Pre-existing legacy references in old docs | 5 older docs scrubbed |
| 6 | Last broken cross-reference | 1 link path corrected |
| 7 | Clean pass | All 96 files greenlit |
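To make the shape of the curve concrete, here is a small sketch that tallies the per-round numbers. It assumes the file counts in the right-hand column of the table are the per-round finding counts (they match the 11 → 7 → 3 → 2 → 5 → 1 → 0 sequence quoted earlier); the variable names are illustrative only.

```python
# Per-round counts from the table (assumed to equal findings per round).
findings = [11, 7, 3, 2, 5, 1, 0]

total = sum(findings)

# Rounds where the count went up instead of down (round numbers are 1-based).
bumps = [
    round_no
    for round_no, (prev, cur) in enumerate(zip(findings, findings[1:]), start=2)
    if cur > prev
]

# Share of all findings already caught by the end of each round.
cumulative_share = []
running = 0
for count in findings:
    running += count
    cumulative_share.append(round(running / total, 2))

print(total)             # → 29
print(bumps)             # → [5]
print(cumulative_share)  # → [0.38, 0.62, 0.72, 0.79, 0.97, 1.0, 1.0]
```

The curve is not strictly monotonic: round 5 is the only increase, and it lines up with the legacy docs that surfaced in that round. Roughly 62% of all findings land in the first two rounds.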
Round 3 felt like the system was broken. Round 5 felt like grinding. Round 7 felt like evidence.
The review cycle: sweep → fix → unblock, repeated until clean
Three things drive this shape:
First, the reviewer starts fresh every run. It doesn’t remember round 3 when it opens round 4. That seems inefficient until you realize the alternative: a reviewer with memory would inherit every false assumption and hallucination from the previous run. The reset is a feature.
Second, pre-existing docs on main inject noise. The engineer touched five files; the reviewer scanned all ninety-six. It found bugs in files the engineer never opened. You fix them anyway — they’re blockers now.
Third, the patch tool has quirks. Fuzzy matching sometimes eats the leading | in markdown table rows. Each quirk surfaces as a “new” finding until you learn to verify your patches.
Tool quirks cost rounds until you document them
The patch tool’s fuzzy matching ate the leading pipe in a markdown table row. Cost us a full review round. After that, we added a rule: after every patch on a .md file with tables, run a second pass:
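A minimal sketch of such a pass, assuming the convention in your docs is that every table row begins with a pipe. The function name `rows_missing_leading_pipe` is hypothetical; it is not part of the patch tool or of Hermes.

```python
import re

# A delimiter row with two or more columns, such as |---|---| or ---|:--:|.
# The leading pipe is optional here because it is exactly what may be missing.
DELIMITER_ROW = re.compile(r"^\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)+\|?$")


def rows_missing_leading_pipe(markdown_text):
    """Return 1-based line numbers of table rows that lack a leading '|'."""
    problems = []
    block = []  # (line_number, stripped_line) for the current run of pipe lines

    def flush():
        # Only treat the run as a table if it contains a delimiter row.
        if any(DELIMITER_ROW.match(text) for _, text in block):
            problems.extend(n for n, text in block if not text.startswith("|"))
        block.clear()

    for number, raw in enumerate(markdown_text.splitlines(), start=1):
        stripped = raw.strip()
        if stripped and "|" in stripped:
            block.append((number, stripped))
        else:
            flush()
    flush()
    return problems


sample = "\n".join([
    "| Run | Found |",
    "|---|---|",
    " 1 | Stale naming |",        # leading pipe eaten by the patch
    "| 2 | Outdated references |",
])
print(rows_missing_leading_pipe(sample))  # → [3]
```

Prose lines that merely contain a pipe are ignored, because a run of lines only counts as a table when it includes a delimiter row.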
It sounds pedantic until you realize the table didn’t render and the reviewer flagged it as a missing reference. That’s a full round spent on a rendering quirk.
Telling a placeholder like [system_name].md from a real broken link is another example. Each one costs you a round until you write it down. The skill’s pitfalls section is where those rounds go to die.
This is the kind of thing you can’t anticipate until it bites you. The fix isn’t better tooling; it’s documenting the quirk the instant it surfaces so the next project doesn’t relearn it.
The pre-sweep is your best lever
After round 5, we stopped unblocking and hoping. The pattern now is procedural:
Sweep comprehensively
Before unblocking, run a comprehensive sweep across all files — grep for stale naming, check broken links, verify no net content loss. Fix everything you find in batch.
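The sweep described above can be sketched as a short script. This is an illustration under assumptions, not the crew's actual tooling: `pre_sweep`, `STALE_TERMS`, and the example term `old_system_name` are hypothetical, only inline markdown links to local files are checked, and the "no net content loss" check is left out.

```python
import re
from pathlib import Path

# Hypothetical stale terms; replace with the names your refactor retired.
STALE_TERMS = ["old_system_name"]

# Inline markdown links of the form [text](target).
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def pre_sweep(root, stale_terms=STALE_TERMS):
    """Scan every .md file under root; return a list of (file, line, issue)."""
    root = Path(root)
    issues = []
    for path in sorted(root.rglob("*.md")):
        name = path.relative_to(root).as_posix()
        text = path.read_text(encoding="utf-8")
        for number, line in enumerate(text.splitlines(), start=1):
            for term in stale_terms:
                if term in line:
                    issues.append((name, number, f"stale name: {term}"))
            for target in LINK.findall(line):
                # Skip external URLs, mail links, and same-page anchors.
                if re.match(r"^(https?:|mailto:|#)", target):
                    continue
                file_part = target.split("#", 1)[0]
                if not (path.parent / file_part).exists():
                    issues.append((name, number, f"broken link: {target}"))
    return issues
```

Calling `pre_sweep("docs")` on a hypothetical `docs` directory returns every hit in one list, which is what makes fixing in batch possible: you see all categories at once instead of one per review round.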
Commit with discipline
Each review round produces exactly one commit. The prefixes tell a story: docs: for initial work, review-fix: for reviewer findings, fix: for pre-sweep catches. When you need to roll back, you know exactly which commit to revert.
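As a sketch of how those prefixes can be read back out of a history, the snippet below maps each commit subject to the meaning given above. The commit subjects are made-up examples and `classify` is a hypothetical helper, not part of Hermes.

```python
from collections import Counter

# Prefix meanings as described in the text.
PREFIXES = {
    "docs:": "initial work",
    "review-fix:": "reviewer finding",
    "fix:": "pre-sweep catch",
}


def classify(subject):
    """Return the meaning of a commit subject's prefix, or None if it has none."""
    for prefix, meaning in PREFIXES.items():
        if subject.startswith(prefix):
            return meaning
    return None


# Hypothetical commit subjects, one per review round.
history = [
    "docs: restructure architecture pages",
    "review-fix: patch stale naming",
    "review-fix: scrub outdated references",
    "fix: pre-sweep of legacy docs",
]
summary = Counter(classify(subject) for subject in history)
print(summary["reviewer finding"])  # → 2
```

In a real repository, `git log --format=%s` prints one commit subject per line, which could feed the same function.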
Unblock the reviewer
With the pre-existing noise gone and obvious bugs already dead, release the reviewer back into the codebase. They still start fresh — but now they’re hunting real issues, not stale strings.
Sweep concept: exponential decay from 11 down to 0 across 7 review rounds
Before your next PR goes to review, add a pre-sweep to your workflow. Run it before you unblock. That one change cuts rounds by half on your second project.
Sweep when you can name all the remaining categories of bug. Unblock when you can’t. That’s the whole rule.
What we’d tell our past selves
If we were starting this refactor again, four things:
Set expectations early — you’ll run 3–7 rounds and that’s fine. The decay curve is the signal. A clean pass at round 1 means the reviewer didn’t look hard enough. A clean pass at round 7 means every stale string, broken anchor, and rendering quirk has been hunted down. Tell yourself this before the first review so you don’t panic at round 5.
The Trap
Expecting a clean pass on round 1. When it doesn’t happen, you assume the reviewer is broken or the workflow is wrong. You start tweaking the system instead of trusting the curve.
The Fix
Expect 3–7 rounds by default. If round 1 comes back clean, that’s suspicious — the reviewer didn’t look hard enough. If round 3 still has findings, that’s convergence in progress.
Document quirks the instant they bite you. The patch-tool pipe-eating bug is now in our skill’s pitfalls section. The next project won’t lose a round to it. This compounds — each documented quirk is one fewer surprise in the next review cascade.
Every quirk you document today saves a full review round on the next project. Future you does not need to discover the pipe-eating bug.
Sweep before you unblock, not after. Running your own comprehensive check across every file before releasing the reviewer catches issues in batches. The reviewer still starts fresh on the next run, but the pre-existing noise is gone and the obvious bugs are already dead.
Know when to break the rules. The standing rule is don’t do the work yourself — route everything through the engineer so it lands in a commit with an audit trail. The exception is purely mechanical fixes: a broken link, a stale string replacement, a regex substitution. When the fix is the signal and the reviewer is blocked waiting on you, fix it directly. Everything else goes through the engineer.
The Hermes kanban review loop isn’t a failure mode; it’s a convergence guarantee. The roughly exponential decay of findings per round (11 → 7 → 3 → 2 → 5 → 1 → 0, with the round 5 bump coming from legacy docs) is the honest shape of quality work when every run starts with a blank slate and a full scan. The levers that make it tolerable are: sweep before you unblock, commit each round separately with review-fix: prefixes, and document tool quirks immediately. Seven rounds sounds like a lot until you realize round 7 means every file was greenlit by a reviewer with no memory, no bias, and no reason to be nice.
Sources
- Hermes Agent Documentation: https://hermes-agent.nousresearch.com/docs/
- Hermes Kanban Refactor Playbook (internal crew documentation)
- Kanban Worker Skill: ~/.hermes/skills/devops/kanban-worker/SKILL.md