With great power...
Jonathan Hall
Builder Night · May 11, 2026
The first time I tried Claude Code...
With great power
comes great stupidity
Who I am
- Fractional Gopher
- Daily Blogger • https://boldlygo.tech
- Host of Cup o' Go podcast • https://cupogo.dev
Agenda
- How I work today (Live demo! 🎉)
- Where it still falls short
- What I'm building past it
The setup
- 3 Claude Code sessions, 1 git repo
- 3 skills: /todo, /tdd-go, /commit
- Each session in its own git worktree
- Auto-accept on edits — review per-cycle, not per-edit
- Plan mode rarely used — plans emerge while working
- House rules in CLAUDE.md (project + global)
Backlog management
/todo
- TODO.md, versioned with the code
- Source of truth for what's next
- "TODO means TODO" — done items are deleted
- Query: /todo what's next? · /todo quick wins
Memory in the system, not the agent.
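The "TODO means TODO" rule can be sketched as a small filter. This is a minimal sketch, assuming TODO.md uses Markdown task-list checkboxes (`- [ ]` and `- [x]`), which the slides do not specify. `pruneDone` is a hypothetical helper for illustration, not part of the /todo skill.

```go
package main

import (
	"fmt"
	"strings"
)

// pruneDone returns the TODO text with completed items removed.
// Assumption: items are Markdown task-list lines ("- [ ] open", "- [x] done").
func pruneDone(todo string) string {
	var kept []string
	for _, line := range strings.Split(todo, "\n") {
		trimmed := strings.TrimSpace(line)
		if strings.HasPrefix(trimmed, "- [x]") || strings.HasPrefix(trimmed, "- [X]") {
			continue // done items are deleted, not archived
		}
		kept = append(kept, line)
	}
	return strings.Join(kept, "\n")
}

func main() {
	todo := "- [ ] add retry to fetcher\n- [x] fix flaky test\n- [ ] document config"
	fmt.Println(pruneDone(todo))
}
```

Because the file is versioned with the code, deleted items remain recoverable from git history, so nothing is lost by pruning.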
Test-first discipline
/tdd-go
- Triggered automatically — a hook reminds Claude on every prompt
- Red → green → refactor as three subagents
- Each in fresh context, gated handoff between them
- The agent can't skip the failing test
Declare success before starting.
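The gated handoff between red, green, and refactor can be pictured as a tiny state machine. This is a sketch under stated assumptions, not the /tdd-go implementation: the `Cycle` type and `Advance` method are hypothetical, and the real skill hands off between subagents rather than calling a function.

```go
package main

import (
	"errors"
	"fmt"
)

// Phase is one step of the red/green/refactor cycle.
type Phase int

const (
	Red Phase = iota
	Green
	Refactor
	Done
)

// Cycle tracks the current phase. Hypothetical type for illustration only.
type Cycle struct {
	phase Phase
}

// Advance moves to the next phase only if the test result matches what the
// current phase must prove: red needs a failing test, green and refactor
// need a passing suite.
func (c *Cycle) Advance(testsPass bool) error {
	switch c.phase {
	case Red:
		if testsPass {
			return errors.New("red: test must fail before any implementation")
		}
	case Green, Refactor:
		if !testsPass {
			return errors.New("tests must pass before handoff")
		}
	case Done:
		return errors.New("cycle already complete")
	}
	c.phase++
	return nil
}

func main() {
	c := &Cycle{}
	fmt.Println(c.Advance(true))  // rejected: no failing test yet
	fmt.Println(c.Advance(false)) // red satisfied, moves to green
	fmt.Println(c.Advance(true))  // green satisfied, moves to refactor
}
```

The point of the design is that skipping the failing test is not a matter of agent discipline: the gate simply refuses to advance.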
Commit gating
/commit + hook
- Lint + tests must pass to commit
- Human-invoked only — Claude can't auto-trigger
- Hook blocks every other path to commit
The only door, every other window locked.
Where it still falls short
The orchestrator is human
- Tool approval on every command
- Serial review of parallel work
- Dispatch, routing, every interruption through me
- Waiting
The bottleneck isn't the agent. It's me.
Telling isn't learning
- Rules in CLAUDE.md get forgotten or overridden
- Long contexts drift from instructions
- The agent stays the same. I get smarter.
- "Memory" features just bloat the prompt
Prompt engineering is fragile at best.
The agent is fallible
- Says wrong things with the same confidence as right things
- Hallucinated APIs, plausible-but-broken reasoning
- I can only catch what I know to check for
- The scary errors are the ones I can't catch
...so are humans.
Replacing stupidity with responsibility
What they're selling
- The magic: "general intelligence"
- The fix: always "more" — bigger context, more prompt, more instruction, ultimately more GPUs, more power plants, more more MORE!
My take
- The magic: pattern matching on steroids
- The fix: specialization — smaller agents, narrower scope, single-purpose tools. Less is more.
What I've learned. What I'm betting on.
Learned
- LLMs amplify the system you put around them
- Declare success before starting
- Make the right path the only path
Betting
- The orchestrator doesn't have to be human
- Keep the agent dumb; make the system smart
- Human gates are scaffolding, not structural
Why Lindy
- Started exploring — nothing coherent existed
- Pieces exist: Finster's reviewer swarm, Yegge's Gas Town, the Wiggum loop, Claude skills
- Nothing puts them together
- And nothing takes antifragility seriously — which I think is foundational
An experiment in code. Some of it will turn out wrong.
Why "Lindy"?
- The Lindy effect: things that last tend to keep lasting
- Antifragile systems get stronger from use, not despite it (Taleb)
- Uses theater terminology internally (scene, beat, take, etc.)
Lindy directs; I produce
- I create a backlog of scenes (potentially prioritized)
- Lindy schedules and dispatches; scenes run autonomously
- I review post-scene
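The division of labor above can be sketched as a small dispatcher. This is an illustration under stated assumptions, not Lindy's actual code: `Scene`, `Result`, and `dispatch` are hypothetical names, "lower priority number runs first" is an assumed convention, and the `run` callback stands in for an autonomous scene.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Scene is a unit of backlog work. Hypothetical shape, not Lindy's actual type.
type Scene struct {
	Name     string
	Priority int // lower runs first; an assumed convention
}

// Result is what the human reviews after a scene finishes.
type Result struct {
	Scene  string
	Output string
}

// dispatch orders the backlog by priority, runs each scene in its own
// goroutine, and returns results in dispatch order for post-scene review.
func dispatch(backlog []Scene, run func(Scene) string) []Result {
	ordered := append([]Scene(nil), backlog...) // leave the caller's slice untouched
	sort.SliceStable(ordered, func(i, j int) bool {
		return ordered[i].Priority < ordered[j].Priority
	})

	results := make([]Result, len(ordered))
	var wg sync.WaitGroup
	for i, s := range ordered {
		wg.Add(1)
		go func(i int, s Scene) {
			defer wg.Done()
			// Each goroutine writes only its own index, so no lock is needed.
			results[i] = Result{Scene: s.Name, Output: run(s)}
		}(i, s)
	}
	wg.Wait()
	return results
}

func main() {
	backlog := []Scene{{"refactor parser", 2}, {"fix login bug", 1}}
	for _, r := range dispatch(backlog, func(s Scene) string { return "finished " + s.Name }) {
		fmt.Println(r.Scene, "->", r.Output)
	}
}
```

The human appears only at the two ends: building the backlog and reading the results.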
Build the rails
- Hooks and scene gates, not prompt rules
- Minimal prompts, isolated context per take
- Built in: only what's foundational — RED/GREEN (adversarial review for agents)
- Configurable: everything else — evolves through use
To err is ... also human
- Humans are fallible too — we've built systems for that
- Adversarial review
- Define success first
- Static analysis
- Fast recovery
- Lindy adapts them for AI agents
The collaborator changes. The discipline doesn't.
Lindy today
- Sandwiched between an OpenCode skill and raw OpenCode
- Per-scene workflow runs end-to-end
- Has completed real bug fixes — without tests
What's coming
In progress
- TDD loop: produces tests, proves the fix, prevents regression
Then: start building Lindy with Lindy.
Backlog
- Review swarm
- Model selection per beat
- Decomposition
- Multi-scene scheduler
- Web UI
Other goals
- Run on cheaper models where possible
- Not locked to a single provider
- Home lab deployment
- Multi-project orchestration
- Eventually: multi-tenant