Fable 5 vs Opus 4.8

Building the same feature twice

Myles Henderson · Builder Night Atlanta · June 2026

A controlled head-to-head. Interrupt me — questions welcome.

The Feature

A Postgres-wire-compatible driver: type minql into any pg client (e.g. psql) and get real rows back.

psql client → minql driver (speaks pg wire) → data warehouse → real rows

Read-only
psql-first
No real auth
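To make "speaks pg wire" concrete, here is a minimal sketch of how a read-only result set might be framed for a client like psql. It is not either model's code. The function names are hypothetical, every column is declared as text for simplicity, and the startup and auth handshake is left out. The message layouts (RowDescription, DataRow, CommandComplete, ReadyForQuery) follow the public Postgres frontend/backend protocol.

```python
import struct
from typing import Optional, Sequence

TEXT_OID = 25  # Postgres type OID for `text`; assumption: every column is sent as text


def frame(tag: bytes, payload: bytes) -> bytes:
    """One backend message: 1-byte tag, int32 length (counts itself, not the tag), payload."""
    return tag + struct.pack("!I", len(payload) + 4) + payload


def row_description(columns: Sequence[str]) -> bytes:
    """'T' message: field count, then one descriptor per column."""
    body = struct.pack("!H", len(columns))
    for name in columns:
        body += name.encode() + b"\x00"  # null-terminated column name
        # table OID, attribute number, type OID, type size (-1 = variable),
        # type modifier, format code (0 = text)
        body += struct.pack("!IhIhih", 0, 0, TEXT_OID, -1, -1, 0)
    return frame(b"T", body)


def data_row(values: Sequence[Optional[str]]) -> bytes:
    """'D' message: value count, then length-prefixed values (-1 length means NULL)."""
    body = struct.pack("!H", len(values))
    for value in values:
        if value is None:
            body += struct.pack("!i", -1)
        else:
            raw = value.encode()
            body += struct.pack("!i", len(raw)) + raw
    return frame(b"D", body)


def result_set(columns: Sequence[str], rows: Sequence[Sequence[Optional[str]]]) -> bytes:
    """Full reply to one simple query: description, rows, completion tag, ready-for-query."""
    out = row_description(columns)
    for row in rows:
        out += data_row(row)
    out += frame(b"C", f"SELECT {len(rows)}".encode() + b"\x00")
    out += frame(b"Z", b"I")  # 'I' = idle, not in a transaction
    return out
```

A real driver would also have to answer the startup message and parse the incoming Query message before replying; this sketch covers only the response side.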

The Method

Same opening prompt + same Socratic steering ladder to each model
Each in its own worktree, both from an unbuilt repo (no node_modules), acceptEdits mode
Arm A = Fable 5 (MGH-5) · Arm B = Opus 4.8 (MGH-8)
Metrics from Claude Code's OTel logs — billed cost, durations, tokens — plus transcripts

Fairness: Opus's run-1 (MGH-7) was discarded — a suggested reply said "stub the executor," pushing it to a mock. The clean restart (MGH-8) used verbatim prompts. Two recorded asymmetries remain (see divergences).

Scorecard

Metric	Fable 5 (A)	Opus 4.8 (B)
API cost	$31.26	$9.42 (3.3× cheaper)
All-in @ $200/hr	$184.59 (46m)	$206.09 (59m)
Model working time	~34 min	~27 min
Turns (human prompts)	9 (5 steer + 4 shape)	5 (all design, ~0 shape)
Tool calls	235	162
Time to first code	+15 min	+28 min
Time to first real row	~+26 min	+49 min
Runnable handoff	+32 min	~+49 min
Lines / files	796 / 4	597 / 4
Read-only fidelity	enforced	partial (materializing)
Ran end-to-end?	yes (prod data)	yes (dev data)

Tradeoff #1 — Cost

3.3× cheaper on API alone ($9.42 vs $31.26)
$184.59 Fable all-in (API + 46 min)
$206.09 Opus all-in (API + 59 min)

Opus costs half as much per token ($5/$25 vs $10/$50) and made roughly half the cache reads
Value human time at $200/hr and it flips: 13 extra minutes cost $43.33, which outweighs the $21.84 API saving
Breakeven ≈ $101/hr — above that, the faster model is cheaper all-in
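The all-in and breakeven figures can be reproduced from the scorecard numbers alone. The sketch below assumes all-in cost is simply API spend plus elapsed minutes billed at the hourly rate; the function names are illustrative.

```python
def all_in_cost(api_usd: float, minutes: float, hourly_rate_usd: float) -> float:
    """API spend plus the engineer's time at the given hourly rate."""
    return api_usd + hourly_rate_usd * minutes / 60


def breakeven_rate(api_fast: float, min_fast: float, api_slow: float, min_slow: float) -> float:
    """Hourly rate at which the faster, pricier arm and the slower, cheaper arm cost the same."""
    api_saving = api_fast - api_slow            # what the slower arm saves on API
    extra_hours = (min_slow - min_fast) / 60    # what it costs in human time
    return api_saving / extra_hours


fable = all_in_cost(31.26, 46, 200)
opus = all_in_cost(9.42, 59, 200)
print(round(fable, 2), round(opus, 2))                 # 184.59 206.09
print(round(breakeven_rate(31.26, 46, 9.42, 59), 2))   # 100.8
```

Below roughly $101/hr the API saving dominates and Opus is cheaper all-in; above it, the 13 extra minutes dominate.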

Tradeoff #2 — Speed & Launchability

Fable was biased toward action. Opus deliberated for 28 min (19 Bash, 12 Read, 3 gates) before writing line one, then coupled to app/RelationManager infra and only ran after standing up a throwaway container.

Behavioral Findings

Opposite order

Fable: build-then-refine — ran first, reshaped syntax live
Opus: discuss-then-build — settled all syntax in design first

Gating

Opus asked permission to build 3× and stayed in research far longer
Fable just went — strongest reproducible difference

Convergent instincts

Both reached for SELECT/BY/WHERE → steered to pure minql
Both used exactly 2 Explore subagents; both hand-rolled the wire protocol

Autonomy & safety

Opus found the prod wiring itself via a subagent
At the infra wall, it refused to touch the running dev stack

Met & Missed the Steers

Item	Category	Opus	Fable
Read-only	Dropped user requirement	Built the materializing write-path (CREATE/INSERT) — dropped at the plan stage	readonly creds + flattened CTEs, kept in plan & v1
jsep parser	Broken self-promise	Promised "reuse minql's jsep" → shipped a ~130-line string scanner	parsed the whole query with minql's own jsep
row_number	v1 defect, fixed on request	v1 leaked ROW_NUMBER → diagnosed + fixed cleanly when asked	right in v1 (sorts then hides)

Opus dropped the one explicit requirement Fable kept (read-only), broke its own parser promise, and corrected row_number only on request. It also returns every column as text and carries an "as unknown as" cast at the minql/GraphQL seam.
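The "sorts then hides" behavior in the row_number row can be illustrated with a small sketch. It is not either model's implementation. It assumes rows arrive as dictionaries carrying a hypothetical helper column named __rn, which is used for ordering and then stripped so it never reaches the client.

```python
from typing import Any, Dict, List


def sort_then_hide(rows: List[Dict[str, Any]], helper: str = "__rn") -> List[Dict[str, Any]]:
    """Order rows by the helper column, then drop it from the output."""
    ordered = sorted(rows, key=lambda row: row[helper])
    return [{k: v for k, v in row.items() if k != helper} for row in ordered]


rows = [
    {"name": "b", "__rn": 2},
    {"name": "a", "__rn": 1},
]
print(sort_then_hide(rows))  # [{'name': 'a'}, {'name': 'b'}]
```

A leak of the kind described for Opus's v1 corresponds to skipping the second step, so the helper column shows up in the result set.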

Two Philosophies

Fable 5

Running query in ~15–26 min → unlocked iterating on shape
Implemented every steer correctly
Better-engineered, more robust, more faithful artifact
…for ~3.3× the API cost

Opus 4.8

~3.3× cheaper on API; good at discussing the right design
v1 diverged from its stated design on two steers
No running query until ~+49 min → little shape iteration
Diagnosed + fixed cleanly when asked directly

The Net

Opus is cheaper on API, but more expensive all-in once human time counts (breakeven ~$101/hr)

Fable was faster to running — which unlocked refinement — and its v1 kept every requirement
Opus's one apparent advantage (cost) evaporates when the engineer's time counts
The API line item alone misleads.

The Biggest Lever

The human's wording moved the result most — one phrase ("stub the executor") sent run-1 to a mock
The clean run (MGH-8) is the fair read
Several quality findings surfaced only when the artifact was actually run (e.g. the leaked ROW_NUMBER)
Static reading alone over-credited Opus

Same task, two philosophies

Fable: speed, first-cut fidelity, lower total cost · Opus: design discussion, cheaper API

Artifacts: baseline-fable5.md · challenger-opus.md · divergences.md · quality-comparison.md · metrics.sql · opening-prompts.md