What I Built

error-agent

Myles Henderson · Builder Night Atlanta · April 2025

Interrupt me. Ask questions as they come up.

The Problem

Production errors create Linear tickets
Every ticket needs me to triage it, then fix it
Some are duplicates, some are not bugs, and some need fixing
I'd rather be building

Linear — error tickets · GitHub — PRs · commscenter.com · AWS

The Pipeline

Linear → open error tickets

↓

Classify — Claude reads the batch

↓

Fix — Claude Code subprocess

↓

GitHub PR + Linear comment

Runs Unattended in AWS

How Claude Runs Inside the Container

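import { spawn } from "node:child_process";

// -p is print mode: one prompt in, one answer out, no interactive session
const child = spawn("claude", [
  "-p", "-",
  "--dangerously-skip-permissions",
  "--output-format", "text",
], {
  cwd: clonedRepoPath,            // the freshly cloned repo
  env: sanitizeEnv(process.env),  // allowlist only
});

// The prompt goes in on stdin; closing stdin tells Claude the prompt is complete
child.stdin.write(prompt);
child.stdin.end();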

Claude Code as a subprocess. Prompt on stdin, result on stdout. Called once for classify, once per fix.
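
The "result on stdout" half is not shown in the snippet above. A minimal sketch of one way to collect it, assuming a promise wrapper is acceptable; the helper name runWithStdin and its error handling are hypothetical, not taken from error-agent:

```typescript
import { spawn } from "node:child_process";

// Hypothetical helper: run a command, write `input` to its stdin,
// and resolve with everything it printed on stdout.
function runWithStdin(
  command: string,
  args: string[],
  input: string,
  options: { cwd?: string; env?: NodeJS.ProcessEnv } = {},
): Promise<string> {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, options);
    let stdout = "";
    let stderr = "";
    child.stdout.setEncoding("utf8");
    child.stderr.setEncoding("utf8");
    child.stdout.on("data", (chunk: string) => { stdout += chunk; });
    child.stderr.on("data", (chunk: string) => { stderr += chunk; });
    child.on("error", reject); // e.g. the binary is not on PATH
    child.on("close", (code) => {
      if (code === 0) resolve(stdout);
      else reject(new Error(`${command} exited with code ${code}: ${stderr}`));
    });
    child.stdin.on("error", () => {}); // ignore EPIPE if the child exits early
    child.stdin.write(input);
    child.stdin.end();
  });
}
```

Called as runWithStdin("claude", ["-p", "-", "--output-format", "text"], prompt, { cwd: clonedRepoPath }), this would resolve with Claude's text reply, or reject on a non-zero exit.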

The Program Loop

Step 1: Classify

One Claude call classifies the entire batch:

{
  "tickets": [
    {
      "ticketId": "AI-141",
      "action": "fix",
      "reason": "Null pointer in auth middleware — code bug"
    },
    {
      "ticketId": "AI-142",
      "action": "notabug",
      "reason": "Expected behavior — rate limiter returns 429 by design"
    },
    {
      "ticketId": "AI-143",
      "action": "duplicate",
      "duplicateOf": "AI-142",
      "reason": "Same stack trace as AI-142"
    },
    {
      "ticketId": "AI-144",
      "action": "infra",
      "reason": "Database connection timeout — not a code fix"
    }
  ]
}
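
Model output is untrusted input, so it probably pays to validate this JSON before acting on it. A sketch of such a check, assuming the four actions in the example are the full set; parseClassifications and the type names are hypothetical:

```typescript
type Action = "fix" | "notabug" | "duplicate" | "infra";

interface Classification {
  ticketId: string;
  action: Action;
  reason: string;
  duplicateOf?: string; // only expected when action is "duplicate"
}

const ACTIONS: readonly string[] = ["fix", "notabug", "duplicate", "infra"];

// Hypothetical validator: parse Claude's reply and reject anything off-schema.
function parseClassifications(raw: string): Classification[] {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("expected a JSON object");
  }
  const tickets = (parsed as { tickets?: unknown }).tickets;
  if (!Array.isArray(tickets)) {
    throw new Error("expected a top-level tickets array");
  }
  return tickets.map((t, i) => {
    const { ticketId, action, reason, duplicateOf } = (t ?? {}) as Record<string, unknown>;
    if (typeof ticketId !== "string" || typeof reason !== "string") {
      throw new Error(`ticket ${i}: ticketId and reason must be strings`);
    }
    if (typeof action !== "string" || !ACTIONS.includes(action)) {
      throw new Error(`ticket ${i}: unknown action ${String(action)}`);
    }
    if (action === "duplicate" && typeof duplicateOf !== "string") {
      throw new Error(`ticket ${i}: duplicate needs duplicateOf`);
    }
    return {
      ticketId,
      action: action as Action,
      reason,
      ...(typeof duplicateOf === "string" ? { duplicateOf } : {}),
    };
  });
}
```

Throwing on the first bad entry is a design choice for the sketch; skipping the bad ticket and continuing with the rest of the batch would be equally defensible.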

Step 2: Fix

For each fix ticket: clone repo, create branch, pipe this into Claude Code:

You are fixing a production error. Here is the ticket:

**Title:** Null pointer in auth middleware
**Priority:** Urgent | **State:** Triage

**Description:**
TypeError: Cannot read properties of null (reading 'userId')
  at AuthMiddleware.verify (src/auth/middleware.ts:47)

## Instructions

1. Diagnose the root cause
2. Find and fix the bug — minimal, targeted changes
3. Do NOT refactor, add features, or write tests
4. Output JSON: { diagnosis, confidence, changedFiles }
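
The prompt above is a template filled in per ticket. A sketch of a builder, assuming a ticket carries the four fields shown; the FixTicket shape and buildFixPrompt are hypothetical, and the instruction wording is lightly paraphrased from the slide:

```typescript
// Assumed ticket shape; field names are illustrative, not error-agent's.
interface FixTicket {
  title: string;
  priority: string;
  state: string;
  description: string;
}

// Assemble the fix prompt line by line so the layout stays easy to diff.
function buildFixPrompt(ticket: FixTicket): string {
  return [
    "You are fixing a production error. Here is the ticket:",
    "",
    `**Title:** ${ticket.title}`,
    `**Priority:** ${ticket.priority} | **State:** ${ticket.state}`,
    "",
    "**Description:**",
    ticket.description,
    "",
    "## Instructions",
    "",
    "1. Diagnose the root cause",
    "2. Find and fix the bug with minimal, targeted changes",
    "3. Do NOT refactor, add features, or write tests",
    "4. Output JSON: { diagnosis, confidence, changedFiles }",
  ].join("\n");
}
```

Note that the ticket description is attacker-influenced text (it comes from production errors), which is one reason the container and environment allowlist matter.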

Step 3: Ship It

High confidence + changes

Commit & push
Open PR on GitHub
Comment on Linear ticket
Label: auto-fix-pr

Low confidence

Comment with diagnosis
Label: needs-human
Human takes it from here
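
The gate itself can be a small pure function. A sketch, assuming confidence is a three-level string (the slides only distinguish high from low) and that "high confidence but no changed files" also goes to a human; FixResult, Outcome, and decide are hypothetical names:

```typescript
interface FixResult {
  diagnosis: string;
  confidence: "high" | "medium" | "low"; // assumed scale
  changedFiles: string[];
}

type Outcome =
  | { kind: "open-pr"; label: "auto-fix-pr" }
  | { kind: "needs-human"; label: "needs-human"; comment: string };

// Only a confident fix that actually touched files becomes a PR.
// Everything else is handed to a human along with the diagnosis.
function decide(result: FixResult): Outcome {
  if (result.confidence === "high" && result.changedFiles.length > 0) {
    return { kind: "open-pr", label: "auto-fix-pr" };
  }
  return { kind: "needs-human", label: "needs-human", comment: result.diagnosis };
}
```

Keeping the decision separate from the GitHub and Linear calls makes the gate trivial to unit test.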

PRs Welcome

github.com/AI-Batteries-Included/error-agent

MIT License

Appendix

The Container

FROM node:24-slim
USER node

# In production (Fargate):
# read-only filesystem
# --cap-drop=ALL
# no privilege escalation
# secrets in AWS Secrets Manager

Environment allowlist:
  ✓ ANTHROPIC_API_KEY, HOME, PATH, GIT_*
  ✗ GITHUB_TOKEN
  ✗ LINEAR_API_KEY
  ✗ AWS_SECRET_ACCESS_KEY

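Everything runs inside this. The GitHub, Linear, and AWS credentials never reach Claude's environment, so even if Claude gets prompt-injected, it can't exfiltrate them. The one secret the subprocess can see is ANTHROPIC_API_KEY.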
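
A sketch of what the sanitizeEnv allowlist could look like, using the names listed above and treating GIT_* as a prefix match. The real implementation may differ; the constant names are hypothetical:

```typescript
// Exact names and prefixes taken from the allowlist on this slide.
const ALLOWED_NAMES = new Set(["ANTHROPIC_API_KEY", "HOME", "PATH"]);
const ALLOWED_PREFIXES = ["GIT_"];

// Copy only allowlisted variables; everything else is dropped by default.
function sanitizeEnv(env: Record<string, string | undefined>): Record<string, string> {
  const clean: Record<string, string> = {};
  for (const [name, value] of Object.entries(env)) {
    if (value === undefined) continue;
    const allowed =
      ALLOWED_NAMES.has(name) || ALLOWED_PREFIXES.some((p) => name.startsWith(p));
    if (allowed) clean[name] = value;
  }
  return clean;
}
```

An allowlist fails closed: a secret added to the container later stays hidden from Claude unless someone deliberately adds it here. The underscore in the prefix matters, since it is what keeps GITHUB_TOKEN from matching GIT_*.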

Pluggable Rules

const defaultRules = {
  duplicate: {
    description: "Same underlying issue as another ticket",
    comment: "Closing as duplicate.\n\nReason: {{reason}}",
    label: "duplicate",
    close: true,
  },
  infra: {
    description: "Infrastructure issue, not fixable in code",
    comment: "Classified as infra issue.\n\nReason: {{reason}}",
    label: "infra",
  },
  notabug: {
    description: "Not actionable",
    comment: "Classified as not actionable.\n\nReason: {{reason}}",
    label: "notabug",
  },
};

Duplicates get closed. Infra and notabug tickets get labeled and stay open. Fix tickets enter the pipeline.
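
A sketch of how one of these rules might be applied to a classified ticket, assuming {{reason}} is the only placeholder. The Rule type, renderComment, and planActions are hypothetical names:

```typescript
interface Rule {
  description: string;
  comment: string; // may contain {{reason}}
  label: string;
  close?: boolean; // absent means "leave the ticket open"
}

// Fill in {{reason}}. A replacer function is used so that "$" sequences
// in the model-written reason are inserted literally.
function renderComment(template: string, reason: string): string {
  return template.replace(/\{\{reason\}\}/g, () => reason);
}

// Turn a rule plus a classification reason into the ticket actions to perform.
function planActions(
  rule: Rule,
  reason: string,
): { comment: string; label: string; close: boolean } {
  return {
    comment: renderComment(rule.comment, reason),
    label: rule.label,
    close: rule.close === true,
  };
}
```

Returning a plan instead of calling the ticket API directly keeps the rule logic testable without network access.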

Provider Pattern

type TicketProvider = {
  fetchOpenTickets: () => Promise<Ticket[]>
  postComment: (id: string, comment: string) => Promise<void>
  addLabel: (id: string, label: string) => Promise<void>
  closeTicket: (id: string) => Promise<void>
}

type SourceControlProvider = {
  clone: (dest: string) => Promise<void>
  createBranch: (dest: string, name: string) => Promise<void>
  commitAndPush: (dest: string, msg: string, branch: string) => Promise<void>
  openPR: (opts: { title: string; body: string; branch: string; base: string }) => Promise<string>
}

Linear today. Jira tomorrow. GitHub today. GitLab tomorrow.
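
To illustrate why the pattern helps, here is a sketch of an in-memory TicketProvider that could stand in for Linear in tests or dry runs. The Ticket shape and createMemoryProvider are assumptions; the TicketProvider type is repeated so the block is self-contained:

```typescript
interface Ticket { id: string; title: string; open: boolean } // assumed shape

type TicketProvider = {
  fetchOpenTickets: () => Promise<Ticket[]>;
  postComment: (id: string, comment: string) => Promise<void>;
  addLabel: (id: string, label: string) => Promise<void>;
  closeTicket: (id: string) => Promise<void>;
};

// Hypothetical in-memory provider: same interface, no network.
function createMemoryProvider(seed: Ticket[]) {
  const tickets = new Map<string, Ticket>();
  for (const t of seed) tickets.set(t.id, { ...t }); // copy so the seed is not mutated
  const comments = new Map<string, string[]>();
  const labels = new Map<string, string[]>();

  const provider: TicketProvider = {
    fetchOpenTickets: async () => Array.from(tickets.values()).filter((t) => t.open),
    postComment: async (id, comment) => {
      comments.set(id, (comments.get(id) ?? []).concat(comment));
    },
    addLabel: async (id, label) => {
      labels.set(id, (labels.get(id) ?? []).concat(label));
    },
    closeTicket: async (id) => {
      const ticket = tickets.get(id);
      if (ticket) ticket.open = false;
    },
  };
  // comments and labels are exposed so a test can inspect what the agent did.
  return { provider, comments, labels };
}
```

Any code written against TicketProvider runs unchanged against this, a Linear implementation, or a future Jira one.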

The Full Loop

Errors hit production

↓

Monitoring creates Linear tickets

↓

error-agent classifies + fixes

↓

PRs land, tickets close

↓

New errors → new tickets → agent runs again

What I Learned

Classification is the highest-leverage step
Confidence gating prevents bad PRs
The hard part is the plumbing, not the AI