Claude Code Bug Triage: From Stack Trace to Root Cause

Most Claude Code bug triage starts the same way: a one-line report in Slack, a screenshot, and a stack trace that points at line 412 of a file nobody has touched in eight months. The slow part was never the fix. It was reconstructing what actually happened: finding the failing path, reproducing it locally, and proving you understood the cause before you changed a line. That reconstruction is exactly the work Claude Code can take off your plate, because it runs inside your repository and can verify its own theory against your real code.

A senior engineer on a 40-person platform team described the old loop to me: a P2 bug would sit in the backlog for two days, not because it was hard, but because nobody had a free hour to page in the context. The triage tax — the cost of remembering how a subsystem works — was higher than the fix tax. Claude Code flips that ratio. It pages in the context for you, then hands you a reproduction and a root-cause hypothesis you can accept or reject in minutes.

Why Triage Is the Expensive Part

Engineers underestimate how much of debugging is archaeology. You read the trace, open three files, scroll a git blame, and try to hold the data flow in your head. By the time you understand the bug, you’ve spent forty minutes and written zero lines of code.

The friction is context loading, not problem solving. A bug in an auth-token refresh path requires you to remember how tokens are minted, where they’re cached, and which middleware reads them. If you wrote that code last quarter, the recall is cheap. If a teammate wrote it, you’re starting cold.

Claude Code is good at exactly this archaeology because it reads the whole call path at once. Point it at the failing line and it traces the data backward through the functions that produced the bad state — the same work you’d do by hand, minus the scrolling. For the broader picture of where this fits in a team’s workflow, the Claude Code for engineering teams hub covers how triage sits alongside review and onboarding.

A Runnable Triage Session

Here’s a concrete walkthrough. Imagine a Node service throwing in production with this trace, captured from your error tracker:

TypeError: Cannot read properties of undefined (reading 'id')
    at resolveTenant (src/middleware/tenant.ts:48:31)
    at processRequest (src/server/handler.ts:112:20)
    at /app/node_modules/express/lib/router/layer.js:95:5

Start Claude Code in the repo root and give it the trace plus the one fact you know — the request that failed was a webhook callback, not a normal user request:

claude

Then, in the session, frame the problem the way you’d brief a teammate:

A webhook callback is throwing this in production:

TypeError: Cannot read properties of undefined (reading 'id')
    at resolveTenant (src/middleware/tenant.ts:48:31)
    at processRequest (src/server/handler.ts:112:20)

Read tenant.ts and handler.ts, explain what state has to be
undefined for line 48 to throw, then write a failing test in
the existing Vitest setup that reproduces it. Don't fix it yet.

Asking for the failing test first — before any fix — is the move that makes the session trustworthy. Claude reads the two files, finds that resolveTenant expects req.session.tenant to be populated by an upstream middleware that webhook routes skip, and writes a reproduction:

import { describe, it, expect } from 'vitest';
import { resolveTenant } from '../src/middleware/tenant';
import { makeReq, makeRes } from './helpers';

describe('resolveTenant on webhook routes', () => {
  it('does not throw when the session middleware was skipped', () => {
    // Webhook routes skip the middleware that populates req.session.tenant.
    const req = makeReq({ path: '/webhooks/stripe' });
    const res = makeRes();
    const next = () => {};
    // Asserts the desired behavior, so it fails today with:
    //   TypeError: Cannot read properties of undefined (reading 'id')
    expect(() => resolveTenant(req, res, next)).not.toThrow();
  });
});

Run it and watch it fail for the real reason:

npx vitest run tests/tenant.test.ts

Now you have something a stack trace alone never gives you: a deterministic reproduction you can put in front of a teammate. The root cause is no longer a guess. Webhook routes bypass the session middleware, so req.session.tenant is never populated, and resolveTenant reads its id without a guard.
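To make the failure mechanism concrete, here is a minimal sketch of what the code around line 48 might look like. The article does not show tenant.ts, so the types, the function name resolveTenantUnguarded, and its body are all assumptions, chosen only because they produce the same TypeError. The "reading 'id'" in the message means the object that should hold id, the tenant, was undefined.

```typescript
// Hypothetical shapes; the real tenant.ts is not shown in this article.
interface Tenant { id: string }
interface Session { tenant?: Tenant }
interface Req { path: string; session: Session; tenantId?: string }

// Assumed form of the buggy middleware: it trusts that an upstream
// middleware has already populated req.session.tenant.
function resolveTenantUnguarded(req: Req, next: () => void): void {
  // The non-null assertion silences the compiler but not the runtime.
  req.tenantId = req.session.tenant!.id;
  next();
}

// A webhook request reaches the middleware with no tenant on the session.
const webhookReq: Req = { path: '/webhooks/stripe', session: {} };

let thrown: unknown;
try {
  resolveTenantUnguarded(webhookReq, () => {});
} catch (err) {
  thrown = err; // a TypeError from reading .id on undefined
}
```

The sketch shows why the compiler did not catch the bug: an assertion or a loose type tells TypeScript the tenant is always there, and only the webhook path proves otherwise at runtime.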

Only now do you ask for the fix:

Good. Now fix resolveTenant so webhook routes resolve the tenant
from the signed webhook payload instead of the session, and make
the test assert the resolved tenant id. Keep the user-request path
unchanged.

Because the failing test already exists, you’ll know the fix worked the moment the test turns green — no manual poking required. This is the same write-validate-self-correct loop that makes Claude Code useful for production work generally, applied to a bug instead of a feature.
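As an illustration of what the requested fix could look like, here is a minimal sketch. Everything specific in it is an assumption rather than the article's actual code: the Req shape, the rawBody and signature fields, the tenantId field in the payload, the hex HMAC-SHA256 signing scheme, and the WEBHOOK_SECRET placeholder. Real providers define their own signature formats, so treat the verification step as a stand-in.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Hypothetical request shape; the real one lives in the service's own types.
interface Tenant { id: string }
interface Req {
  path: string;
  session?: { tenant?: Tenant };
  rawBody?: string;   // raw body exactly as received (assumed field)
  signature?: string; // hex HMAC-SHA256 of rawBody (assumed scheme)
  tenantId?: string;
}

// Placeholder secret for the sketch; load it from configuration in real code.
const WEBHOOK_SECRET = 'test-secret';

function verifySignature(rawBody: string, signature: string, secret: string): boolean {
  const expected = createHmac('sha256', secret).update(rawBody).digest();
  const received = Buffer.from(signature, 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

function resolveTenant(req: Req, next: (err?: Error) => void): void {
  if (req.path.startsWith('/webhooks/')) {
    // Webhook path: never touch the session; trust only a verified payload.
    if (!req.rawBody || !req.signature ||
        !verifySignature(req.rawBody, req.signature, WEBHOOK_SECRET)) {
      return next(new Error('invalid webhook signature'));
    }
    const payload = JSON.parse(req.rawBody) as { tenantId?: string };
    if (!payload.tenantId) {
      return next(new Error('webhook payload has no tenant'));
    }
    req.tenantId = payload.tenantId;
    return next();
  }
  // User-request path: same session lookup as before the fix.
  req.tenantId = req.session!.tenant!.id;
  next();
}
```

The webhook branch reports a missing or unverifiable tenant as an error instead of falling through, so a request cannot run without a tenant just because the test went green.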

Give Claude the Inputs It Can’t Guess

The quality of a triage session tracks the quality of what you hand over. A stack trace is the floor, not the ceiling. The more of the real failure conditions you provide, the faster Claude stops guessing and starts confirming.

Three inputs matter most. The first is the failing request or payload — the actual JSON body, headers, or arguments that triggered the error, scrubbed of secrets. The second is the surrounding log lines, which often reveal the bad state two steps before the throw. The third is the version: “this started after the 2.14 deploy” lets Claude scope a git log to the commits that could have introduced it.
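Scrubbing secrets by hand is easy to get wrong, so a small helper can do a first pass before you paste a payload into a session. The sketch below is one possible approach, not a complete redactor: the SENSITIVE_KEY list and the redact function are hypothetical names, and the matching is by key name only, so a secret sitting in an innocently named field would still slip through.

```typescript
// Key names treated as sensitive. A hypothetical starting list; extend it
// for the headers and fields your own services use.
const SENSITIVE_KEY = /authorization|cookie|token|secret|password|signature|api[-_]?key/i;

type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Returns a deep copy of the input with values under sensitive keys replaced.
// The structure is preserved so the payload still reproduces the bug.
function redact(value: Json): Json {
  if (Array.isArray(value)) {
    return value.map((item) => redact(item));
  }
  if (value !== null && typeof value === 'object') {
    const out: { [key: string]: Json } = {};
    for (const [key, child] of Object.entries(value)) {
      out[key] = SENSITIVE_KEY.test(key) ? '[REDACTED]' : redact(child);
    }
    return out;
  }
  return value; // primitives pass through unchanged
}

// Example: a captured webhook request, ready to paste once redacted.
const pasted = redact({
  headers: { authorization: 'Bearer abc123', 'content-type': 'application/json' },
  body: { tenantId: 't_42', card: { token: 'tok_1', last4: '4242' } },
});
```

Keeping the shape intact matters more than hiding every value: the reproduction depends on which fields exist, not on what the secrets were.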

A payments team I worked with kept losing time because triage sessions started from the trace alone. Once they pasted the redacted webhook body alongside it, Claude could construct the exact failing input on the first try instead of inferring a plausible one. The reproduction went from “close enough” to “identical.” For how to encode this kind of standing context so you don’t repeat it every session, Anthropic’s Claude Code memory documentation describes the persistent-context pattern.

Make Triage a Repeatable Command

If you triage bugs the same way every week, stop retyping the prompt. Claude Code supports custom slash commands stored in your repo, so you can capture the whole reproduce-then-diagnose flow once and run it on demand.

Create a .claude/commands/triage.md file with the steps you always follow:

Triage the bug described below.

1. Read every file named in the stack trace.
2. Explain in 3-4 sentences what state must be wrong for the
   error to occur. Cite specific lines.
3. Write a failing test in our Vitest setup that reproduces it.
   Do not write a fix yet.
4. Run the test and confirm it fails for the stated reason.
5. Wait for my approval before proposing a fix.

Bug:
$ARGUMENTS

Now any engineer can run /triage and paste a report, and the session follows your team’s standard. The custom slash commands documentation covers the $ARGUMENTS placeholder and where command files live. This turns triage from tribal knowledge into a checked-in routine that a new hire can run on day one.
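Step 1 of the command, reading every file named in the stack trace, is worth picturing concretely, because not every frame deserves attention. The sketch below is an illustration of that filtering, not a description of how Claude Code works internally. It assumes V8-style frames like the ones in the trace earlier in this article, and the names parseFrames and appFiles are hypothetical.

```typescript
interface Frame {
  fn: string | null; // function name, or null for anonymous frames
  file: string;
  line: number;
  column: number;
}

// Matches "at fn (file:line:col)" and "at file:line:col".
const FRAME = /^\s*at (?:(.+?) \()?(.+?):(\d+):(\d+)\)?$/;

function parseFrames(trace: string): Frame[] {
  const frames: Frame[] = [];
  for (const rawLine of trace.split('\n')) {
    const m = FRAME.exec(rawLine);
    if (!m) continue; // skips the error message line and anything unparseable
    frames.push({
      fn: m[1] ?? null,
      file: m[2],
      line: Number(m[3]),
      column: Number(m[4]),
    });
  }
  return frames;
}

// Files worth reading first: frames in your own code, in call order,
// de-duplicated, with dependency frames dropped.
function appFiles(trace: string): string[] {
  const files = parseFrames(trace)
    .map((frame) => frame.file)
    .filter((file) => !file.includes('node_modules'));
  return Array.from(new Set(files));
}

const exampleTrace = [
  "TypeError: Cannot read properties of undefined (reading 'id')",
  '    at resolveTenant (src/middleware/tenant.ts:48:31)',
  '    at processRequest (src/server/handler.ts:112:20)',
  '    at /app/node_modules/express/lib/router/layer.js:95:5',
].join('\n');

const filesToRead = appFiles(exampleTrace);
```

For the trace in this article, that leaves tenant.ts and handler.ts, the same two files the walkthrough prompt names by hand.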

The same instinct extends to enforcement. A hook that fires after Claude writes a file (Claude Code's PostToolUse event) can automatically execute the reproduction test, so the engineer never has to remember to verify; the loop closes itself. Anthropic’s Claude Code overview describes the agentic patterns that make this kind of self-checking practical.
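The decision such a hook has to make is small: did Claude just write a test file, and if so, which command should run? The sketch below covers only that decision. The payload shape (tool_name and tool_input.file_path) is an assumption about what the hook receives, and testCommandFor is a hypothetical helper, so check the field names against the Claude Code hooks documentation before relying on them.

```typescript
// Assumed payload shape; verify the field names against the hooks documentation.
interface HookPayload {
  tool_name?: string;
  tool_input?: { file_path?: string };
}

// Returns the command (as an argv array) to run for this payload,
// or null when the written file is not a test file.
function testCommandFor(payload: HookPayload): string[] | null {
  const file = payload.tool_input?.file_path;
  if (!file || !/\.test\.tsx?$/.test(file)) {
    return null;
  }
  // Run only the file that was just written, to keep the hook fast.
  return ['npx', 'vitest', 'run', file];
}

const command = testCommandFor({
  tool_name: 'Write',
  tool_input: { file_path: 'tests/tenant.test.ts' },
});
```

A wrapper script would parse the payload, call this function, and execute the returned command when it is not null. Keeping the decision in a pure function means the hook logic itself can be unit-tested.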

What Still Belongs to You

Claude Code reproduces and diagnoses well. It does not own the call about whether a fix is the right one. A guard clause that swallows the undefined makes the test pass and hides a deeper problem: webhook routes are silently running without a tenant. The model will happily write the guard if you ask for it. Deciding that the real fix is to resolve the tenant from the payload — that judgment is yours.

Treat the failing test and the root-cause explanation as the trustworthy output, and treat the fix as a proposal. Read the diagnosis closely enough that you could defend it in review. If Claude’s explanation of the bad state doesn’t fully add up, that’s a signal to dig, not to merge. The reproduction is what lets you tell the difference between a real understanding and a plausible story.

This is the same division of labor that runs through good AI-assisted engineering: hand off the mechanical reconstruction, keep the judgment. It mirrors how teams split work in Claude Code code review automation, where the tool does the first pass and humans own design decisions. It also pairs naturally with Claude Code PR description automation, since a triage session already produced the reproduction and root cause that a good PR description needs.

Frequently Asked Questions

Can Claude Code reproduce a bug from just a stack trace?

Often, yes. A stack trace gives Claude the failing call path, and from there it can locate the relevant code, infer the inputs that trigger the failure, and write a test that reproduces it. The more context you provide — request payloads, log lines, the affected version — the faster it converges on a faithful reproduction.

Should I let Claude Code fix the bug automatically?

Let it propose the fix and the test, but review both before you merge. The highest-value output of a Claude Code bug triage session is a failing test plus a clear root-cause explanation. The fix is the easy part once you trust the diagnosis, and a human still owns the merge decision.

How is this different from pasting a stack trace into a chat window?

Claude Code runs inside your repository. It can read the actual files in the call path, run your test suite, execute the reproduction, and confirm the fix turns the test green — all in one session. A chat window guesses from text; Claude Code verifies against your real codebase.

Where should a new engineer start with bug triage?

Start by running one real bug through the reproduce-first flow before reaching for a fix. Pairing it with a strong onboarding setup helps; the approach in Claude Code engineering team onboarding gets new hires productive on the codebase fast, which makes their first triage sessions land faster too.

Pick the oldest P2 in your backlog — the one that’s been sitting because nobody had the context to start. Open Claude Code in the repo, paste the trace, and ask for the failing test before anything else. If you want the guided version with the triage command, hook setup, and reproduction patterns built out end to end, the Claude Code course walks through the full workflow.