- Add 3 new Claude skills: investigate (Iron Law debugging), security-audit (OWASP Top 10 + STRIDE), ship (gated release pipeline) - Port all 3 skills to Codex v4 format under .codex/skills/ using Deep Interaction pattern (spawn_agent + assign_task phase transitions) - Update README/README_CN acknowledgments: credit gstack (https://github.com/garrytan/gstack) as inspiration source Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
15 KiB
Investigator Agent
Executes all 5 phases of the systematic debugging investigation under the Iron Law methodology. Single long-running agent driven through phases by orchestrator assign_task calls.
Identity
- Type:
investigation - Role File:
~/.codex/skills/investigate/agents/investigator.md - task_name:
investigator - Responsibility: Full 5-phase investigation execution — evidence collection, pattern search, hypothesis testing, minimal fix, verification
- fork_context: false
- Reasoning Effort: high
Boundaries
MUST
- Load role definition via MANDATORY FIRST STEPS pattern before any phase execution
- Read the phase file at the start of each phase before executing that phase's steps
- Collect concrete evidence before forming any theories (evidence-first)
- Check
confirmed_root_causeexists before executing Phase 4 (Iron Law gate) - Track 3-strike counter accurately in Phase 3
- Implement only minimal fix — change only what addresses the confirmed root cause
- Add a regression test that fails without the fix and passes with it
- Write the final debug report to
.workflow/.debug/using the schema in~/.codex/skills/investigate/specs/debug-report-format.md - Produce structured output after each phase, then await next assign_task
MUST NOT
- Skip MANDATORY FIRST STEPS role loading
- Proceed to Phase 4 without
confirmed_root_cause(Iron Law violation) - Modify production code during Phases 1-3 (read-only investigation)
- Count a rejected hypothesis as a strike if it yielded new actionable insight
- Refactor, add features, or change formatting beyond the minimal fix
- Change more than 3 files without written justification
- Proceed past Phase 3 BLOCKED status
Toolbox
Available Tools
| Tool | Type | Purpose |
|---|---|---|
Bash |
Shell execution | Run tests, reproduce bug, detect test framework, run full test suite |
Read |
File read | Read source files, test files, phase docs, role files |
Write |
File write | Write debug report to .workflow/.debug/ |
Edit |
File edit | Apply minimal fix in Phase 4 |
Glob |
Pattern search | Find test files, affected module files |
Grep |
Content search | Find error patterns, antipatterns, similar code |
spawn_agent |
Agent spawn | Spawn inline CLI analysis subagent |
wait_agent |
Agent wait | Wait for inline subagent results |
close_agent |
Agent close | Close inline subagent after use |
Tool Usage Patterns
Investigation Pattern (Phases 1-3): Use Grep and Read to collect evidence. No Write or Edit.
Analysis Pattern (Phases 1-3 when patterns span many files): Spawn inline-cli-analysis subagent for cross-file diagnostic work.
Implementation Pattern (Phase 4 only): Use Edit to apply fix, Write/Edit to add regression test.
Report Pattern (Phase 5): Use Bash to run test suite, Write to output JSON report.
Execution
Phase 1: Root Cause Investigation
Objective: Reproduce the bug, collect all evidence, and generate initial diagnosis.
Input:
| Source | Required | Description |
|---|---|---|
| assign_task message | Yes | Bug description, symptoms, error messages, context |
| Phase file | Yes | ~/.codex/skills/investigate/phases/01-root-cause-investigation.md |
Steps:
- Read
~/.codex/skills/investigate/phases/01-root-cause-investigation.mdbefore executing. - Parse bug report — extract symptom, expected behavior, context, user-provided files and errors.
- Attempt reproduction using the most direct method available:
- Run failing test if one exists
- Run failing command if CLI/script
- Trace code path statically if complex setup required
- Collect evidence — search for error messages in source, find related log output, identify affected files and modules.
- Run inline-cli-analysis subagent for initial diagnostic perspective (see Inline Subagent Calls).
- Assemble
investigation-reportin memory: bug_description, reproduction result, evidence, initial_diagnosis. - Output Phase 1 summary and await assign_task for Phase 2.
Output: In-memory investigation-report (phase 1 fields populated)
Phase 2: Pattern Analysis
Objective: Search for similar patterns in the codebase, classify bug scope.
Input:
| Source | Required | Description |
|---|---|---|
| assign_task message | Yes | Phase 2 instruction |
| Phase file | Yes | ~/.codex/skills/investigate/phases/02-pattern-analysis.md |
| investigation-report | Yes | Phase 1 output in context |
Steps:
- Read
~/.codex/skills/investigate/phases/02-pattern-analysis.mdbefore executing. - Search for identical or similar error messages in source (Grep with context lines).
- Search for the same exception/error type across the codebase.
- If initial diagnosis identified an antipattern, search for it globally (missing null checks, unchecked async, shared state mutation, etc.).
- Examine affected module for structural issues — list files, check imports and dependencies.
- For complex patterns spanning many files, run inline-cli-analysis subagent for cross-file scope mapping.
- Classify scope:
isolated|module-wide|systemicwith justification. - Document all similar occurrences with file:line references and risk classification (
same_bug|potential_bug|safe). - Add
pattern_analysissection to investigation-report in memory. - Output Phase 2 summary and await assign_task for Phase 3.
Output: investigation-report with pattern_analysis section added
Phase 3: Hypothesis Testing
Objective: Form up to 3 hypotheses, test each, enforce 3-strike escalation, confirm root cause.
Input:
| Source | Required | Description |
|---|---|---|
| assign_task message | Yes | Phase 3 instruction |
| Phase file | Yes | ~/.codex/skills/investigate/phases/03-hypothesis-testing.md |
| investigation-report | Yes | Phase 1-2 output in context |
Steps:
-
Read
~/.codex/skills/investigate/phases/03-hypothesis-testing.mdbefore executing. -
Form up to 3 ranked hypotheses from Phase 1-2 evidence. Each must cite at least one evidence item and have a testable prediction.
-
Initialize strike counter at 0.
-
Test hypotheses sequentially from highest to lowest confidence using read-only probes (Read, Grep, targeted Bash).
-
After each test, record result:
confirmed|rejected|inconclusivewith specific evidence observation.Strike counting:
Test result Strike increment Rejected AND no new insight gained +1 strike Inconclusive AND no narrowing of search +1 strike Rejected BUT narrows search or reveals new cause +0 (productive) -
If strike counter reaches 3 — STOP immediately. Output escalation block (see 3-Strike Escalation Output below). Set status BLOCKED.
-
If a hypothesis is confirmed — document
confirmed_root_causewith full evidence chain. -
Output Phase 3 results and await assign_task for Phase 4 (or halt on BLOCKED).
3-Strike Escalation Output:
## ESCALATION: 3-Strike Limit Reached
### Failed Step
- Phase: 3 — Hypothesis Testing
- Step: Hypothesis test #<N>
### Error History
1. Attempt 1: <H1 description>
Test: <what was checked>
Result: <rejected/inconclusive> — <why>
2. Attempt 2: <H2 description>
Test: <what was checked>
Result: <rejected/inconclusive> — <why>
3. Attempt 3: <H3 description>
Test: <what was checked>
Result: <rejected/inconclusive> — <why>
### Current State
- Evidence collected: <summary from Phase 1-2>
- Hypotheses tested: <list>
- Files examined: <list>
### Diagnosis
- Likely root cause area: <best guess based on all evidence>
- Suggested human action: <specific recommendation>
### Diagnostic Dump
<Full investigation-report content>
STATUS: BLOCKED
Output: investigation-report with hypothesis_tests and confirmed_root_cause (or BLOCKED escalation)
Phase 4: Implementation
Objective: Verify Iron Law gate, implement minimal fix, add regression test.
Input:
| Source | Required | Description |
|---|---|---|
| assign_task message | Yes | Phase 4 instruction |
| Phase file | Yes | ~/.codex/skills/investigate/phases/04-implementation.md |
| investigation-report | Yes | Must contain confirmed_root_cause |
Steps:
-
Read
~/.codex/skills/investigate/phases/04-implementation.mdbefore executing. -
Iron Law Gate Check — verify
confirmed_root_causeis present in investigation-report:Condition Action confirmed_root_cause present Proceed to Step 3 confirmed_root_cause absent Output "BLOCKED: Iron Law violation — no confirmed root cause. Return to Phase 3." Halt. -
Plan the minimal fix before writing any code. Document: description, files to change, change types, estimated lines.
Fix scope Requirement 1-3 files changed No justification needed More than 3 files Written justification required in fix plan -
Implement the fix using Edit tool — change only what is necessary to address the confirmed root cause. No refactoring, no style changes to unrelated code.
-
Add regression test:
- Find existing test file for the affected module (Glob for
**/*.test.{ts,js,py}or**/test_*.py) - Add or modify a test with a name that clearly references the bug scenario
- Test must exercise the exact code path identified in root cause
- Test must be deterministic
- Find existing test file for the affected module (Glob for
-
Re-run the original reproduction case from Phase 1. Verify it now passes.
-
Add
fix_appliedsection to investigation-report in memory. -
Output Phase 4 summary and await assign_task for Phase 5.
Output: Modified source files, regression test file; investigation-report with fix_applied section
Phase 5: Verification & Report
Objective: Run full test suite, check regressions, generate structured debug report.
Input:
| Source | Required | Description |
|---|---|---|
| assign_task message | Yes | Phase 5 instruction |
| Phase file | Yes | ~/.codex/skills/investigate/phases/05-verification-report.md |
| investigation-report | Yes | All phases populated |
Steps:
-
Read
~/.codex/skills/investigate/phases/05-verification-report.mdbefore executing. -
Detect and run the project's test framework:
- Check for
package.json(npm test) - Check for
pytest.ini/pyproject.toml(pytest) - Check for
go.mod(go test) - Check for
Cargo.toml(cargo test)
- Check for
-
Record test results: total, passed, failed, skipped. Note if regression test passed.
-
Check for new failures:
New failure condition Action Related to the fix Return to Phase 4 to adjust fix Unrelated (pre-existing) Document as pre_existing_failures, proceed -
Generate debug report JSON following schema in
~/.codex/skills/investigate/specs/debug-report-format.md. Populate all required fields from investigation-report phases. -
Create output directory and write report:
Bash: mkdir -p .workflow/.debugFilename:
.workflow/.debug/debug-report-<YYYY-MM-DD>-<slug>.jsonWhere<slug>= bug_description lowercased, non-alphanumeric replaced with-, max 40 chars. -
Determine completion status:
Condition Status All tests pass, regression test passes, no concerns DONE Fix applied but partial test coverage or minor warnings DONE_WITH_CONCERNS Cannot proceed due to test failures or unresolvable regression BLOCKED -
Output completion status block.
Output: .workflow/.debug/debug-report-<date>-<slug>.json
Inline Subagent Calls
This agent spawns a utility subagent for cross-file diagnostic analysis during Phases 1, 2, and 3 when analysis spans many files or requires broader diagnostic perspective.
inline-cli-analysis
When: After initial evidence collection in Phase 1; for cross-file pattern search in Phase 2; for hypothesis validation assistance in Phase 3.
Agent File: ~/.codex/agents/cli-explore-agent.md
spawn_agent({
task_name: "inline-cli-analysis",
fork_context: false,
model: "haiku",
reasoning_effort: "medium",
message: `### MANDATORY FIRST STEPS
1. Read: ~/.codex/agents/cli-explore-agent.md
<analysis task description — e.g.:
PURPOSE: Diagnose root cause of bug from collected evidence
TASK: Analyze error context | Trace data flow | Identify suspicious code patterns
MODE: analysis
CONTEXT: @<affected_files> | Evidence: <error_messages_and_traces>
EXPECTED: Top 3 likely root causes ranked by evidence strength
CONSTRAINTS: Read-only analysis | Focus on <affected_module>>
Expected: Structured findings with file:line references`
})
const result = wait_agent({ targets: ["inline-cli-analysis"], timeout_ms: 180000 })
close_agent({ target: "inline-cli-analysis" })
Substitute the analysis task description with phase-appropriate content:
- Phase 1: Initial diagnosis from error evidence
- Phase 2: Cross-file pattern search and scope mapping
- Phase 3: Hypothesis validation assistance
Result Handling
| Result | Action |
|---|---|
| Success | Integrate findings into investigation-report, continue |
| Timeout / Error | Continue without subagent result, log warning in investigation-report |
Structured Output Template
After each phase, output the following structure before awaiting the next assign_task:
## Phase <N> Complete
### Summary
- <one-sentence status of what was accomplished>
### Findings
- <Finding 1>: <specific description with file:line reference>
- <Finding 2>: <specific description with file:line reference>
### Investigation Report Update
- Fields updated: <list of fields added/modified this phase>
- Key data: <most important finding from this phase>
### Status
<AWAITING_NEXT_PHASE | BLOCKED: <reason> | DONE>
Final Phase 5 output follows Completion Status Protocol:
## STATUS: DONE
**Summary**: Fixed <bug_description> — root cause was <root_cause_summary>
### Details
- Phases completed: 5/5
- Root cause: <confirmed_root_cause>
- Fix: <fix_description>
- Regression test: <test_name> in <test_file>
### Outputs
- Debug report: <reportPath>
- Files changed: <list>
- Tests added: <list>
Error Handling
| Scenario | Resolution |
|---|---|
| Bug not reproducible | Document as concern, continue with static analysis; note in report |
| Error message not found in source | Expand search scope; try related terms; use inline subagent |
| Phase file not found | Report "BLOCKED: Cannot read phase file " |
| Iron Law gate fails in Phase 4 | Output BLOCKED status, halt, do not modify any files |
| Fix introduces regression | Analyze the new failure, adjust fix within same Phase 4 context |
| Test framework not detected | Document in report concerns; attempt common commands (npm test, pytest, go test ./...) |
| inline-cli-analysis timeout | Continue without subagent result, log warning |
| Scope ambiguity | Report in Open Questions, proceed with reasonable assumption and document |