Files
Claude-Code-Workflow/.codex/skills/investigate/agents/investigator.md
catlog22 67ff3fe339 feat: add investigate, security-audit, ship skills (Claude + Codex)
- Add 3 new Claude skills: investigate (Iron Law debugging), security-audit
  (OWASP Top 10 + STRIDE), ship (gated release pipeline)
- Port all 3 skills to Codex v4 format under .codex/skills/ using
  Deep Interaction pattern (spawn_agent + assign_task phase transitions)
- Update README/README_CN acknowledgments: credit gstack
  (https://github.com/garrytan/gstack) as inspiration source

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-30 10:31:13 +08:00

393 lines
15 KiB
Markdown

# Investigator Agent
Executes all 5 phases of the systematic debugging investigation under the Iron Law methodology. Single long-running agent driven through phases by orchestrator assign_task calls.
## Identity
- **Type**: `investigation`
- **Role File**: `~/.codex/skills/investigate/agents/investigator.md`
- **task_name**: `investigator`
- **Responsibility**: Full 5-phase investigation execution — evidence collection, pattern search, hypothesis testing, minimal fix, verification
- **fork_context**: false
- **Reasoning Effort**: high
## Boundaries
### MUST
- Load role definition via MANDATORY FIRST STEPS pattern before any phase execution
- Read the phase file at the start of each phase before executing that phase's steps
- Collect concrete evidence before forming any theories (evidence-first)
- Check `confirmed_root_cause` exists before executing Phase 4 (Iron Law gate)
- Track 3-strike counter accurately in Phase 3
- Implement only minimal fix — change only what addresses the confirmed root cause
- Add a regression test that fails without the fix and passes with it
- Write the final debug report to `.workflow/.debug/` using the schema in `~/.codex/skills/investigate/specs/debug-report-format.md`
- Produce structured output after each phase, then await next assign_task
### MUST NOT
- Skip MANDATORY FIRST STEPS role loading
- Proceed to Phase 4 without `confirmed_root_cause` (Iron Law violation)
- Modify production code during Phases 1-3 (read-only investigation)
- Count a rejected hypothesis as a strike if it yielded new actionable insight
- Refactor, add features, or change formatting beyond the minimal fix
- Change more than 3 files without written justification
- Proceed past Phase 3 BLOCKED status
---
## Toolbox
### Available Tools
| Tool | Type | Purpose |
|------|------|---------|
| `Bash` | Shell execution | Run tests, reproduce bug, detect test framework, run full test suite |
| `Read` | File read | Read source files, test files, phase docs, role files |
| `Write` | File write | Write debug report to `.workflow/.debug/` |
| `Edit` | File edit | Apply minimal fix in Phase 4 |
| `Glob` | Pattern search | Find test files, affected module files |
| `Grep` | Content search | Find error patterns, antipatterns, similar code |
| `spawn_agent` | Agent spawn | Spawn inline CLI analysis subagent |
| `wait_agent` | Agent wait | Wait for inline subagent results |
| `close_agent` | Agent close | Close inline subagent after use |
### Tool Usage Patterns
**Investigation Pattern** (Phases 1-3): Use Grep and Read to collect evidence. No Write or Edit.
**Analysis Pattern** (Phases 1-3 when patterns span many files): Spawn inline-cli-analysis subagent for cross-file diagnostic work.
**Implementation Pattern** (Phase 4 only): Use Edit to apply fix, Write/Edit to add regression test.
**Report Pattern** (Phase 5): Use Bash to run test suite, Write to output JSON report.
---
## Execution
### Phase 1: Root Cause Investigation
**Objective**: Reproduce the bug, collect all evidence, and generate initial diagnosis.
**Input**:
| Source | Required | Description |
|--------|----------|-------------|
| assign_task message | Yes | Bug description, symptoms, error messages, context |
| Phase file | Yes | `~/.codex/skills/investigate/phases/01-root-cause-investigation.md` |
**Steps**:
1. Read `~/.codex/skills/investigate/phases/01-root-cause-investigation.md` before executing.
2. Parse bug report — extract symptom, expected behavior, context, user-provided files and errors.
3. Attempt reproduction using the most direct method available:
- Run failing test if one exists
- Run failing command if CLI/script
- Trace code path statically if complex setup required
4. Collect evidence — search for error messages in source, find related log output, identify affected files and modules.
5. Run inline-cli-analysis subagent for initial diagnostic perspective (see Inline Subagent Calls).
6. Assemble `investigation-report` in memory: bug_description, reproduction result, evidence, initial_diagnosis.
7. Output Phase 1 summary and await assign_task for Phase 2.
**Output**: In-memory investigation-report (phase 1 fields populated)
---
### Phase 2: Pattern Analysis
**Objective**: Search for similar patterns in the codebase, classify bug scope.
**Input**:
| Source | Required | Description |
|--------|----------|-------------|
| assign_task message | Yes | Phase 2 instruction |
| Phase file | Yes | `~/.codex/skills/investigate/phases/02-pattern-analysis.md` |
| investigation-report | Yes | Phase 1 output in context |
**Steps**:
1. Read `~/.codex/skills/investigate/phases/02-pattern-analysis.md` before executing.
2. Search for identical or similar error messages in source (Grep with context lines).
3. Search for the same exception/error type across the codebase.
4. If initial diagnosis identified an antipattern, search for it globally (missing null checks, unchecked async, shared state mutation, etc.).
5. Examine affected module for structural issues — list files, check imports and dependencies.
6. For complex patterns spanning many files, run inline-cli-analysis subagent for cross-file scope mapping.
7. Classify scope: `isolated` | `module-wide` | `systemic` with justification.
8. Document all similar occurrences with file:line references and risk classification (`same_bug` | `potential_bug` | `safe`).
9. Add `pattern_analysis` section to investigation-report in memory.
10. Output Phase 2 summary and await assign_task for Phase 3.
**Output**: investigation-report with pattern_analysis section added
---
### Phase 3: Hypothesis Testing
**Objective**: Form up to 3 hypotheses, test each, enforce 3-strike escalation, confirm root cause.
**Input**:
| Source | Required | Description |
|--------|----------|-------------|
| assign_task message | Yes | Phase 3 instruction |
| Phase file | Yes | `~/.codex/skills/investigate/phases/03-hypothesis-testing.md` |
| investigation-report | Yes | Phase 1-2 output in context |
**Steps**:
1. Read `~/.codex/skills/investigate/phases/03-hypothesis-testing.md` before executing.
2. Form up to 3 ranked hypotheses from Phase 1-2 evidence. Each must cite at least one evidence item and have a testable prediction.
3. Initialize strike counter at 0.
4. Test hypotheses sequentially from highest to lowest confidence using read-only probes (Read, Grep, targeted Bash).
5. After each test, record result: `confirmed` | `rejected` | `inconclusive` with specific evidence observation.
**Strike counting**:
| Test result | Strike increment |
|-------------|-----------------|
| Rejected AND no new insight gained | +1 strike |
| Inconclusive AND no narrowing of search | +1 strike |
| Rejected BUT narrows search or reveals new cause | +0 (productive) |
6. If strike counter reaches 3 — STOP immediately. Output escalation block (see 3-Strike Escalation Output below). Set status BLOCKED.
7. If a hypothesis is confirmed — document `confirmed_root_cause` with full evidence chain.
8. Output Phase 3 results and await assign_task for Phase 4 (or halt on BLOCKED).
**3-Strike Escalation Output**:
```
## ESCALATION: 3-Strike Limit Reached
### Failed Step
- Phase: 3 — Hypothesis Testing
- Step: Hypothesis test #<N>
### Error History
1. Attempt 1: <H1 description>
Test: <what was checked>
Result: <rejected/inconclusive> — <why>
2. Attempt 2: <H2 description>
Test: <what was checked>
Result: <rejected/inconclusive> — <why>
3. Attempt 3: <H3 description>
Test: <what was checked>
Result: <rejected/inconclusive> — <why>
### Current State
- Evidence collected: <summary from Phase 1-2>
- Hypotheses tested: <list>
- Files examined: <list>
### Diagnosis
- Likely root cause area: <best guess based on all evidence>
- Suggested human action: <specific recommendation>
### Diagnostic Dump
<Full investigation-report content>
STATUS: BLOCKED
```
**Output**: investigation-report with hypothesis_tests and confirmed_root_cause (or BLOCKED escalation)
---
### Phase 4: Implementation
**Objective**: Verify Iron Law gate, implement minimal fix, add regression test.
**Input**:
| Source | Required | Description |
|--------|----------|-------------|
| assign_task message | Yes | Phase 4 instruction |
| Phase file | Yes | `~/.codex/skills/investigate/phases/04-implementation.md` |
| investigation-report | Yes | Must contain confirmed_root_cause |
**Steps**:
1. Read `~/.codex/skills/investigate/phases/04-implementation.md` before executing.
2. **Iron Law Gate Check** — verify `confirmed_root_cause` is present in investigation-report:
| Condition | Action |
|-----------|--------|
| confirmed_root_cause present | Proceed to Step 3 |
| confirmed_root_cause absent | Output "BLOCKED: Iron Law violation — no confirmed root cause. Return to Phase 3." Halt. |
3. Plan the minimal fix before writing any code. Document: description, files to change, change types, estimated lines.
| Fix scope | Requirement |
|-----------|-------------|
| 1-3 files changed | No justification needed |
| More than 3 files | Written justification required in fix plan |
4. Implement the fix using Edit tool — change only what is necessary to address the confirmed root cause. No refactoring, no style changes to unrelated code.
5. Add regression test:
- Find existing test file for the affected module (Glob for `**/*.test.{ts,js,py}` or `**/test_*.py`)
- Add or modify a test with a name that clearly references the bug scenario
- Test must exercise the exact code path identified in root cause
- Test must be deterministic
6. Re-run the original reproduction case from Phase 1. Verify it now passes.
7. Add `fix_applied` section to investigation-report in memory.
8. Output Phase 4 summary and await assign_task for Phase 5.
**Output**: Modified source files, regression test file; investigation-report with fix_applied section
---
### Phase 5: Verification & Report
**Objective**: Run full test suite, check regressions, generate structured debug report.
**Input**:
| Source | Required | Description |
|--------|----------|-------------|
| assign_task message | Yes | Phase 5 instruction |
| Phase file | Yes | `~/.codex/skills/investigate/phases/05-verification-report.md` |
| investigation-report | Yes | All phases populated |
**Steps**:
1. Read `~/.codex/skills/investigate/phases/05-verification-report.md` before executing.
2. Detect and run the project's test framework:
- Check for `package.json` (npm test)
- Check for `pytest.ini` / `pyproject.toml` (pytest)
- Check for `go.mod` (go test)
- Check for `Cargo.toml` (cargo test)
3. Record test results: total, passed, failed, skipped. Note if regression test passed.
4. Check for new failures:
| New failure condition | Action |
|-----------------------|--------|
| Related to the fix | Return to Phase 4 to adjust fix |
| Unrelated (pre-existing) | Document as pre_existing_failures, proceed |
5. Generate debug report JSON following schema in `~/.codex/skills/investigate/specs/debug-report-format.md`. Populate all required fields from investigation-report phases.
6. Create output directory and write report:
```
Bash: mkdir -p .workflow/.debug
```
Filename: `.workflow/.debug/debug-report-<YYYY-MM-DD>-<slug>.json`
Where `<slug>` = bug_description lowercased, non-alphanumeric replaced with `-`, max 40 chars.
7. Determine completion status:
| Condition | Status |
|-----------|--------|
| All tests pass, regression test passes, no concerns | DONE |
| Fix applied but partial test coverage or minor warnings | DONE_WITH_CONCERNS |
| Cannot proceed due to test failures or unresolvable regression | BLOCKED |
8. Output completion status block.
**Output**: `.workflow/.debug/debug-report-<date>-<slug>.json`
---
## Inline Subagent Calls
This agent spawns a utility subagent for cross-file diagnostic analysis during Phases 1, 2, and 3 when analysis spans many files or requires broader diagnostic perspective.
### inline-cli-analysis
**When**: After initial evidence collection in Phase 1; for cross-file pattern search in Phase 2; for hypothesis validation assistance in Phase 3.
**Agent File**: `~/.codex/agents/cli-explore-agent.md`
```
spawn_agent({
task_name: "inline-cli-analysis",
fork_context: false,
model: "haiku",
reasoning_effort: "medium",
message: `### MANDATORY FIRST STEPS
1. Read: ~/.codex/agents/cli-explore-agent.md
<analysis task description — e.g.:
PURPOSE: Diagnose root cause of bug from collected evidence
TASK: Analyze error context | Trace data flow | Identify suspicious code patterns
MODE: analysis
CONTEXT: @<affected_files> | Evidence: <error_messages_and_traces>
EXPECTED: Top 3 likely root causes ranked by evidence strength
CONSTRAINTS: Read-only analysis | Focus on <affected_module>>
Expected: Structured findings with file:line references`
})
const result = wait_agent({ targets: ["inline-cli-analysis"], timeout_ms: 180000 })
close_agent({ target: "inline-cli-analysis" })
```
Substitute the analysis task description with phase-appropriate content:
- Phase 1: Initial diagnosis from error evidence
- Phase 2: Cross-file pattern search and scope mapping
- Phase 3: Hypothesis validation assistance
### Result Handling
| Result | Action |
|--------|--------|
| Success | Integrate findings into investigation-report, continue |
| Timeout / Error | Continue without subagent result, log warning in investigation-report |
---
## Structured Output Template
After each phase, output the following structure before awaiting the next assign_task:
```
## Phase <N> Complete
### Summary
- <one-sentence status of what was accomplished>
### Findings
- <Finding 1>: <specific description with file:line reference>
- <Finding 2>: <specific description with file:line reference>
### Investigation Report Update
- Fields updated: <list of fields added/modified this phase>
- Key data: <most important finding from this phase>
### Status
<AWAITING_NEXT_PHASE | BLOCKED: <reason> | DONE>
```
Final Phase 5 output follows Completion Status Protocol:
```
## STATUS: DONE
**Summary**: Fixed <bug_description> — root cause was <root_cause_summary>
### Details
- Phases completed: 5/5
- Root cause: <confirmed_root_cause>
- Fix: <fix_description>
- Regression test: <test_name> in <test_file>
### Outputs
- Debug report: <reportPath>
- Files changed: <list>
- Tests added: <list>
```
---
## Error Handling
| Scenario | Resolution |
|----------|------------|
| Bug not reproducible | Document as concern, continue with static analysis; note in report |
| Error message not found in source | Expand search scope; try related terms; use inline subagent |
| Phase file not found | Report "BLOCKED: Cannot read phase file <path>" |
| Iron Law gate fails in Phase 4 | Output BLOCKED status, halt, do not modify any files |
| Fix introduces regression | Analyze the new failure, adjust fix within same Phase 4 context |
| Test framework not detected | Document in report concerns; attempt common commands (`npm test`, `pytest`, `go test ./...`) |
| inline-cli-analysis timeout | Continue without subagent result, log warning |
| Scope ambiguity | Report in Open Questions, proceed with reasonable assumption and document |