# Phase 3: Hypothesis Testing > **COMPACT PROTECTION**: This is a core execution phase. If context compression has occurred and this file is only a summary, **MUST `Read` this file again before executing any Step**. Do not execute from memory. Form hypotheses from evidence and test each one. Enforce the 3-strike escalation rule. ## Objective - Form a maximum of 3 hypotheses from Phase 1-2 evidence - Test each hypothesis with minimal, read-only probes - Confirm or reject each hypothesis with concrete evidence - Enforce 3-strike rule: STOP and escalate after 3 consecutive unproductive test failures ## Input | Source | Required | Description | |--------|----------|-------------| | investigation-report (phases 1-2) | Yes | Evidence, affected files, pattern analysis, initial suspects | | assign_task message | Yes | Phase 3 instruction | ## Execution Steps ### Step 1: Form Hypotheses Using evidence from Phase 1 (investigation report) and Phase 2 (pattern analysis), form up to 3 ranked hypotheses: **Hypothesis formation rules**: - Each hypothesis must cite at least one piece of evidence from Phase 1-2 - Each hypothesis must have a testable prediction - Rank by confidence (high first) - Maximum 3 hypotheses per investigation Assemble hypotheses in memory: ``` hypotheses = [ { id: "H1", description: "The root cause is because evidence ", evidence_supporting: ["", ""], predicted_behavior: "If H1 is correct, then we should observe ", test_method: "How to verify: read file line , check value ", confidence: "high|medium|low" } ] ``` Initialize strike counter: 0 --- ### Step 2: Test Hypotheses Sequentially Test each hypothesis starting from highest confidence (H1 first). Use read-only probes only during testing. **Allowed test methods**: | Method | Usage | |--------|-------| | Read a specific file | Check a specific value, condition, or code pattern | | Grep for a pattern | Confirm or deny the presence of a condition | | Bash targeted test | Run a specific test that reveals the condition | | Temporary log statement | Add a log to observe runtime behavior; MUST revert after | **Prohibited during hypothesis testing**: - Modifying production code (save for Phase 4) - Changing multiple things at once - Running the full test suite (targeted checks only) --- ### Step 3: Record Test Results For each hypothesis test, record: ``` hypothesis_test = { id: "H1", test_performed: "", result: "confirmed|rejected|inconclusive", evidence: "", files_checked: [""] } ``` --- ### Step 4: 3-Strike Escalation Rule Track consecutive unproductive test failures. After each hypothesis test, evaluate: **Strike evaluation**: | Test result | New insight gained | Strike action | |-------------|-------------------|---------------| | confirmed | — | CONFIRM root cause, end testing | | rejected | Yes — narrows search or reveals new cause | No strike (productive rejection) | | rejected | No — no actionable insight | +1 strike | | inconclusive | Yes — identifies new area | No strike (productive) | | inconclusive | No — no narrowing | +1 strike | **Strike counter tracking**: | Strike count | Action | |--------------|--------| | 1 | Continue to next hypothesis | | 2 | Continue to next hypothesis | | 3 | STOP — output escalation block immediately | **On 3rd Strike — output this escalation block verbatim and halt**: ``` ## ESCALATION: 3-Strike Limit Reached ### Failed Step - Phase: 3 — Hypothesis Testing - Step: Hypothesis test # ### Error History 1. Attempt 1:

Test: Result: — 2. Attempt 2:
Test: Result: — 3. Attempt 3:
Test: Result: — ### Current State - Evidence collected:
- Hypotheses tested: - Files examined: ### Diagnosis - Likely root cause area: - Suggested human action: ### Diagnostic Dump STATUS: BLOCKED ``` After outputting escalation: set status BLOCKED. Do not proceed to Phase 4. --- ### Step 5: Confirm Root Cause If a hypothesis is confirmed, document the confirmed root cause: ``` confirmed_root_cause = { hypothesis_id: "H1", description: "", evidence_chain: [ "Phase 1: ", "Phase 2: ", "Phase 3: H1 confirmed — " ], affected_code: { file: "", line_range: "<42-55>", function: "" } } ``` Add `hypothesis_tests` and `confirmed_root_cause` to investigation-report in memory. Output Phase 3 results and await assign_task for Phase 4. --- ## Output | Artifact | Format | Description | |----------|--------|-------------| | investigation-report (phase 3) | In-memory JSON | Phases 1-2 fields + hypothesis_tests + confirmed_root_cause | | Phase 3 summary or escalation block | Structured text output | Either confirmed root cause or BLOCKED escalation | ## Success Criteria | Criterion | Validation Method | |-----------|-------------------| | Maximum 3 hypotheses formed | Count of hypotheses array | | Each hypothesis cites evidence | evidence_supporting non-empty for each | | Each hypothesis tested with documented probe | test_performed field populated for each | | Strike counter maintained correctly | Count of unproductive consecutive failures | | Root cause confirmed with evidence chain OR escalation triggered | confirmed_root_cause present OR BLOCKED output | ## Error Handling | Scenario | Resolution | |----------|------------| | Evidence insufficient to form 3 hypotheses | Form as many as evidence supports (minimum 1), proceed | | Partial insight from rejected hypothesis | Do not count as strike; re-form or refine remaining hypotheses with new insight | | All 3 hypotheses confirmed simultaneously | Use highest-confidence confirmed one as root cause | | Hypothesis test requires production change | Prohibited — use static analysis or targeted read-only probe instead | ## Gate for Phase 4 Phase 4 can ONLY proceed if `confirmed_root_cause` is present. This is the Iron Law gate. | Outcome | Next Step | |---------|-----------| | Root cause confirmed | -> [Phase 4: Implementation](04-implementation.md) | | 3-strike escalation triggered | STOP — output diagnostic dump — STATUS: BLOCKED | | Partial insight, re-forming hypotheses | Stay in Phase 3, re-test with refined hypotheses | ## Next Phase -> [Phase 4: Implementation](04-implementation.md) ONLY with confirmed root cause.