Claude-Code-Workflow/.codex/skills/team-lifecycle/phases/04-pipeline-coordination.md
catlog22 dd72e95e4d feat: add templates for epics, product brief, and requirements PRD
2026-02-27 13:27:27 +08:00


Phase 4: Pipeline Coordination

COMPACT PROTECTION: This is an execution document. After context compression, phase instructions become summaries only. You MUST immediately re-read this file via Read("~/.codex/skills/team-lifecycle/phases/04-pipeline-coordination.md") before continuing. Never execute based on summaries.

Objective

Execute the main spawn/wait/close coordination loop. This is the core phase where the orchestrator spawns agents for pipeline tasks, waits for results, processes outputs (including consensus severity routing), handles checkpoints, and advances the pipeline through fast-advance or sequential beats until all tasks complete.


Input

| Input | Source | Required |
|-------|--------|----------|
| sessionDir | Phase 2/3 output | Yes |
| state | team-session.json (current) | Yes |
| state.pipeline | Task chain from Phase 3 | Yes |
| state.mode | Pipeline mode | Yes |
| state.execution | sequential or parallel | Yes |

Constants

| Constant | Value | Description |
|----------|-------|-------------|
| SPEC_AGENT_TIMEOUT | 900000 ms (15 min) | Timeout for spec pipeline agents (analyst, writer, reviewer) |
| IMPL_AGENT_TIMEOUT | 1800000 ms (30 min) | Timeout for impl pipeline agents (executor, tester, planner) |
| CONVERGENCE_WAIT | 120000 ms (2 min) | Additional wait after sending convergence request |
| MAX_RETRIES_PER_TASK | 3 | Maximum retries for a failing task before escalation |
| MAX_GC_ROUNDS | 2 | Maximum QA-FE fix-retest iterations |
| ORPHAN_THRESHOLD | 300000 ms (5 min) | Time before orphaned in_progress task is reset |

Execution Steps

Step 4.1: Compute Ready Tasks

Read the state file and find all tasks that can be started.

// Read current state
const state = JSON.parse(Read("<session-dir>/team-session.json"))

// Compute sets
const completedIds = state.pipeline
  .filter(t => t.status === "completed")
  .map(t => t.id)

const inProgressIds = state.pipeline
  .filter(t => t.status === "in_progress")
  .map(t => t.id)

// Declared with let: Step 4.9 reassigns readyTasks on fast-advance
let readyTasks = state.pipeline.filter(t =>
  t.status === "pending"
  && t.blocked_by.every(dep => completedIds.includes(dep))
)

Decision table (what to do with ready tasks):

| Ready Count | In-Progress Count | Action |
|-------------|-------------------|--------|
| 0 | >0 | Wait for running agents (enter wait loop) |
| 0 | 0 | Pipeline complete -> proceed to Phase 5 |
| 1 | 0 | Spawn single agent, enter wait loop |
| 2+ | 0 | Spawn all ready agents (parallel or sequential per config), enter wait loop |
| 1+ | >0 | Spawn ready agents alongside running, enter wait loop |
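
As a minimal, self-contained sketch of the readiness rule in this step, the snippet below runs the same filter over a small sample pipeline. The `computeReadyTasks` wrapper and the sample task list are illustrative assumptions, not part of the session schema.

```javascript
// Hypothetical wrapper around the Step 4.1 filter; the name is an assumption.
function computeReadyTasks(pipeline) {
  const completedIds = new Set(
    pipeline.filter(t => t.status === "completed").map(t => t.id)
  )
  // A task is ready when it is pending and every dependency is completed.
  return pipeline.filter(t =>
    t.status === "pending" && t.blocked_by.every(dep => completedIds.has(dep))
  )
}

// Illustrative pipeline (impl-only shape): PLAN-001 done, IMPL-001 running.
const samplePipeline = [
  { id: "PLAN-001", status: "completed", blocked_by: [] },
  { id: "IMPL-001", status: "in_progress", blocked_by: ["PLAN-001"] },
  { id: "TEST-001", status: "pending", blocked_by: ["IMPL-001"] },
  { id: "REVIEW-001", status: "pending", blocked_by: ["IMPL-001"] },
]

// Nothing is ready yet: both pending tasks wait on IMPL-001, so the
// "0 ready, >0 in progress" row applies (enter the wait loop).
const readyNow = computeReadyTasks(samplePipeline).map(t => t.id)

// Once IMPL-001 completes, TEST-001 and REVIEW-001 become ready together.
samplePipeline[1].status = "completed"
const readyAfter = computeReadyTasks(samplePipeline).map(t => t.id)
```

Using a `Set` for completed IDs is only a convenience; the `includes` lookup in the step above behaves the same for pipelines of this size.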

Step 4.2: Checkpoint Check

Before spawning, check if a checkpoint was just reached.

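// Find the most recently completed task (by completion time, not pipeline order)
const justCompleted = state.pipeline
  .filter(t => t.status === "completed")
  .sort((a, b) => new Date(a.completed_at) - new Date(b.completed_at))
const lastCompleted = justCompleted[justCompleted.length - 1]

if (lastCompleted && lastCompleted.is_checkpoint_after
    && !state.checkpoints_hit.includes(lastCompleted.id)) {
  // Checkpoint reached: record it, pause, and persist before yielding
  state.checkpoints_hit.push(lastCompleted.id)
  state.status = "paused"
  state.updated_at = new Date().toISOString()
  Write("<session-dir>/team-session.json", JSON.stringify(state, null, 2))
  // Output checkpoint message (see table below) and yield
}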

Checkpoint behavior:

| Checkpoint Task | Message | Resume Condition |
|-----------------|---------|------------------|
| QUALITY-001 | "SPEC PHASE COMPLETE. Review spec artifacts before proceeding to implementation. Type 'resume' to continue." | User types 'resume' |
| DISCUSS-006 HIGH | "Final sign-off blocked with HIGH severity. Review divergences and decide. Type 'resume' to proceed or 'revise' to create revision." | User command |

When paused at checkpoint:

  1. Update state: status = "paused"
  2. Write state file
  3. Output checkpoint message
  4. Yield -- orchestrator stops, waits for user

When user resumes from checkpoint:

  1. Update state: status = "active"
  2. Proceed to spawn ready tasks
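
The pause and resume transitions listed above can be sketched as two small state helpers. This is a sketch under assumptions: the helper names are hypothetical, the state is a plain in-memory object, and persistence of the state file is deliberately left out.

```javascript
// Hypothetical helpers; names and the in-memory state shape are assumptions.
function pauseAtCheckpoint(state, taskId) {
  // Record the checkpoint once so it is not triggered again after resume.
  if (!state.checkpoints_hit.includes(taskId)) {
    state.checkpoints_hit.push(taskId)
  }
  state.status = "paused"
  return state
}

function resumeFromCheckpoint(state) {
  // Resuming only flips the status; checkpoints_hit keeps the history.
  if (state.status === "paused") state.status = "active"
  return state
}

const demoState = { status: "active", checkpoints_hit: [] }
pauseAtCheckpoint(demoState, "QUALITY-001")
const statusWhilePaused = demoState.status   // "paused"
resumeFromCheckpoint(demoState)
const statusAfterResume = demoState.status   // "active"
```

Keeping the checkpoint ID in `checkpoints_hit` after resume is what stops the Step 4.2 check from pausing again on the same task.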

Step 4.3: Spawn Agents

For each ready task, spawn an agent using the agent spawn template.

const spawnedAgents = []

for (const task of readyTasks) {
  // Determine timeout based on pipeline phase
  const timeout = isSpecTask(task.id) ? SPEC_AGENT_TIMEOUT : IMPL_AGENT_TIMEOUT

  // Build predecessor context
  const predecessorContext = task.blocked_by.map(depId => {
    const depTask = state.pipeline.find(t => t.id === depId)
    return `${depId}: ${depTask.artifact_path || "(no artifact)"}`
  }).join("\n")

  // Spawn agent
  const agentId = spawn_agent({
    message: `
## TASK ASSIGNMENT

### MANDATORY FIRST STEPS (Agent Execute)
1. **Read role definition**: ~/.codex/agents/${task.owner}.md (MUST read first)
2. Read session state: ${sessionDir}/team-session.json
3. Read wisdom files: ${sessionDir}/wisdom/*.md (if exists)

---

## Session
Session directory: ${sessionDir}
Task ID: ${task.id}
Pipeline mode: ${state.mode}

## Scope
${state.scope}

## Task
${task.description}

## InlineDiscuss
${task.inline_discuss || "none"}

## Dependencies (completed predecessors)
${predecessorContext || "(none - this is the first task)"}

## Constraints
- Only process ${task.id} (prefix: ${task.id.split('-')[0]}-*)
- All output prefixed with [${task.owner}] tag
- Write artifacts to ${sessionDir}/${getArtifactSubdir(task)}
- Read wisdom files for cross-task knowledge before starting
- After completion, append discoveries to wisdom files
- If InlineDiscuss is set, spawn discuss subagent after primary artifact creation
  (Read ~/.codex/agents/discuss-agent.md for protocol)
- If codebase exploration is needed, spawn explore subagent
  (Read ~/.codex/agents/explore-agent.md for protocol)

## Completion Protocol
When work is complete, output EXACTLY:

TASK_COMPLETE:
- task_id: ${task.id}
- status: <success | failed | partial>
- artifact: <path-to-primary-artifact>
- discuss_verdict: <consensus_reached | consensus_blocked | none>
- discuss_severity: <HIGH | MEDIUM | LOW | none>
- summary: <one-line summary>
`
  })

  // Record in state
  task.status = "in_progress"
  task.agent_id = agentId
  task.started_at = new Date().toISOString()

  spawnedAgents.push({ agentId, taskId: task.id, owner: task.owner, timeout })

  state.active_agents.push({
    agent_id: agentId,
    task_id: task.id,
    owner: task.owner,
    spawned_at: new Date().toISOString()
  })
}

// Update state file
state.updated_at = new Date().toISOString()
Write("<session-dir>/team-session.json", JSON.stringify(state, null, 2))

Agent-to-artifact-subdirectory mapping (getArtifactSubdir):

| Agent / Task Prefix | Subdirectory |
|---------------------|--------------|
| analyst / RESEARCH-* | spec/ |
| writer / DRAFT-* | spec/ |
| reviewer / QUALITY-* | spec/ |
| planner / PLAN-* | plan/ |
| executor / IMPL-* | (project root - code changes) |
| tester / TEST-* | qa/ |
| reviewer / REVIEW-* | qa/ |
| architect / ARCH-* | architecture/ |
| fe-developer / DEV-FE-* | (project root - code changes) |
| fe-qa / QA-FE-* | qa/ |
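
The spawn loop calls `getArtifactSubdir` and `isSpecTask` without defining them. A possible sketch, derived from the mapping above, follows. The function bodies are assumptions: in particular, the choice of RESEARCH-, DRAFT-, and QUALITY- as the spec prefixes is inferred from the agents named for SPEC_AGENT_TIMEOUT, and the empty string for project-root tasks is a placeholder.

```javascript
// Sketch only: helper names come from Step 4.3, the bodies are assumptions.
// Matching uses startsWith so that "DEV-FE-" and "QA-FE-" stay intact
// (task.id.split("-")[0] would yield only "DEV" or "QA").
const ARTIFACT_SUBDIRS = [
  ["RESEARCH-", "spec/"],
  ["DRAFT-", "spec/"],
  ["QUALITY-", "spec/"],
  ["PLAN-", "plan/"],
  ["IMPL-", ""],            // project root (code changes); "" is a placeholder
  ["TEST-", "qa/"],
  ["REVIEW-", "qa/"],
  ["ARCH-", "architecture/"],
  ["DEV-FE-", ""],          // project root (code changes); "" is a placeholder
  ["QA-FE-", "qa/"],
]

// Assumed spec-pipeline prefixes (analyst, writer, spec reviewer).
const SPEC_PREFIXES = ["RESEARCH-", "DRAFT-", "QUALITY-"]

function getArtifactSubdir(task) {
  const entry = ARTIFACT_SUBDIRS.find(([prefix]) => task.id.startsWith(prefix))
  if (!entry) throw new Error(`No artifact subdirectory for task ${task.id}`)
  return entry[1]
}

function isSpecTask(taskId) {
  return SPEC_PREFIXES.some(prefix => taskId.startsWith(prefix))
}

const qaSubdir = getArtifactSubdir({ id: "QA-FE-002" })   // "qa/"
const revisionIsSpec = isSpecTask("DRAFT-001-R1")         // true
const implIsSpec = isSpecTask("IMPL-001")                 // false
```

Prefix matching also covers revision tasks such as `DRAFT-001-R1`, which inherit the subdirectory and timeout class of the task they revise.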

Step 4.4: Wait Loop

Enter the main wait loop. Wait for all spawned agents, then process results.

// Collect all active agent IDs
const activeAgentIds = spawnedAgents.map(a => a.agentId)

// Determine timeout (use the maximum among spawned agents)
const maxTimeout = Math.max(...spawnedAgents.map(a => a.timeout))

// Wait for all agents
const results = wait({ ids: activeAgentIds, timeout_ms: maxTimeout })

Step 4.5: Timeout Handling

If any agent timed out, send it a convergence request, wait an additional CONVERGENCE_WAIT, and force-close any agent that still has not finished.

if (results.timed_out) {
  for (const agent of spawnedAgents) {
    if (!results.status[agent.agentId]?.completed) {
      // Send convergence request
      send_input({
        id: agent.agentId,
        message: `
## TIMEOUT NOTIFICATION

Execution timeout reached for task ${agent.taskId}. Please:
1. Save all current progress to artifact files
2. Output TASK_COMPLETE with status: partial
3. Include summary of completed vs remaining work
`
      })
    }
  }

  // Wait additional convergence period
  const convergenceResults = wait({
    ids: activeAgentIds.filter(id => !results.status[id]?.completed),
    timeout_ms: CONVERGENCE_WAIT
  })

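  // Merge results: copy late completions into results.status so that
  // Step 4.6 processes agents that finished during the convergence wait.
  // Agents still not complete after convergence -> force close
  for (const agent of spawnedAgents) {
    const lateResult = convergenceResults.status?.[agent.agentId]
    if (!results.status[agent.agentId]?.completed && lateResult?.completed) {
      results.status[agent.agentId] = lateResult
    }

    if (!results.status[agent.agentId]?.completed) {
      // Force close and mark as failed
      close_agent({ id: agent.agentId })
      handleTaskFailure(agent.taskId, "timeout after convergence request")
    }
  }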
}

Step 4.6: Process Agent Results

For each completed agent, extract TASK_COMPLETE data and update state.

for (const agent of spawnedAgents) {
  const output = results.status[agent.agentId]?.completed
  if (!output) continue  // handled in timeout section

  // Parse TASK_COMPLETE from output
  const taskResult = parseTaskComplete(output)

  if (!taskResult) {
    // Malformed output - treat as partial success
    handleMalformedOutput(agent.taskId, output)
    continue
  }

  // Update task in state
  const task = state.pipeline.find(t => t.id === agent.taskId)
  task.status = taskResult.status === "failed" ? "failed" : "completed"
  task.artifact_path = taskResult.artifact
  task.discuss_verdict = taskResult.discuss_verdict
  task.discuss_severity = taskResult.discuss_severity
  task.completed_at = new Date().toISOString()

  // Remove from active agents
  state.active_agents = state.active_agents.filter(
    a => a.agent_id !== agent.agentId
  )

  // Add to completed
  if (task.status === "completed") {
    state.completed_tasks.push(task.id)
    state.tasks_completed++
  }

  // Close agent
  close_agent({ id: agent.agentId })

  // Route by consensus verdict
  if (taskResult.discuss_verdict === "consensus_blocked") {
    handleConsensusBlocked(task, taskResult)
  }
}

// Write updated state
state.updated_at = new Date().toISOString()
Write("<session-dir>/team-session.json", JSON.stringify(state, null, 2))
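
`parseTaskComplete` is used above but not defined in this document. One way it could work, assuming agents emit the Completion Protocol block from Step 4.3 verbatim, is sketched below. The parsing rules (last marker wins, `- key: value` lines, null on a missing task_id or unknown status) are assumptions, and the sample artifact path is purely illustrative.

```javascript
// Sketch of parseTaskComplete; the parsing rules are assumptions.
function parseTaskComplete(output) {
  // Use the last marker in case the agent echoed the template earlier.
  const marker = output.lastIndexOf("TASK_COMPLETE:")
  if (marker === -1) return null

  const fields = {}
  const lines = output.slice(marker).split("\n").slice(1)
  for (const line of lines) {
    // Match "- key: value" lines; skip blanks, stop at the first other line.
    const match = line.match(/^\s*-\s*([a-z_]+):\s*(.*)$/)
    if (!match) {
      if (line.trim() === "") continue
      break
    }
    fields[match[1]] = match[2].trim()
  }

  const validStatus = ["success", "failed", "partial"]
  if (!fields.task_id || !validStatus.includes(fields.status)) return null
  return fields
}

// Illustrative agent output; the artifact path is made up for the example.
const sampleOutput = [
  "[writer] Draft finished.",
  "TASK_COMPLETE:",
  "- task_id: DRAFT-001",
  "- status: success",
  "- artifact: spec/example-artifact.md",
  "- discuss_verdict: consensus_reached",
  "- discuss_severity: none",
  "- summary: Draft written and discussed",
].join("\n")

const parsed = parseTaskComplete(sampleOutput)
const malformed = parseTaskComplete("[writer] I am done.")   // null
```

Returning `null` for any unparseable output matches the malformed-output branch in the loop above, which hands the raw text to `handleMalformedOutput`.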

Step 4.7: Consensus Severity Routing

Handle consensus_blocked results from agents with inline discuss.

function handleConsensusBlocked(task, taskResult) {
  const severity = taskResult.discuss_severity

  switch (severity) {
    case "LOW":
      // Treat as consensus_reached with notes
      // No special action needed, proceed normally
      break

    case "MEDIUM":
      // Log warning to wisdom/issues.md
      appendToFile("<session-dir>/wisdom/issues.md",
        `\n## ${task.id} - Consensus Warning (MEDIUM)\n`
        + `Divergences: ${taskResult.divergences || "see discussion record"}\n`
        + `Action items: ${taskResult.action_items || "see discussion record"}\n`)
      // Proceed normally
      break

    case "HIGH":
      // Check if this is DISCUSS-006 (final sign-off)
      if (task.inline_discuss === "DISCUSS-006") {
        // Always pause for user decision
        state.status = "paused"
        state.checkpoints_hit.push(`${task.id}-DISCUSS-006-HIGH`)
        // Output will include pause message
        return  // Do not advance pipeline
      }

      // Check revision limit
      if (task.revision_count >= 1) {
        // Already revised once, escalate to user
        state.status = "paused"
        // Output escalation message
        return
      }

      // Create revision task
      const revisionTask = {
        id: `${task.id}-R1`,
        owner: task.owner,
        status: "pending",
        blocked_by: [],
        description: `Revision of ${task.id}: address consensus-blocked divergences.\n`
          + `Session: ${sessionDir}\n`
          + `Original artifact: ${task.artifact_path}\n`
          + `Divergences: ${taskResult.divergences || "see discussion record"}\n`
          + `Action items: ${taskResult.action_items || "see discussion record"}\n`
          + `InlineDiscuss: ${task.inline_discuss}`,
        inline_discuss: task.inline_discuss,
        agent_id: null,
        artifact_path: null,
        discuss_verdict: null,
        discuss_severity: null,
        started_at: null,
        completed_at: null,
        revision_of: task.id,
        revision_count: task.revision_count + 1,
        is_checkpoint_after: task.is_checkpoint_after
      }

      // Insert revision into pipeline (after the original task)
      const taskIndex = state.pipeline.findIndex(t => t.id === task.id)
      state.pipeline.splice(taskIndex + 1, 0, revisionTask)
      state.tasks_total++

      // Update dependent tasks: anything blocked_by task.id should now be blocked_by revision.id
      for (const t of state.pipeline) {
        const depIdx = t.blocked_by.indexOf(task.id)
        if (depIdx !== -1) {
          t.blocked_by[depIdx] = revisionTask.id
        }
      }

      // Track revision chain
      state.revision_chains[task.id] = revisionTask.id
      break
  }
}
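
The dependency rewiring at the end of the HIGH branch is easy to get wrong, so here is a small standalone sketch of it on sample data. `redirectDependents` is a hypothetical helper name, and the three-task pipeline is illustrative only.

```javascript
// Hypothetical helper: re-point every task blocked by fromId to toId.
// Tasks listed in excludeIds (and the replacement task itself) are skipped.
function redirectDependents(pipeline, fromId, toId, excludeIds = []) {
  let changed = 0
  for (const t of pipeline) {
    if (t.id === toId || excludeIds.includes(t.id)) continue
    const depIdx = t.blocked_by.indexOf(fromId)
    if (depIdx !== -1) {
      t.blocked_by[depIdx] = toId
      changed++
    }
  }
  return changed
}

// DRAFT-001 was blocked at HIGH severity; DRAFT-001-R1 is its revision.
const demoPipeline = [
  { id: "DRAFT-001", status: "completed", blocked_by: [] },
  { id: "DRAFT-001-R1", status: "pending", blocked_by: [] },
  { id: "DRAFT-002", status: "pending", blocked_by: ["DRAFT-001"] },
]

const rewiredCount = redirectDependents(demoPipeline, "DRAFT-001", "DRAFT-001-R1")
// DRAFT-002 now waits for the revision instead of the original draft.
```

The same helper shape would fit the GC loop in Step 4.8, where the fix task must be excluded so that it keeps depending on the original QA task.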

Step 4.8: GC Loop Handling (Frontend QA)

When a QA-FE agent completes with verdict NEEDS_FIX:

function handleGCLoop(qaTask, taskResult) {
  if (state.gc_loop_count >= MAX_GC_ROUNDS) {
    // Max rounds reached, stop loop
    // Output: "QA-FE max rounds reached. Latest report: <artifact-path>"
    return
  }

  state.gc_loop_count++
  const round = state.gc_loop_count + 1

  // Create DEV-FE-NNN (fix task)
  const devFeTask = {
    id: `DEV-FE-${String(round).padStart(3, '0')}`,
    owner: "fe-developer",
    status: "pending",
    blocked_by: [qaTask.id],
    description: `Frontend fix round ${round}: address QA findings.\n`
      + `Session: ${sessionDir}\n`
      + `QA Report: ${qaTask.artifact_path}\n`
      + `Scope: ${state.scope}`,
    inline_discuss: null,
    agent_id: null,
    artifact_path: null,
    discuss_verdict: null,
    discuss_severity: null,
    started_at: null,
    completed_at: null,
    revision_of: null,
    revision_count: 0,
    is_checkpoint_after: false
  }

  // Create QA-FE-NNN (retest task)
  const qaFeTask = {
    id: `QA-FE-${String(round).padStart(3, '0')}`,
    owner: "fe-qa",
    status: "pending",
    blocked_by: [devFeTask.id],
    description: `Frontend QA round ${round}: retest after fixes.\n`
      + `Session: ${sessionDir}\n`
      + `Scope: ${state.scope}`,
    inline_discuss: null,
    agent_id: null,
    artifact_path: null,
    discuss_verdict: null,
    discuss_severity: null,
    started_at: null,
    completed_at: null,
    revision_of: null,
    revision_count: 0,
    is_checkpoint_after: false
  }

  // Add to pipeline
  state.pipeline.push(devFeTask, qaFeTask)
  state.tasks_total += 2

  // Update downstream dependencies (e.g., REVIEW-001 blocked_by QA-FE-001 -> QA-FE-NNN)
  for (const t of state.pipeline) {
    const depIdx = t.blocked_by.indexOf(qaTask.id)
    if (depIdx !== -1 && t.id !== devFeTask.id) {
      t.blocked_by[depIdx] = qaFeTask.id
    }
  }
}

Step 4.9: Fast-Advance Check

After processing all results, determine whether to fast-advance or yield.

// Recompute ready tasks after processing
const newReadyTasks = state.pipeline.filter(t =>
  t.status === "pending"
  && t.blocked_by.every(dep =>
    state.pipeline.find(p => p.id === dep)?.status === "completed"
  )
)

const stillRunning = state.active_agents.length > 0

Fast-advance decision table:

| Ready Count | Still Running | Checkpoint Pending | Action |
|-------------|---------------|--------------------|--------|
| 0 | Yes | No | Yield, wait for running agents |
| 0 | No | No | Pipeline complete -> Phase 5 |
| 0 | No | Yes | Paused at checkpoint, yield |
| 1 | No | No | Fast-advance: spawn immediately, re-enter Step 4.3 |
| 1 | Yes | No | Spawn alongside running, re-enter wait loop |
| 2+ | Any | No | Spawn all ready (batch), re-enter wait loop |
| Any | Any | Yes | Paused at checkpoint, yield |

if (state.status === "paused") {
  // Checkpoint or escalation - yield
  // Output pause message
  return
}

if (newReadyTasks.length === 0 && state.active_agents.length === 0) {
  // Pipeline complete
  // Proceed to Phase 5
  return "PIPELINE_COMPLETE"
}

if (newReadyTasks.length > 0) {
  // Fast-advance: loop back to Step 4.3 with new ready tasks
  readyTasks = newReadyTasks
  // Continue to Step 4.3 (spawn agents)
}

if (newReadyTasks.length === 0 && state.active_agents.length > 0) {
  // Wait for running agents
  // Re-enter Step 4.4 (wait loop)
}
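
The fast-advance decision table can also be read as a single pure function, which may be easier to check than the chain of `if` statements. The sketch below is one such reading; the function name and the returned action labels are assumptions chosen for illustration.

```javascript
// Hypothetical pure-function form of the fast-advance decision table.
function decideNextAction({ readyCount, stillRunning, checkpointPending }) {
  // A pending checkpoint overrides everything else.
  if (checkpointPending) return "yield-paused"
  if (readyCount === 0) return stillRunning ? "yield-wait" : "pipeline-complete"
  if (readyCount === 1) return stillRunning ? "spawn-alongside" : "fast-advance"
  // Two or more ready tasks are spawned as a batch.
  return "spawn-batch"
}

const singleReady = decideNextAction({
  readyCount: 1, stillRunning: false, checkpointPending: false,
})   // "fast-advance"

const blockedByCheckpoint = decideNextAction({
  readyCount: 3, stillRunning: true, checkpointPending: true,
})   // "yield-paused"
```

Checking the checkpoint first mirrors the ordering in the code above, where the paused-state test runs before any spawn decision.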

Step 4.10: Orphan Detection

When processing a resume command or between beats, check for orphaned tasks.

function detectOrphans(state) {
  const now = Date.now()

  for (const task of state.pipeline) {
    if (task.status !== "in_progress") continue

    // Check if there is an active agent for this task
    const hasAgent = state.active_agents.some(a => a.task_id === task.id)

    if (!hasAgent) {
      // Task is in_progress but no agent is tracking it
      const elapsed = now - new Date(task.started_at).getTime()

      if (elapsed > ORPHAN_THRESHOLD) {
        // Orphaned task (likely fast-advance failure)
        task.status = "pending"
        task.agent_id = null
        task.started_at = null

        // Log to wisdom
        appendToFile("<session-dir>/wisdom/issues.md",
          `\n## Orphaned Task Reset: ${task.id}\n`
          + `Task was in_progress for ${Math.round(elapsed / 1000)}s with no active agent. Reset to pending.\n`)
      }
    }
  }
}

Step 4.11: Status Output

After each beat cycle, output a status summary.

[orchestrator] Beat complete
  Completed this beat: <task-ids>
  Still running: <task-ids> (<agent-roles>)
  Ready to spawn: <task-ids>
  Progress: <completed>/<total> (<percent>%)
  Next action: <spawning | waiting | checkpoint-paused | pipeline-complete>
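
A formatter for this summary might look like the sketch below. The function name, the input object shape, and the "(none)" placeholder for empty lists are assumptions; only the line layout is taken from the template above.

```javascript
// Hypothetical formatter for the beat summary; input shape is an assumption.
function formatBeatStatus({ completedThisBeat, running, ready, completed, total, nextAction }) {
  const list = ids => (ids.length > 0 ? ids.join(", ") : "(none)")
  // Guard against division by zero for an empty pipeline.
  const percent = total > 0 ? Math.round((completed / total) * 100) : 0
  const runningText = running.length > 0
    ? `${running.map(r => r.taskId).join(", ")} (${running.map(r => r.owner).join(", ")})`
    : "(none)"

  return [
    "[orchestrator] Beat complete",
    `  Completed this beat: ${list(completedThisBeat)}`,
    `  Still running: ${runningText}`,
    `  Ready to spawn: ${list(ready)}`,
    `  Progress: ${completed}/${total} (${percent}%)`,
    `  Next action: ${nextAction}`,
  ].join("\n")
}

const statusText = formatBeatStatus({
  completedThisBeat: ["IMPL-001"],
  running: [],
  ready: ["TEST-001", "REVIEW-001"],
  completed: 2,
  total: 4,
  nextAction: "spawning",
})
```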

Step 4.12: User Command Handling

During Phase 4, the user may issue the following commands.

check / status:

Read state file -> output status graph (see orchestrator.md User Commands) -> yield
No pipeline advancement.

resume / continue:

Read state file
  +- status === "paused"?
  |   +- YES -> update status to "active" -> detectOrphans -> Step 4.1 (compute ready)
  |   +- NO  -> detectOrphans -> Step 4.1 (compute ready)

Pipeline Execution Patterns

Spec-only (6 linear beats)

// Beat 1: RESEARCH-001 (analyst)
analyst = spawn_agent(analystPrompt)
result = wait([analyst]) -> close_agent(analyst)
// Process discuss result (DISCUSS-001)
// Fast-advance to beat 2

// Beat 2: DRAFT-001 (writer)
writer1 = spawn_agent(writerPromptForDRAFT001)
result = wait([writer1]) -> close_agent(writer1)
// Process discuss result (DISCUSS-002)
// Fast-advance to beat 3

// Beats 3-5: DRAFT-002, DRAFT-003, DRAFT-004 (same pattern)
// Each: spawn writer -> wait -> close -> process discuss -> fast-advance

// Beat 6: QUALITY-001 (reviewer)
reviewer = spawn_agent(reviewerPromptForQUALITY001)
result = wait([reviewer]) -> close_agent(reviewer)
// Process discuss result (DISCUSS-006)
// If full-lifecycle: CHECKPOINT -> pause for user
// If spec-only: PIPELINE_COMPLETE -> Phase 5

Impl-only (3 beats with parallel window)

// Beat 1: PLAN-001 (planner)
planner = spawn_agent(plannerPrompt)
result = wait([planner]) -> close_agent(planner)

// Beat 2: IMPL-001 (executor)
executor = spawn_agent(executorPrompt)
result = wait([executor]) -> close_agent(executor)

// Beat 3: TEST-001 || REVIEW-001 (parallel)
tester = spawn_agent(testerPrompt)
reviewer = spawn_agent(reviewerPrompt)
results = wait([tester, reviewer])  // batch wait
close_agent(tester)
close_agent(reviewer)
// PIPELINE_COMPLETE -> Phase 5

Fullstack (4 beats with dual parallel)

// Beat 1: PLAN-001
planner = spawn_agent(plannerPrompt)
result = wait([planner]) -> close_agent(planner)

// Beat 2: IMPL-001 || DEV-FE-001 (parallel)
executor = spawn_agent(executorPrompt)
feDev = spawn_agent(feDevPrompt)
results = wait([executor, feDev])
close_agent(executor)
close_agent(feDev)

// Beat 3: TEST-001 || QA-FE-001 (parallel)
tester = spawn_agent(testerPrompt)
feQa = spawn_agent(feQaPrompt)
results = wait([tester, feQa])
close_agent(tester)
close_agent(feQa)
// Handle GC loop if QA-FE verdict is NEEDS_FIX

// Beat 4: REVIEW-001 (sync barrier)
reviewer = spawn_agent(reviewerPrompt)
result = wait([reviewer]) -> close_agent(reviewer)
// PIPELINE_COMPLETE -> Phase 5

Output

| Output | Type | Destination |
|--------|------|-------------|
| state (final) | Object | Updated in team-session.json throughout |
| Pipeline status | String | "PIPELINE_COMPLETE" or "PAUSED" |
| All task artifacts | Files | Written by agents to session subdirectories |

Success Criteria

  • All pipeline tasks reach "completed" status (or "partial" with user acknowledgment)
  • State file reflects accurate pipeline status at all times
  • Checkpoints are properly enforced (spec-to-impl transition)
  • Consensus severity routing executed correctly
  • GC loop respects maximum round limit
  • No orphaned tasks at pipeline end
  • All agents properly closed

Error Handling

| Error | Detection | Resolution |
|-------|-----------|------------|
| Agent timeout | wait() returns timed_out | Send convergence via send_input, wait 2 min, force close |
| Agent crash | wait() returns error | Reset task to pending, respawn (max 3 retries) |
| 3+ retries on same task | retry_count in task state | Pause pipeline, report to user |
| Orphaned task | in_progress + no agent + elapsed > 5 min | Reset to pending, respawn |
| Malformed TASK_COMPLETE | parseTaskComplete returns null | Treat as partial, log warning |
| State file write conflict | Write error | Retry once, fail on second error |
| Pipeline stall | No ready + no running + has pending | Inspect blocked_by, report to user |
| DISCUSS-006 HIGH | Parsed from reviewer output | Always pause for user |
| Revision also blocked | Revision task returns HIGH | Pause, escalate to user |
| GC loop exceeded | gc_loop_count >= MAX_GC_ROUNDS | Stop loop, report QA state |
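
`handleTaskFailure` is called in Step 4.5 but not defined in this document. The sketch below shows one way the retry and escalation rows above could be combined. It rests on several assumptions: the function takes `state` explicitly (Step 4.5 calls it with only a task ID and a reason), `retry_count` counts retries already granted, and the fourth failure of a task is the one that escalates.

```javascript
// Assumed to match the Constants table.
const MAX_RETRIES_PER_TASK = 3

// Sketch of handleTaskFailure; signature and return values are assumptions.
function handleTaskFailure(state, taskId, reason) {
  const task = state.pipeline.find(t => t.id === taskId)
  if (!task) throw new Error(`Unknown task ${taskId}`)

  // The failed agent no longer tracks this task.
  state.active_agents = state.active_agents.filter(a => a.task_id !== taskId)
  task.last_failure = reason

  if ((task.retry_count || 0) >= MAX_RETRIES_PER_TASK) {
    // Retries exhausted: pause the pipeline and leave the decision to the user.
    task.status = "failed"
    state.status = "paused"
    return "escalated"
  }

  // Grant another retry: reset the task so Step 4.1 picks it up again.
  task.retry_count = (task.retry_count || 0) + 1
  task.status = "pending"
  task.agent_id = null
  task.started_at = null
  return "retry"
}

const failureState = {
  status: "active",
  active_agents: [{ agent_id: "a1", task_id: "IMPL-001" }],
  pipeline: [{ id: "IMPL-001", status: "in_progress", blocked_by: [] }],
}

const outcomes = []
for (let i = 0; i < 4; i++) {
  outcomes.push(handleTaskFailure(failureState, "IMPL-001", "agent crash"))
}
// outcomes: three "retry" results followed by one "escalated"
```

Resetting the task to pending rather than respawning directly keeps all spawning in Step 4.3, so a retried task goes through the same readiness and checkpoint checks as any other.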

Next Phase

When pipeline completes (all tasks done, no paused checkpoint), proceed to Phase 5: Completion Report.