mirror of https://github.com/catlog22/Claude-Code-Workflow.git synced 2026-03-06 16:31:12 +08:00

Symphony Service Specification

Status: Draft v1 (language-agnostic)

Purpose: Define a service that orchestrates coding agents to get project work done.

1. Problem Statement

Symphony is a long-running automation service that continuously reads work from an issue tracker (Linear in this specification version), creates an isolated workspace for each issue, and runs a coding agent session for that issue inside the workspace.

The service solves four operational problems:

It turns issue execution into a repeatable daemon workflow instead of manual scripts.
It isolates agent execution so that agent commands run only inside per-issue workspace directories.
It keeps the workflow policy in-repo (WORKFLOW.md) so teams version the agent prompt and runtime settings with their code.
It provides enough observability to operate and debug multiple concurrent agent runs.

Implementations are expected to document their trust and safety posture explicitly. This specification does not require a single approval, sandbox, or operator-confirmation policy; some implementations may target trusted environments with a high-trust configuration, while others may require stricter approvals or sandboxing.

Important boundary:

Symphony is a scheduler/runner and tracker reader.
Ticket writes (state transitions, comments, PR links) are typically performed by the coding agent using tools available in the workflow/runtime environment.
A successful run may end at a workflow-defined handoff state (for example Human Review), not necessarily Done.

2. Goals and Non-Goals

2.1 Goals

Poll the issue tracker on a fixed cadence and dispatch work with bounded concurrency.
Maintain a single authoritative orchestrator state for dispatch, retries, and reconciliation.
Create deterministic per-issue workspaces and preserve them across runs.
Stop active runs when issue state changes make them ineligible.
Recover from transient failures with exponential backoff.
Load runtime behavior from a repository-owned WORKFLOW.md contract.
Expose operator-visible observability (at minimum structured logs).
Support restart recovery without requiring a persistent database.

2.2 Non-Goals

Rich web UI or multi-tenant control plane.
Prescribing a specific dashboard or terminal UI implementation.
General-purpose workflow engine or distributed job scheduler.
Built-in business logic for how to edit tickets, PRs, or comments. (That logic lives in the workflow prompt and agent tooling.)
Mandating strong sandbox controls beyond what the coding agent and host OS provide.
Mandating a single default approval, sandbox, or operator-confirmation posture for all implementations.

3. System Overview

3.1 Main Components

Workflow Loader
- Reads WORKFLOW.md.
- Parses YAML front matter and prompt body.
- Returns {config, prompt_template}.
Config Layer
- Exposes typed getters for workflow config values.
- Applies defaults and environment variable indirection.
- Performs validation used by the orchestrator before dispatch.
Issue Tracker Client
- Fetches candidate issues in active states.
- Fetches current states for specific issue IDs (reconciliation).
- Fetches terminal-state issues during startup cleanup.
- Normalizes tracker payloads into a stable issue model.
Orchestrator
- Owns the poll tick.
- Owns the in-memory runtime state.
- Decides which issues to dispatch, retry, stop, or release.
- Tracks session metrics and retry queue state.
Workspace Manager
- Maps issue identifiers to workspace paths.
- Ensures per-issue workspace directories exist.
- Runs workspace lifecycle hooks.
- Cleans workspaces for terminal issues.
Agent Runner
- Creates workspace.
- Builds prompt from issue + workflow template.
- Launches the coding agent app-server client.
- Streams agent updates back to the orchestrator.
Status Surface (optional)
- Presents human-readable runtime status (for example terminal output, dashboard, or other operator-facing view).
Logging
- Emits structured runtime logs to one or more configured sinks.

3.2 Abstraction Levels

Symphony is easiest to port when kept in these layers:

Policy Layer (repo-defined)
- WORKFLOW.md prompt body.
- Team-specific rules for ticket handling, validation, and handoff.
Configuration Layer (typed getters)
- Parses front matter into typed runtime settings.
- Handles defaults, environment tokens, and path normalization.
Coordination Layer (orchestrator)
- Polling loop, issue eligibility, concurrency, retries, reconciliation.
Execution Layer (workspace + agent subprocess)
- Filesystem lifecycle, workspace preparation, coding-agent protocol.
Integration Layer (Linear adapter)
- API calls and normalization for tracker data.
Observability Layer (logs + optional status surface)
- Operator visibility into orchestrator and agent behavior.

3.3 External Dependencies

Issue tracker API (Linear for tracker.kind: linear in this specification version).
Local filesystem for workspaces and logs.
Optional workspace population tooling (for example Git CLI, if used).
Coding-agent executable that supports JSON-RPC-like app-server mode over stdio.
Host environment authentication for the issue tracker and coding agent.

4. Core Domain Model

4.1 Entities

4.1.1 Issue

Normalized issue record used by orchestration, prompt rendering, and observability output.

Fields:

id (string)
- Stable tracker-internal ID.
identifier (string)
- Human-readable ticket key (example: ABC-123).
title (string)
description (string or null)
priority (integer or null)
- Lower numbers are higher priority in dispatch sorting.
state (string)
- Current tracker state name.
branch_name (string or null)
- Tracker-provided branch metadata if available.
url (string or null)
labels (list of strings)
- Normalized to lowercase.
blocked_by (list of blocker refs)
- Each blocker ref contains:
  - id (string or null)
  - identifier (string or null)
  - state (string or null)
created_at (timestamp or null)
updated_at (timestamp or null)

4.1.2 Workflow Definition

Parsed WORKFLOW.md payload:

config (map)
- YAML front matter root object.
prompt_template (string)
- Markdown body after front matter, trimmed.

4.1.3 Service Config (Typed View)

Typed runtime values derived from WorkflowDefinition.config plus environment resolution.

Examples:

poll interval
workspace root
active and terminal issue states
concurrency limits
coding-agent executable/args/timeouts
workspace hooks

4.1.4 Workspace

Filesystem workspace assigned to one issue identifier.

Fields (logical):

path (workspace path; current runtime typically uses absolute paths, but relative roots are possible if configured without path separators)
workspace_key (sanitized issue identifier)
created_now (boolean, used to gate after_create hook)

4.1.5 Run Attempt

One execution attempt for one issue.

Fields (logical):

issue_id
issue_identifier
attempt (integer or null, null for first run, >=1 for retries/continuation)
workspace_path
started_at
status
error (optional)

4.1.6 Live Session (Agent Session Metadata)

State tracked while a coding-agent subprocess is running.

Fields:

session_id (string, <thread_id>-<turn_id>)
thread_id (string)
turn_id (string)
codex_app_server_pid (string or null)
last_codex_event (string/enum or null)
last_codex_timestamp (timestamp or null)
last_codex_message (summarized payload)
codex_input_tokens (integer)
codex_output_tokens (integer)
codex_total_tokens (integer)
last_reported_input_tokens (integer)
last_reported_output_tokens (integer)
last_reported_total_tokens (integer)
turn_count (integer)
- Number of coding-agent turns started within the current worker lifetime.

4.1.7 Retry Entry

Scheduled retry state for an issue.

Fields:

issue_id
identifier (best-effort human ID for status surfaces/logs)
attempt (integer, 1-based for retry queue)
due_at_ms (monotonic clock timestamp)
timer_handle (runtime-specific timer reference)
error (string or null)

4.1.8 Orchestrator Runtime State

Single authoritative in-memory state owned by the orchestrator.

Fields:

poll_interval_ms (current effective poll interval)
max_concurrent_agents (current effective global concurrency limit)
running (map issue_id -> running entry)
claimed (set of issue IDs reserved/running/retrying)
retry_attempts (map issue_id -> RetryEntry)
completed (set of issue IDs; bookkeeping only, not dispatch gating)
codex_totals (aggregate tokens + runtime seconds)
codex_rate_limits (latest rate-limit snapshot from agent events)

4.2 Stable Identifiers and Normalization Rules

Issue ID
- Use for tracker lookups and internal map keys.
Issue Identifier
- Use for human-readable logs and workspace naming.
Workspace Key
- Derive from issue.identifier by replacing any character not in [A-Za-z0-9._-] with _.
- Use the sanitized value for the workspace directory name.
Normalized Issue State
- Compare states after trim + lowercase.
Session ID
- Compose from coding-agent thread_id and turn_id as <thread_id>-<turn_id>.
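
The normalization rules above can be sketched in a few lines. This is a minimal illustration, not a reference implementation; the function names are hypothetical and chosen only for this example.

```python
import re

# Characters outside this set are replaced, per the Workspace Key rule.
_UNSAFE = re.compile(r"[^A-Za-z0-9._-]")


def workspace_key(identifier: str) -> str:
    """Replace every character not in [A-Za-z0-9._-] with an underscore."""
    return _UNSAFE.sub("_", identifier)


def normalize_state(state: str) -> str:
    """Trim and lowercase a tracker state name before comparing it."""
    return state.strip().lower()


def session_id(thread_id: str, turn_id: str) -> str:
    """Compose the session ID as <thread_id>-<turn_id>."""
    return f"{thread_id}-{turn_id}"
```

Note that sanitization alone still lets dot-only identifiers such as `..` through, which is one reason the root-containment check in Section 9.5 is a separate invariant.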

5. Workflow Specification (Repository Contract)

5.1 File Discovery and Path Resolution

Workflow file path precedence:

Explicit application/runtime setting (set by CLI startup path).
Default: WORKFLOW.md in the current process working directory.

Loader behavior:

If the file cannot be read, return missing_workflow_file error.
The workflow file is expected to be repository-owned and version-controlled.

5.2 File Format

WORKFLOW.md is a Markdown file with optional YAML front matter.

Design note:

WORKFLOW.md should be self-contained enough to describe and run different workflows (prompt, runtime settings, hooks, and tracker selection/config) without requiring out-of-band service-specific configuration.

Parsing rules:

If file starts with ---, parse lines until the next --- as YAML front matter.
Remaining lines become the prompt body.
If front matter is absent, treat the entire file as prompt body and use an empty config map.
YAML front matter must decode to a map/object; non-map YAML is an error.
Prompt body is trimmed before use.

Returned workflow object:

config: front matter root object (not nested under a config key).
prompt_template: trimmed Markdown body.
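
As a rough sketch of the parsing rules above, the splitter below separates the front matter text from the prompt body using only the standard library. Decoding the front matter text into a map (and raising workflow_front_matter_not_a_map for non-map YAML) is left to whichever YAML library the implementation uses. The function name and the handling of an unterminated front matter block are assumptions; the specification does not define that case.

```python
from typing import Optional, Tuple


def split_front_matter(text: str) -> Tuple[Optional[str], str]:
    """Return (front_matter_text, prompt_body).

    front_matter_text is None when the file does not start with '---'.
    The prompt body is always trimmed.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None, text.strip()
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            front = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:])
            return front, body.strip()
    # Assumption: an unterminated front matter block is a parse error.
    raise ValueError("workflow_parse_error: front matter is not closed")
```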

5.3 Front Matter Schema

Top-level keys:

tracker
polling
workspace
hooks
agent
codex

Unknown keys should be ignored for forward compatibility.

Note:

The workflow front matter is extensible. Optional extensions may define additional top-level keys (for example server) without changing the core schema above.
Extensions should document their field schema, defaults, validation rules, and whether changes apply dynamically or require restart.
Common extension: server.port (integer) enables the optional HTTP server described in Section 13.7.

5.3.1 `tracker` (object)

Fields:

kind (string)
- Required for dispatch.
- Current supported value: linear
endpoint (string)
- Default for tracker.kind == "linear": https://api.linear.app/graphql
api_key (string)
- May be a literal token or $VAR_NAME.
- Canonical environment variable for tracker.kind == "linear": LINEAR_API_KEY.
- If $VAR_NAME resolves to an empty string, treat the key as missing.
project_slug (string)
- Required for dispatch when tracker.kind == "linear".
active_states (list of strings or comma-separated string)
- Default: Todo, In Progress
terminal_states (list of strings or comma-separated string)
- Default: Closed, Cancelled, Canceled, Duplicate, Done
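
The two coercions described for tracker fields (resolving a `$VAR_NAME` key and accepting states as either a list or a comma-separated string) might look like the sketch below. The function names are hypothetical, and treating an unset variable the same as an empty one is an assumption of this example.

```python
from typing import List, Mapping, Optional, Sequence, Union


def resolve_api_key(value: Optional[str], env: Mapping[str, str]) -> Optional[str]:
    """Resolve a literal token or a $VAR_NAME reference; empty means missing."""
    if value is None:
        return None
    if value.startswith("$"):
        # Assumption: an unset variable is treated like an empty one.
        value = env.get(value[1:], "")
    return value or None


def parse_states(
    value: Union[str, Sequence[str], None], default: Sequence[str]
) -> List[str]:
    """Accept a list of strings or a comma-separated string."""
    if value is None:
        return list(default)
    items = value.split(",") if isinstance(value, str) else value
    return [item.strip() for item in items if item.strip()]
```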

5.3.2 `polling` (object)

Fields:

interval_ms (integer or string integer)
- Default: 30000
- Changes should be re-applied at runtime and affect future tick scheduling without restart.

5.3.3 `workspace` (object)

Fields:

root (path string or $VAR)
- Default: <system-temp>/symphony_workspaces
- ~ and strings containing path separators are expanded.
- Bare strings without path separators are preserved as-is (relative roots are allowed but discouraged).

5.3.4 `hooks` (object)

Fields:

after_create (multiline shell script string, optional)
- Runs only when a workspace directory is newly created.
- Failure aborts workspace creation.
before_run (multiline shell script string, optional)
- Runs before each agent attempt after workspace preparation and before launching the coding agent.
- Failure aborts the current attempt.
after_run (multiline shell script string, optional)
- Runs after each agent attempt (success, failure, timeout, or cancellation) once the workspace exists.
- Failure is logged but ignored.
before_remove (multiline shell script string, optional)
- Runs before workspace deletion if the directory exists.
- Failure is logged but ignored; cleanup still proceeds.
timeout_ms (integer, optional)
- Default: 60000
- Applies to all workspace hooks.
- Non-positive values should be treated as invalid and fall back to the default.
- Changes should be re-applied at runtime for future hook executions.

5.3.5 `agent` (object)

Fields:

max_concurrent_agents (integer or string integer)
- Default: 10
- Changes should be re-applied at runtime and affect subsequent dispatch decisions.
max_retry_backoff_ms (integer or string integer)
- Default: 300000 (5 minutes)
- Changes should be re-applied at runtime and affect future retry scheduling.
max_concurrent_agents_by_state (map state_name -> positive integer)
- Default: empty map.
- State keys are normalized (trim + lowercase) for lookup.
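- Invalid entries (non-positive or non-numeric) are ignored.
max_turns (integer)
- Default: 20
- Maximum number of back-to-back coding-agent turns within one worker run (see Section 7.1).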

5.3.6 `codex` (object)

Fields:

For Codex-owned config values such as approval_policy, thread_sandbox, and turn_sandbox_policy, supported values are defined by the targeted Codex app-server version. Implementors should treat them as pass-through Codex config values rather than relying on a hand-maintained enum in this spec. To inspect the installed Codex schema, run codex app-server generate-json-schema --out <dir> and inspect the relevant definitions referenced by v2/ThreadStartParams.json and v2/TurnStartParams.json. Implementations may validate these fields locally if they want stricter startup checks.

command (string shell command)
- Default: codex app-server
- The runtime launches this command via bash -lc in the workspace directory.
- The launched process must speak a compatible app-server protocol over stdio.
approval_policy (Codex AskForApproval value)
- Default: implementation-defined.
thread_sandbox (Codex SandboxMode value)
- Default: implementation-defined.
turn_sandbox_policy (Codex SandboxPolicy value)
- Default: implementation-defined.
turn_timeout_ms (integer)
- Default: 3600000 (1 hour)
read_timeout_ms (integer)
- Default: 5000
stall_timeout_ms (integer)
- Default: 300000 (5 minutes)
- If <= 0, stall detection is disabled.

5.4 Prompt Template Contract

The Markdown body of WORKFLOW.md is the per-issue prompt template.

Rendering requirements:

Use a strict template engine (Liquid-compatible semantics are sufficient).
Unknown variables must fail rendering.
Unknown filters must fail rendering.

Template input variables:

issue (object)
- Includes all normalized issue fields, including labels and blockers.
attempt (integer or null)
- null/absent on first attempt.
- Integer on retry or continuation run.
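
A real implementation would most likely use a Liquid-compatible engine in strict mode. The sketch below is not Liquid; it is a deliberately tiny stand-in that supports only `{{ dotted.path }}` output tags, included to illustrate the two failure rules (unknown variables and unknown filters must fail). All names here are hypothetical.

```python
import re

_TAG = re.compile(r"\{\{\s*(.*?)\s*\}\}")


class TemplateRenderError(Exception):
    """Raised for unknown variables or filters (template_render_error)."""


def render_strict(template: str, variables: dict) -> str:
    """Substitute {{ a.b }} tags, failing on anything that cannot be resolved."""

    def replace(match):
        expression = match.group(1)
        if "|" in expression:
            # This toy renderer knows no filters, so every filter is unknown.
            raise TemplateRenderError(f"unknown filter in: {expression}")
        value = variables
        for part in expression.split("."):
            if not isinstance(value, dict) or part not in value:
                raise TemplateRenderError(f"unknown variable: {expression}")
            value = value[part]
        # A present-but-null value (for example attempt on the first run)
        # renders as an empty string.
        return "" if value is None else str(value)

    return _TAG.sub(replace, template)
```

Passing `attempt` explicitly as `None` on the first run, rather than omitting it, keeps strict rendering from rejecting templates that mention it.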

Fallback prompt behavior:

If the workflow prompt body is empty, the runtime may use a minimal default prompt (You are working on an issue from Linear.).
Workflow file read/parse failures are configuration/validation errors and should not silently fall back to a prompt.

5.5 Workflow Validation and Error Surface

Error classes:

missing_workflow_file
workflow_parse_error
workflow_front_matter_not_a_map
template_parse_error (during prompt rendering)
template_render_error (unknown variable/filter, invalid interpolation)

Dispatch gating behavior:

Workflow file read/YAML errors block new dispatches until fixed.
Template errors fail only the affected run attempt.

6. Configuration Specification

6.1 Source Precedence and Resolution Semantics

Configuration precedence:

Workflow file path selection (runtime setting -> cwd default).
YAML front matter values.
Environment indirection via $VAR_NAME inside selected YAML values.
Built-in defaults.

Value coercion semantics:

Path/command fields support:
- ~ home expansion
- $VAR expansion for env-backed path values
- Apply expansion only to values intended to be local filesystem paths; do not rewrite URIs or arbitrary shell command strings.

6.2 Dynamic Reload Semantics

Dynamic reload is required:

The software should watch WORKFLOW.md for changes.
On change, it should re-read and re-apply workflow config and prompt template without restart.
The software should attempt to adjust live behavior to the new config (for example polling cadence, concurrency limits, active/terminal states, codex settings, workspace paths/hooks, and prompt content for future runs).
Reloaded config applies to future dispatch, retry scheduling, reconciliation decisions, hook execution, and agent launches.
Implementations are not required to restart in-flight agent sessions automatically when config changes.
Extensions that manage their own listeners/resources (for example an HTTP server port change) may require restart unless the implementation explicitly supports live rebind.
Implementations should also re-validate/reload defensively during runtime operations (for example before dispatch) in case filesystem watch events are missed.
Invalid reloads should not crash the service; keep operating with the last known good effective configuration and emit an operator-visible error.

6.3 Dispatch Preflight Validation

This validation is a scheduler preflight run before attempting to dispatch new work. It validates the workflow/config needed to poll and launch workers, not a full audit of all possible workflow behavior.

Startup validation:

Validate configuration before starting the scheduling loop.
If startup validation fails, fail startup and emit an operator-visible error.

Per-tick dispatch validation:

Re-validate before each dispatch cycle.
If validation fails, skip dispatch for that tick, keep reconciliation active, and emit an operator-visible error.

Validation checks:

Workflow file can be loaded and parsed.
tracker.kind is present and supported.
tracker.api_key is present after $ resolution.
tracker.project_slug is present when required by the selected tracker kind.
codex.command is present and non-empty.
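
One way to express the validation checks is a function that returns a list of problems, so the caller can fail startup or skip dispatch for the tick when the list is non-empty. This is a sketch under the assumption that the workflow file has already been loaded and decoded into a plain dictionary; the function name and error strings are illustrative only.

```python
from typing import List, Mapping

SUPPORTED_TRACKERS = {"linear"}


def preflight_errors(config: Mapping, env: Mapping[str, str]) -> List[str]:
    """Return human-readable validation errors; an empty list means OK."""
    errors = []
    tracker = config.get("tracker") or {}
    codex = config.get("codex") or {}

    kind = tracker.get("kind")
    if kind not in SUPPORTED_TRACKERS:
        errors.append("tracker.kind is missing or unsupported")

    api_key = tracker.get("api_key") or ""
    if api_key.startswith("$"):
        api_key = env.get(api_key[1:], "")
    if not api_key:
        errors.append("tracker.api_key is missing after $ resolution")

    if kind == "linear" and not tracker.get("project_slug"):
        errors.append("tracker.project_slug is required for linear")

    # An absent codex.command falls back to the default (codex app-server);
    # a key that is present but blank is an error.
    if "command" in codex and not str(codex["command"] or "").strip():
        errors.append("codex.command is empty")
    return errors
```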

6.4 Config Fields Summary (Cheat Sheet)

This section is intentionally redundant so a coding agent can implement the config layer quickly.

tracker.kind: string, required, currently linear
tracker.endpoint: string, default https://api.linear.app/graphql when tracker.kind=linear
tracker.api_key: string or $VAR, canonical env LINEAR_API_KEY when tracker.kind=linear
tracker.project_slug: string, required when tracker.kind=linear
tracker.active_states: list/string, default Todo, In Progress
tracker.terminal_states: list/string, default Closed, Cancelled, Canceled, Duplicate, Done
polling.interval_ms: integer, default 30000
workspace.root: path, default <system-temp>/symphony_workspaces
hooks.after_create: shell script or null
hooks.before_run: shell script or null
hooks.after_run: shell script or null
hooks.before_remove: shell script or null
hooks.timeout_ms: integer, default 60000
agent.max_concurrent_agents: integer, default 10
agent.max_turns: integer, default 20
agent.max_retry_backoff_ms: integer, default 300000 (5m)
agent.max_concurrent_agents_by_state: map of positive integers, default {}
codex.command: shell command string, default codex app-server
codex.approval_policy: Codex AskForApproval value, default implementation-defined
codex.thread_sandbox: Codex SandboxMode value, default implementation-defined
codex.turn_sandbox_policy: Codex SandboxPolicy value, default implementation-defined
codex.turn_timeout_ms: integer, default 3600000
codex.read_timeout_ms: integer, default 5000
codex.stall_timeout_ms: integer, default 300000
server.port (extension): integer, optional; enables the optional HTTP server, 0 may be used for ephemeral local bind, and CLI --port overrides it

7. Orchestration State Machine

The orchestrator is the only component that mutates scheduling state. All worker outcomes are reported back to it and converted into explicit state transitions.

7.1 Issue Orchestration States

These states are distinct from tracker states (Todo, In Progress, etc.); they describe the service's internal claim state for an issue.

Unclaimed
- Issue is not running and has no retry scheduled.
Claimed
- Orchestrator has reserved the issue to prevent duplicate dispatch.
- In practice, claimed issues are either Running or RetryQueued.
Running
- Worker task exists and the issue is tracked in running map.
RetryQueued
- Worker is not running, but a retry timer exists in retry_attempts.
Released
- Claim removed because issue is terminal, non-active, missing, or retry path completed without re-dispatch.

Important nuance:

A successful worker exit does not mean the issue is done forever.
The worker may continue through multiple back-to-back coding-agent turns before it exits.
After each normal turn completion, the worker re-checks the tracker issue state.
If the issue is still in an active state, the worker should start another turn on the same live coding-agent thread in the same workspace, up to agent.max_turns.
The first turn should use the full rendered task prompt.
Continuation turns should send only continuation guidance to the existing thread, not resend the original task prompt that is already present in thread history.
Once the worker exits normally, the orchestrator still schedules a short continuation retry (about 1 second) so it can re-check whether the issue remains active and needs another worker session.

7.2 Run Attempt Lifecycle

A run attempt transitions through these phases:

PreparingWorkspace
BuildingPrompt
LaunchingAgentProcess
InitializingSession
StreamingTurn
Finishing
Succeeded
Failed
TimedOut
Stalled
CanceledByReconciliation

Distinct terminal reasons are important because retry logic and logs differ.

7.3 Transition Triggers

Poll Tick
- Reconcile active runs.
- Validate config.
- Fetch candidate issues.
- Dispatch until slots are exhausted.
Worker Exit (normal)
- Remove running entry.
- Update aggregate runtime totals.
- Schedule continuation retry (attempt 1) after the worker exhausts or finishes its in-process turn loop.
Worker Exit (abnormal)
- Remove running entry.
- Update aggregate runtime totals.
- Schedule exponential-backoff retry.
Codex Update Event
- Update live session fields, token counters, and rate limits.
Retry Timer Fired
- Re-fetch active candidates and attempt re-dispatch, or release claim if no longer eligible.
Reconciliation State Refresh
- Stop runs whose issue states are terminal or no longer active.
Stall Timeout
- Kill worker and schedule retry.

7.4 Idempotency and Recovery Rules

The orchestrator serializes state mutations through one authority to avoid duplicate dispatch.
claimed and running checks are required before launching any worker.
Reconciliation runs before dispatch on every tick.
Restart recovery is tracker-driven and filesystem-driven (no durable orchestrator DB required).
Startup terminal cleanup removes stale workspaces for issues already in terminal states.

8. Polling, Scheduling, and Reconciliation

8.1 Poll Loop

At startup, the service validates config, performs startup cleanup, schedules an immediate tick, and then repeats every polling.interval_ms.

The effective poll interval should be updated when workflow config changes are re-applied.

Tick sequence:

Reconcile running issues.
Run dispatch preflight validation.
Fetch candidate issues from tracker using active states.
Sort issues by dispatch priority.
Dispatch eligible issues while slots remain.
Notify observability/status consumers of state changes.

If per-tick validation fails, dispatch is skipped for that tick, but reconciliation still happens first.

8.2 Candidate Selection Rules

An issue is dispatch-eligible only if all are true:

It has id, identifier, title, and state.
Its state is in active_states and not in terminal_states.
It is not already in running.
It is not already in claimed.
Global concurrency slots are available.
Per-state concurrency slots are available.
Blocker rule for Todo state passes:
- If the issue state is Todo, do not dispatch when any blocker is non-terminal.

Sorting order (stable intent):

priority ascending (1..4 are preferred; null/unknown sorts last)
created_at oldest first
identifier lexicographic tie-breaker
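
The sorting order can be captured as a single sort key. The sketch below assumes issues are dictionaries with `priority`, `created_at` (a `datetime` or `None`), and `identifier`. Treating any priority outside 1..4 as unknown, and sorting missing timestamps last, are assumptions of this example rather than rules stated by the specification.

```python
from datetime import datetime


def dispatch_sort_key(issue: dict) -> tuple:
    """Order by priority, then created_at (oldest first), then identifier."""
    priority = issue.get("priority")
    # Assumption: only 1..4 count as known priorities; anything else sorts last.
    known = isinstance(priority, int) and 1 <= priority <= 4
    created = issue.get("created_at")
    return (
        not known,                  # known priorities come first
        priority if known else 0,   # lower number means higher priority
        created is None,            # assumption: missing timestamps sort last
        created if created is not None else datetime.min,
        issue["identifier"],        # lexicographic tie-breaker
    )
```

Usage: `candidates.sort(key=dispatch_sort_key)` before walking the list and dispatching while slots remain.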

8.3 Concurrency Control

Global limit:

available_slots = max(max_concurrent_agents - running_count, 0)

Per-state limit:

max_concurrent_agents_by_state[state] if present (state key normalized)
otherwise fallback to global limit

The runtime counts issues by their current tracked state in the running map.
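
A minimal sketch of the slot check for one candidate follows. It assumes the caller passes the tracked state of every running issue and a per-state limit map whose keys are already normalized; the function names are hypothetical.

```python
from collections import Counter


def normalize_state(state: str) -> str:
    return state.strip().lower()


def can_dispatch(state, running_states, max_concurrent_agents, by_state):
    """Check global and per-state slot availability for one candidate.

    running_states: tracked state of every issue in the running map.
    by_state: max_concurrent_agents_by_state with normalized keys.
    """
    available_slots = max(max_concurrent_agents - len(running_states), 0)
    if available_slots == 0:
        return False
    key = normalize_state(state)
    counts = Counter(normalize_state(s) for s in running_states)
    # Without a per-state entry, the global limit is the fallback.
    state_limit = by_state.get(key, max_concurrent_agents)
    return counts[key] < state_limit
```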

8.4 Retry and Backoff

Retry entry creation:

Cancel any existing retry timer for the same issue.
Store attempt, identifier, error, due_at_ms, and new timer handle.

Backoff formula:

Normal continuation retries after a clean worker exit use a short fixed delay of 1000 ms.
Failure-driven retries use delay = min(10000 * 2^(attempt - 1), agent.max_retry_backoff_ms).
The exponential delay is capped at the configured maximum retry backoff (default 300000 ms, 5 minutes).
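
The two delay rules can be written as one small function. This is a sketch; the function name, the constant names, and the `continuation` flag are illustrative.

```python
CONTINUATION_DELAY_MS = 1000    # fixed delay after a clean worker exit
BASE_FAILURE_DELAY_MS = 10000   # first failure-driven retry delay


def retry_delay_ms(
    attempt: int,
    max_retry_backoff_ms: int = 300000,
    continuation: bool = False,
) -> int:
    """Return the delay before the next retry, in milliseconds."""
    if continuation:
        return CONTINUATION_DELAY_MS
    # delay = min(10000 * 2^(attempt - 1), agent.max_retry_backoff_ms)
    return min(BASE_FAILURE_DELAY_MS * 2 ** (attempt - 1), max_retry_backoff_ms)
```

With the default cap, failure-driven delays for attempts 1 through 6 are 10 s, 20 s, 40 s, 80 s, 160 s, and 300 s; every later attempt stays at 300 s.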

Retry handling behavior:

Fetch active candidate issues (not all issues).
Find the specific issue by issue_id.
If not found, release claim.
If found and still candidate-eligible:
- Dispatch if slots are available.
- Otherwise requeue with error no available orchestrator slots.
If found but no longer active, release claim.

Note:

Terminal-state workspace cleanup is handled by startup cleanup and active-run reconciliation (including terminal transitions for currently running issues).
Retry handling mainly operates on active candidates and releases claims when the issue is absent, rather than performing terminal cleanup itself.

8.5 Active Run Reconciliation

Reconciliation runs every tick and has two parts.

Part A: Stall detection

For each running issue, compute elapsed_ms since:
- last_codex_timestamp if any event has been seen, else
- started_at
If elapsed_ms > codex.stall_timeout_ms, terminate the worker and queue a retry.
If stall_timeout_ms <= 0, skip stall detection entirely.
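
Stall detection reduces to a single comparison per running issue. The sketch below assumes all timestamps are millisecond readings from the same clock; the function and parameter names are hypothetical.

```python
from typing import Optional


def is_stalled(
    now_ms: int,
    started_at_ms: int,
    last_event_ms: Optional[int],
    stall_timeout_ms: int,
) -> bool:
    """Return True when a running issue has been silent for too long."""
    if stall_timeout_ms <= 0:
        return False  # stall detection disabled
    # Measure from the last agent event if one was seen, else from start.
    reference = last_event_ms if last_event_ms is not None else started_at_ms
    return now_ms - reference > stall_timeout_ms
```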

Part B: Tracker state refresh

Fetch current issue states for all running issue IDs.
For each running issue:
- If tracker state is terminal: terminate worker and clean workspace.
- If tracker state is still active: update the in-memory issue snapshot.
- If tracker state is neither active nor terminal: terminate worker without workspace cleanup.
If state refresh fails, keep workers running and try again on the next tick.
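
The three-way decision in Part B can be sketched as a pure function that the orchestrator applies to each refreshed issue. The returned action labels are hypothetical names for this example, not identifiers defined by the specification.

```python
def reconcile_action(tracker_state, active_states, terminal_states) -> str:
    """Decide what to do with a running issue after a tracker state refresh."""
    state = tracker_state.strip().lower()
    terminal = {s.strip().lower() for s in terminal_states}
    active = {s.strip().lower() for s in active_states}
    if state in terminal:
        return "terminate_and_clean_workspace"
    if state in active:
        return "update_snapshot"
    # Neither active nor terminal (for example a handoff state):
    # stop the worker but keep the workspace.
    return "terminate_keep_workspace"
```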

8.6 Startup Terminal Workspace Cleanup

When the service starts:

Query tracker for issues in terminal states.
For each returned issue identifier, remove the corresponding workspace directory.
If the terminal-issues fetch fails, log a warning and continue startup.

This prevents stale terminal workspaces from accumulating after restarts.

9. Workspace Management and Safety

9.1 Workspace Layout

Workspace root:

workspace.root (normalized path; the current config layer expands path-like values and preserves bare relative names)

Per-issue workspace path:

<workspace.root>/<sanitized_issue_identifier>

Workspace persistence:

Workspaces are reused across runs for the same issue.
Successful runs do not auto-delete workspaces.

9.2 Workspace Creation and Reuse

Input: issue.identifier

Algorithm summary:

Sanitize identifier to workspace_key.
Compute workspace path under workspace root.
Ensure the workspace path exists as a directory.
Mark created_now=true only if the directory was created during this call; otherwise created_now=false.
If created_now=true, run after_create hook if configured.

Notes:

This section does not assume any specific repository/VCS workflow.
Workspace preparation beyond directory creation (for example dependency bootstrap, checkout/sync, code generation) is implementation-defined and is typically handled via hooks.

9.3 Optional Workspace Population (Implementation-Defined)

The spec does not require any built-in VCS or repository bootstrap behavior.

Implementations may populate or synchronize the workspace using implementation-defined logic and/or hooks (for example after_create and/or before_run).

Failure handling:

Workspace population/synchronization failures return an error for the current attempt.
If failure happens while creating a brand-new workspace, implementations may remove the partially prepared directory.
Reused workspaces should not be destructively reset on population failure unless that policy is explicitly chosen and documented.

9.4 Workspace Hooks

Supported hooks:

hooks.after_create
hooks.before_run
hooks.after_run
hooks.before_remove

Execution contract:

Execute in a local shell context appropriate to the host OS, with the workspace directory as cwd.
On POSIX systems, sh -lc <script> (or a stricter equivalent such as bash -lc <script>) is a conforming default.
Hook timeout uses hooks.timeout_ms; default: 60000 ms.
Log hook start, failures, and timeouts.

Failure semantics:

after_create failure or timeout is fatal to workspace creation.
before_run failure or timeout is fatal to the current run attempt.
after_run failure or timeout is logged and ignored.
before_remove failure or timeout is logged and ignored.
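
The execution contract and failure semantics above might be implemented along these lines on a POSIX host. This is a sketch, not a prescribed design: the function names, the `(ok, detail)` return shape, and the `shell` parameter are assumptions made for the example.

```python
import subprocess

DEFAULT_HOOK_TIMEOUT_MS = 60000

# Hooks whose failure aborts the surrounding operation;
# failures of the other hooks are logged and ignored.
FATAL_HOOKS = {"after_create", "before_run"}


def run_hook(name, script, workspace_path, timeout_ms=DEFAULT_HOOK_TIMEOUT_MS,
             shell=("sh", "-lc")):
    """Run one hook script with the workspace as cwd. Returns (ok, detail)."""
    if timeout_ms <= 0:
        timeout_ms = DEFAULT_HOOK_TIMEOUT_MS  # invalid values fall back
    try:
        result = subprocess.run(
            [*shell, script],
            cwd=workspace_path,
            capture_output=True,
            text=True,
            timeout=timeout_ms / 1000,
        )
    except subprocess.TimeoutExpired:
        return False, f"{name} timed out after {timeout_ms} ms"
    if result.returncode != 0:
        return False, f"{name} exited with status {result.returncode}"
    return True, "ok"


def hook_failure_is_fatal(name: str) -> bool:
    return name in FATAL_HOOKS
```

One caveat of this approach: on timeout, `subprocess.run` kills only the shell process it started, so a hook that spawns long-lived children may need process-group handling in a production implementation.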

9.5 Safety Invariants

These invariants are the most important portability constraint.

Invariant 1: Run the coding agent only in the per-issue workspace path.

Before launching the coding-agent subprocess, validate:
- cwd == workspace_path

Invariant 2: Workspace path must stay inside workspace root.

Normalize both paths to absolute.
Require workspace_path to have workspace_root as a prefix directory.
Reject any path outside the workspace root.
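
A containment check for Invariant 2 could look like the sketch below. It compares path components rather than raw string prefixes, so a sibling directory such as `/tmp/ws-other` is not mistaken for a child of `/tmp/ws`. The function name is hypothetical, and rejecting a workspace path equal to the root itself is an assumption of this example.

```python
import os


def assert_inside_root(workspace_root: str, workspace_path: str) -> str:
    """Return the absolute workspace path, or raise if it escapes the root."""
    root = os.path.abspath(workspace_root)
    path = os.path.abspath(workspace_path)
    # Assumption: the workspace must be a strict child of the root,
    # so a path equal to the root is rejected as well.
    if path == root or os.path.commonpath([root, path]) != root:
        raise ValueError(f"workspace path escapes root: {path}")
    return path
```

`os.path.abspath` normalizes `..` segments but does not resolve symlinks; an implementation that needs to defend against symlinked workspace directories could use `os.path.realpath` on both paths instead.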

Invariant 3: Workspace key is sanitized.

Only [A-Za-z0-9._-] allowed in workspace directory names.
Replace all other characters with _.

10. Agent Runner Protocol (Coding Agent Integration)

This section defines the language-neutral contract for integrating a coding agent app-server.

Compatibility profile:

The normative contract is message ordering, required behaviors, and the logical fields that must be extracted (for example session IDs, completion state, approval handling, and usage/rate-limit telemetry).
Exact JSON field names may vary slightly across compatible app-server versions.
Implementations should tolerate equivalent payload shapes when they carry the same logical meaning, especially for nested IDs, approval requests, user-input-required signals, and token/rate-limit metadata.

10.1 Launch Contract

Subprocess launch parameters:

Command: codex.command
Invocation: bash -lc <codex.command>
Working directory: workspace path
Stdout/stderr: separate streams
Framing: line-delimited protocol messages on stdout (JSON-RPC-like JSON per line)

Notes:

The default command is codex app-server.
Approval policy, cwd, and prompt are expressed in the protocol messages in Section 10.2.

Recommended additional process settings:

Max line size: 10 MB (for safe buffering)

10.2 Session Startup Handshake

Reference: https://developers.openai.com/codex/app-server/

The client must send these protocol messages in order:

Illustrative startup transcript (equivalent payload shapes are acceptable if they preserve the same semantics):

{"id":1,"method":"initialize","params":{"clientInfo":{"name":"symphony","version":"1.0"},"capabilities":{}}}
{"method":"initialized","params":{}}
{"id":2,"method":"thread/start","params":{"approvalPolicy":"<implementation-defined>","sandbox":"<implementation-defined>","cwd":"/abs/workspace"}}
{"id":3,"method":"turn/start","params":{"threadId":"<thread-id>","input":[{"type":"text","text":"<rendered prompt-or-continuation-guidance>"}],"cwd":"/abs/workspace","title":"ABC-123: Example","approvalPolicy":"<implementation-defined>","sandboxPolicy":{"type":"<implementation-defined>"}}}

initialize request
- Params include:
  - clientInfo object (for example {name, version})
  - capabilities object (may be empty)
- If the targeted Codex app-server requires capability negotiation for dynamic tools, include the necessary capability flag(s) here.
- Wait for response (read_timeout_ms)
initialized notification
thread/start request
- Params include:
  - approvalPolicy = implementation-defined session approval policy value
  - sandbox = implementation-defined session sandbox value
  - cwd = absolute workspace path
  - If optional client-side tools are implemented, include their advertised tool specs using the protocol mechanism supported by the targeted Codex app-server version.
turn/start request
- Params include:
  - threadId
  - input = single text item containing rendered prompt for the first turn, or continuation guidance for later turns on the same thread
  - cwd
  - title = <issue.identifier>: <issue.title>
  - approvalPolicy = implementation-defined turn approval policy value
  - sandboxPolicy = implementation-defined object-form sandbox policy payload when required by the targeted app-server version

Session identifiers:

Read thread_id from the thread/start response at result.thread.id
Read turn_id from each turn/start response at result.turn.id
Emit session_id = "<thread_id>-<turn_id>"
Reuse the same thread_id for all continuation turns inside one worker run
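
As a rough sketch of the identifier rules above, assuming the responses have already been decoded into dictionaries with the illustrative shapes from this section (the function name is hypothetical, and a tolerant implementation would also accept equivalent payload shapes):

```python
def build_session_id(thread_start_response: dict, turn_start_response: dict) -> str:
    """Combine the thread and turn identifiers into "<thread_id>-<turn_id>"."""
    thread_id = thread_start_response["result"]["thread"]["id"]
    turn_id = turn_start_response["result"]["turn"]["id"]
    return f"{thread_id}-{turn_id}"
```

For continuation turns, only the turn/start response changes; the thread/start response from the first turn is reused.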

10.3 Streaming Turn Processing

The client reads line-delimited messages until the turn terminates.

Completion conditions:

turn/completed -> success
turn/failed -> failure
turn/cancelled -> failure
turn timeout (turn_timeout_ms) -> failure
subprocess exit -> failure

Continuation processing:

If the worker decides to continue after a successful turn, it should issue another turn/start on the same live threadId.
The app-server subprocess should remain alive across those continuation turns and be stopped only when the worker run is ending.

Line handling requirements:

Read protocol messages from stdout only.
Buffer partial stdout lines until newline arrives.
Attempt JSON parse on complete stdout lines.
Stderr is not part of the protocol stream:
- ignore it or log it as diagnostics
- do not attempt protocol JSON parsing on stderr
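
The stdout handling rules above might look like this in practice. This is a sketch under stated assumptions: the class name and the (kind, payload) tuple shape are hypothetical, blank lines are skipped by choice, and the 10 MB default mirrors the recommended max line size in Section 10.1.

```python
import json


class StdoutLineBuffer:
    """Accumulates stdout chunks and returns one parsed message per complete line."""

    def __init__(self, max_line_bytes: int = 10 * 1024 * 1024):
        self._pending = b""
        self._max_line_bytes = max_line_bytes

    def feed(self, chunk: bytes) -> list:
        """Return (kind, payload) tuples for every line completed by this chunk."""
        self._pending += chunk
        events = []
        while b"\n" in self._pending:
            line, self._pending = self._pending.split(b"\n", 1)
            if not line.strip():
                continue  # assumption: blank lines carry no protocol meaning
            try:
                events.append(("message", json.loads(line)))
            except ValueError:
                # Covers both invalid JSON and invalid UTF-8 on the line.
                events.append(("malformed", line.decode("utf-8", "replace")))
        # Whatever remains is a partial line; refuse to buffer it without bound.
        if len(self._pending) > self._max_line_bytes:
            raise ValueError("stdout line exceeds max_line_bytes")
        return events
```

Stderr would be read by a separate consumer that only logs, so it never reaches this buffer.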

10.4 Emitted Runtime Events (Upstream to Orchestrator)

The app-server client emits structured events to the orchestrator callback. Each event should include:

event (enum/string)
timestamp (UTC timestamp)
codex_app_server_pid (if available)
optional usage map (token counts)
payload fields as needed

Important emitted events may include:

session_started
startup_failed
turn_completed
turn_failed
turn_cancelled
turn_ended_with_error
turn_input_required
approval_auto_approved
unsupported_tool_call
notification
other_message
malformed

10.5 Approval, Tool Calls, and User Input Policy

Approval, sandbox, and user-input behavior is implementation-defined.

Policy requirements:

Each implementation should document its chosen approval, sandbox, and operator-confirmation posture.
Approval requests and user-input-required events must not leave a run stalled indefinitely. An implementation should either satisfy them, surface them to an operator, auto-resolve them, or fail the run according to its documented policy.

Example high-trust behavior:

Auto-approve command execution approvals for the session.
Auto-approve file-change approvals for the session.
Treat user-input-required turns as hard failure.

Unsupported dynamic tool calls:

Supported dynamic tool calls that are explicitly implemented and advertised by the runtime should be handled according to their extension contract.
If the agent requests a dynamic tool call (item/tool/call) that is not supported, return a tool failure response and continue the session.
This prevents the session from stalling on unsupported tool execution paths.

Optional client-side tool extension:

An implementation may expose a limited set of client-side tools to the app-server session.
Current optional standardized tool: linear_graphql.
If implemented, supported tools should be advertised to the app-server session during startup using the protocol mechanism supported by the targeted Codex app-server version.
Unsupported tool names should still return a failure result and continue the session.

linear_graphql extension contract:

Purpose: execute a raw GraphQL query or mutation against Linear using Symphony's configured tracker auth for the current session.
Availability: only meaningful when tracker.kind == "linear" and valid Linear auth is configured.

Preferred input shape:

{
  "query": "single GraphQL query or mutation document",
  "variables": {
    "optional": "graphql variables object"
  }
}

query must be a non-empty string.
query must contain exactly one GraphQL operation.
variables is optional and, when present, must be a JSON object.
Implementations may additionally accept a raw GraphQL query string as shorthand input.
Execute one GraphQL operation per tool call.
If the provided document contains multiple operations, reject the tool call as invalid input.
operationName selection is intentionally out of scope for this extension.
Reuse the configured Linear endpoint and auth from the active Symphony workflow/runtime config; do not require the coding agent to read raw tokens from disk.
Tool result semantics:
- transport success + no top-level GraphQL errors -> success=true
- top-level GraphQL errors present -> success=false, but preserve the GraphQL response body for debugging
- invalid input, missing auth, or transport failure -> success=false with an error payload
Return the GraphQL response or error payload as structured tool output that the model can inspect in-session.
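
The tool result semantics above can be sketched as a small mapping function. The function name and the `response` and `error` field names are assumptions for illustration; input validation (including the single-operation check) and the HTTP call itself are out of scope here.

```python
from typing import Optional


def linear_graphql_result(body: Optional[dict], error: Optional[str] = None) -> dict:
    """Map the outcome of one GraphQL call to a structured tool result.

    `body` is the decoded GraphQL response, or None when no response was
    obtained. `error` is a short reason for invalid input, missing auth,
    or transport failure.
    """
    if error is not None or body is None:
        return {"success": False, "error": error or "transport_failure"}
    if body.get("errors"):
        # Preserve the full body so the model can inspect the GraphQL errors.
        return {"success": False, "response": body}
    return {"success": True, "response": body}
```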

Illustrative responses (equivalent payload shapes are acceptable if they preserve the same outcome):

{"id":"<approval-id>","result":{"approved":true}}
{"id":"<tool-call-id>","result":{"success":false,"error":"unsupported_tool_call"}}

Hard failure on user input requirement:

If the agent requests user input, fail the run attempt immediately.
The client detects this via:
- explicit method (item/tool/requestUserInput), or
- turn methods/flags indicating input is required.

10.6 Timeouts and Error Mapping

Timeouts:

codex.read_timeout_ms: request/response timeout during startup and sync requests
codex.turn_timeout_ms: total turn stream timeout
codex.stall_timeout_ms: enforced by orchestrator based on event inactivity

Error mapping (recommended normalized categories):

codex_not_found
invalid_workspace_cwd
response_timeout
turn_timeout
port_exit
response_error
turn_failed
turn_cancelled
turn_input_required

10.7 Agent Runner Contract

The Agent Runner wraps workspace + prompt + app-server client.

Behavior:

Create/reuse workspace for issue.
Build prompt from workflow template.
Start app-server session.
Forward app-server events to orchestrator.
On any error, fail the worker attempt (the orchestrator will retry).

Note:

Workspaces are intentionally preserved after successful runs.

11. Issue Tracker Integration Contract (Linear-Compatible)

11.1 Required Operations

An implementation must support these tracker adapter operations:

fetch_candidate_issues()
- Return issues in configured active states for a configured project.
fetch_issues_by_states(state_names)
- Used for startup terminal cleanup.
fetch_issue_states_by_ids(issue_ids)
- Used for active-run reconciliation.

11.2 Query Semantics (Linear)

Linear-specific requirements for tracker.kind == "linear":

tracker.kind == "linear"
GraphQL endpoint (default https://api.linear.app/graphql)
Auth token sent in Authorization header
tracker.project_slug maps to Linear project slugId
Candidate issue query filters project using project: { slugId: { eq: $projectSlug } }
Issue-state refresh query uses GraphQL issue IDs with variable type [ID!]
Pagination required for candidate issues
Page size default: 50
Network timeout: 30000 ms

Important:

Linear GraphQL schema details can drift. Keep query construction isolated and test the exact query fields/types required by this specification.

A non-Linear implementation may change transport details, but the normalized outputs must match the domain model in Section 4.

11.3 Normalization Rules

Candidate issue normalization should produce fields listed in Section 4.1.1.

Additional normalization details:

labels -> lowercase strings
blocked_by -> derived from inverse relations where relation type is blocks
priority -> integer only (non-integers become null)
created_at and updated_at -> parse ISO-8601 timestamps
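
Three of the normalization rules above can be sketched as follows. The helper names are hypothetical, and blocked_by derivation is omitted because it depends on the relation payload shape returned by the tracker.

```python
from datetime import datetime
from typing import Any, Optional


def normalize_labels(labels: list) -> list:
    """Lowercase every label name."""
    return [str(label).lower() for label in labels]


def normalize_priority(value: Any) -> Optional[int]:
    """Keep integer priorities; anything else becomes None (null)."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    return None


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO-8601 timestamp, returning None for missing values."""
    if not value:
        return None
    # datetime.fromisoformat accepts a trailing "Z" only on Python 3.11+,
    # so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))
```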

11.4 Error Handling Contract

Recommended error categories:

unsupported_tracker_kind
missing_tracker_api_key
missing_tracker_project_slug
linear_api_request (transport failures)
linear_api_status (non-200 HTTP)
linear_graphql_errors
linear_unknown_payload
linear_missing_end_cursor (pagination integrity error)

Orchestrator behavior on tracker errors:

Candidate fetch failure: log and skip dispatch for this tick.
Running-state refresh failure: log and keep active workers running.
Startup terminal cleanup failure: log warning and continue startup.

11.5 Tracker Writes (Important Boundary)

Symphony does not require first-class tracker write APIs in the orchestrator.

Ticket mutations (state transitions, comments, PR metadata) are typically handled by the coding agent using tools defined by the workflow prompt.
The service remains a scheduler/runner and tracker reader.
Workflow-specific success often means "reached the next handoff state" (for example Human Review) rather than tracker terminal state Done.
If the optional linear_graphql client-side tool extension is implemented, it is still part of the agent toolchain rather than orchestrator business logic.

12. Prompt Construction and Context Assembly

12.1 Inputs

Inputs to prompt rendering:

workflow.prompt_template
normalized issue object
optional attempt integer (retry/continuation metadata)

12.2 Rendering Rules

Render with strict variable checking.
Render with strict filter checking.
Convert issue object keys to strings for template compatibility.
Preserve nested arrays/maps (labels, blockers) so templates can iterate.

12.3 Retry/Continuation Semantics

attempt should be passed to the template because the workflow prompt may provide different instructions for:

first run (attempt null or absent)
continuation run after a successful prior session
retry after error/timeout/stall

12.4 Failure Semantics

If prompt rendering fails:

Fail the run attempt immediately.
Let the orchestrator treat it like any other worker failure and decide retry behavior.

13. Logging, Status, and Observability

13.1 Logging Conventions

Required context fields for issue-related logs:

issue_id
issue_identifier

Required context for coding-agent session lifecycle logs:

session_id

Message formatting requirements:

Use stable key=value phrasing.
Include action outcome (completed, failed, retrying, etc.).
Include concise failure reason when present.
Avoid logging large raw payloads unless necessary.
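
One possible way to produce stable key=value messages is sketched below. The function name, the `action` and `outcome` keys, and the quoting rule for values containing spaces are assumptions, not requirements of this specification.

```python
def format_log(action: str, outcome: str, **context) -> str:
    """Render a log message as space-separated key=value pairs in a stable order."""
    fields = {"action": action, "outcome": outcome, **context}
    parts = []
    for key, value in fields.items():
        if value is None:
            continue  # omit absent context, for example no failure reason
        text = str(value)
        if " " in text or "=" in text or '"' in text:
            # Quote values that would otherwise break key=value parsing.
            text = '"' + text.replace('"', '\\"') + '"'
        parts.append(f"{key}={text}")
    return " ".join(parts)
```

For example, `format_log("dispatch", "failed", issue_id="abc123", issue_identifier="MT-649", reason="turn timeout")` keeps the required issue context fields next to the outcome and a concise failure reason.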

13.2 Logging Outputs and Sinks

The spec does not prescribe where logs must go (stderr, file, remote sink, etc.).

Requirements:

Operators must be able to see startup/validation/dispatch failures without attaching a debugger.
Implementations may write to one or more sinks.
If a configured log sink fails, the service should continue running when possible and emit an operator-visible warning through any remaining sink.

13.3 Runtime Snapshot / Monitoring Interface (Optional but Recommended)

If the implementation exposes a synchronous runtime snapshot (for dashboards or monitoring), it should return:

running (list of running session rows)
each running row should include turn_count
retrying (list of retry queue rows)
codex_totals
- input_tokens
- output_tokens
- total_tokens
- seconds_running (aggregate runtime seconds as of snapshot time, including active sessions)
rate_limits (latest coding-agent rate limit payload, if available)

Recommended snapshot error modes:

timeout
unavailable

13.4 Optional Human-Readable Status Surface

A human-readable status surface (terminal output, dashboard, etc.) is optional and implementation-defined.

If present, it should draw from orchestrator state/metrics only and must not be required for correctness.

13.5 Session Metrics and Token Accounting

Token accounting rules:

Agent events may include token counts in multiple payload shapes.
Prefer absolute thread totals when available, such as:
- thread/tokenUsage/updated payloads
- total_token_usage within token-count wrapper events
Ignore delta-style payloads such as last_token_usage for dashboard/API totals.
Extract input/output/total token counts leniently from common field names within the selected payload.
For absolute totals, track deltas relative to last reported totals to avoid double-counting.
Do not treat generic usage maps as cumulative totals unless the event type defines them that way.
Accumulate aggregate totals in orchestrator state.
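
The delta-tracking rule for absolute totals can be illustrated with a small accumulator. This is a sketch, assuming the payload has already been reduced to the three canonical keys; the class and method names are hypothetical, and clamping negative deltas to zero is an assumption about how to treat a counter that resets.

```python
from dataclasses import dataclass, field

TOKEN_KEYS = ("input_tokens", "output_tokens", "total_tokens")


def _zero() -> dict:
    return {key: 0 for key in TOKEN_KEYS}


@dataclass
class TokenAccounting:
    aggregate: dict = field(default_factory=_zero)     # orchestrator-wide totals
    last_reported: dict = field(default_factory=dict)  # session key -> last absolute totals

    def apply_absolute(self, session_key: str, totals: dict) -> dict:
        """Fold one absolute thread-total report into the aggregate and return the delta."""
        previous = self.last_reported.get(session_key) or _zero()
        # A key missing from the report is treated as unchanged.
        current = {k: int(totals.get(k, previous[k])) for k in TOKEN_KEYS}
        # Clamp at zero so a counter that goes backwards never subtracts from the aggregate.
        delta = {k: max(current[k] - previous[k], 0) for k in TOKEN_KEYS}
        for k in TOKEN_KEYS:
            self.aggregate[k] += delta[k]
        self.last_reported[session_key] = current
        return delta
```

Reporting the same absolute totals twice yields a zero delta, which is what prevents double-counting in the aggregate.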

Runtime accounting:

Runtime should be reported as a live aggregate at snapshot/render time.
Implementations may maintain a cumulative counter for ended sessions and add active-session elapsed time derived from running entries (for example started_at) when producing a snapshot/status view.
Add run duration seconds to the cumulative ended-session runtime when a session ends (normal exit or cancellation/termination).
Continuous background ticking of runtime totals is not required.
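
A snapshot-time computation of seconds_running could look like the sketch below. The function name and argument shapes are hypothetical; the only inputs are the cumulative counter for ended sessions and the started_at values of the running entries.

```python
from datetime import datetime, timezone
from typing import Iterable


def seconds_running(ended_seconds: float, running_started_at: Iterable[datetime], now: datetime) -> float:
    """Cumulative ended-session runtime plus elapsed time of each active session."""
    active = sum(max((now - started).total_seconds(), 0.0) for started in running_started_at)
    return ended_seconds + active


# Example: 1000 s from ended sessions plus two active sessions.
now = datetime(2026, 2, 24, 20, 15, 30, tzinfo=timezone.utc)
active_starts = [
    datetime(2026, 2, 24, 20, 10, 30, tzinfo=timezone.utc),  # started 300 s ago
    datetime(2026, 2, 24, 20, 15, 0, tzinfo=timezone.utc),   # started 30 s ago
]
snapshot_value = seconds_running(1000.0, active_starts, now)  # 1330.0
```

Because the active portion is derived on demand, no background timer is needed to keep the total current.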

Rate-limit tracking:

Track the latest rate-limit payload seen in any agent update.
Any human-readable presentation of rate-limit data is implementation-defined.

13.6 Humanized Agent Event Summaries (Optional)

Humanized summaries of raw agent protocol events are optional.

If implemented:

Treat them as observability-only output.
Do not make orchestrator logic depend on humanized strings.

13.7 Optional HTTP Server Extension

This section defines an optional HTTP interface for observability and operational control.

If implemented:

The HTTP server is an extension and is not required for conformance.
The implementation may serve server-rendered HTML or a client-side application for the dashboard.
The dashboard/API must be observability/control surfaces only and must not become required for orchestrator correctness.

Enablement (extension):

Start the HTTP server when a CLI --port argument is provided.
Start the HTTP server when server.port is present in WORKFLOW.md front matter.
server.port is extension configuration and is intentionally not part of the core front-matter schema in Section 5.3.
Precedence: CLI --port overrides server.port when both are present.
server.port must be an integer. Positive values bind that port. 0 may be used to request an ephemeral port for local development and tests.
Implementations should bind loopback by default (127.0.0.1 or host equivalent) unless explicitly configured otherwise.
Changes to HTTP listener settings (for example server.port) do not need to hot-rebind; restart-required behavior is conformant.
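
The enablement and precedence rules above can be sketched as a single resolution step. The function name is hypothetical, and rejecting negative values is an assumption (the rules above only describe positive values and 0).

```python
from typing import Optional


def resolve_http_port(cli_port: Optional[int], workflow_port: object) -> Optional[int]:
    """Return the port to bind, or None when the HTTP server stays disabled."""
    # CLI --port takes precedence over server.port from WORKFLOW.md.
    chosen = cli_port if cli_port is not None else workflow_port
    if chosen is None:
        return None
    if isinstance(chosen, bool) or not isinstance(chosen, int) or chosen < 0:
        raise ValueError(f"server.port must be a non-negative integer, got {chosen!r}")
    return chosen  # 0 requests an ephemeral port
```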

13.7.1 Human-Readable Dashboard (`/`)

Host a human-readable dashboard at /.
The returned document should depict the current state of the system (for example active sessions, retry delays, token consumption, runtime totals, recent events, and health/error indicators).
It is up to the implementation whether this is server-generated HTML or a client-side app that consumes the JSON API below.

13.7.2 JSON REST API (`/api/v1/*`)

Provide a JSON REST API under /api/v1/* for current runtime state and operational debugging.

Minimum endpoints:

GET /api/v1/state

Returns a summary view of the current system state (running sessions, retry queue/delays, aggregate token/runtime totals, latest rate limits, and any additional tracked summary fields).

Suggested response shape:

{
  "generated_at": "2026-02-24T20:15:30Z",
  "counts": {
    "running": 2,
    "retrying": 1
  },
  "running": [
    {
      "issue_id": "abc123",
      "issue_identifier": "MT-649",
      "state": "In Progress",
      "session_id": "thread-1-turn-1",
      "turn_count": 7,
      "last_event": "turn_completed",
      "last_message": "",
      "started_at": "2026-02-24T20:10:12Z",
      "last_event_at": "2026-02-24T20:14:59Z",
      "tokens": {
        "input_tokens": 1200,
        "output_tokens": 800,
        "total_tokens": 2000
      }
    }
  ],
  "retrying": [
    {
      "issue_id": "def456",
      "issue_identifier": "MT-650",
      "attempt": 3,
      "due_at": "2026-02-24T20:16:00Z",
      "error": "no available orchestrator slots"
    }
  ],
  "codex_totals": {
    "input_tokens": 5000,
    "output_tokens": 2400,
    "total_tokens": 7400,
    "seconds_running": 1834.2
  },
  "rate_limits": null
}

GET /api/v1/<issue_identifier>

Returns issue-specific runtime/debug details for the identified issue, including any information the implementation tracks that is useful for debugging.

Suggested response shape:

{
  "issue_identifier": "MT-649",
  "issue_id": "abc123",
  "status": "running",
  "workspace": {
    "path": "/tmp/symphony_workspaces/MT-649"
  },
  "attempts": {
    "restart_count": 1,
    "current_retry_attempt": 2
  },
  "running": {
    "session_id": "thread-1-turn-1",
    "turn_count": 7,
    "state": "In Progress",
    "started_at": "2026-02-24T20:10:12Z",
    "last_event": "notification",
    "last_message": "Working on tests",
    "last_event_at": "2026-02-24T20:14:59Z",
    "tokens": {
      "input_tokens": 1200,
      "output_tokens": 800,
      "total_tokens": 2000
    }
  },
  "retry": null,
  "logs": {
    "codex_session_logs": [
      {
        "label": "latest",
        "path": "/var/log/symphony/codex/MT-649/latest.log",
        "url": null
      }
    ]
  },
  "recent_events": [
    {
      "at": "2026-02-24T20:14:59Z",
      "event": "notification",
      "message": "Working on tests"
    }
  ],
  "last_error": null,
  "tracked": {}
}

If the issue is unknown to the current in-memory state, return 404 with an error response (for example {"error":{"code":"issue_not_found","message":"..."}}).

POST /api/v1/refresh
- Queues an immediate tracker poll + reconciliation cycle (best-effort trigger; implementations may coalesce repeated requests).
- Suggested request body: empty body or {}.
- Suggested response (202 Accepted) shape:
```
{
  "queued": true,
  "coalesced": false,
  "requested_at": "2026-02-24T20:15:30Z",
  "operations": ["poll", "reconcile"]
}
```

API design notes:

The JSON shapes above are the recommended baseline for interoperability and debugging ergonomics.
Implementations may add fields, but should avoid breaking existing fields within a version.
Endpoints should be read-only except for operational triggers like /refresh.
Unsupported methods on defined routes should return 405 Method Not Allowed.
API errors should use a JSON envelope such as {"error":{"code":"...","message":"..."}}.
If the dashboard is a client-side app, it should consume this API rather than duplicating state logic.

14. Failure Model and Recovery Strategy

14.1 Failure Classes

Workflow/Config Failures
- Missing WORKFLOW.md
- Invalid YAML front matter
- Unsupported tracker kind or missing tracker credentials/project slug
- Missing coding-agent executable
Workspace Failures
- Workspace directory creation failure
- Workspace population/synchronization failure (implementation-defined; may come from hooks)
- Invalid workspace path configuration
- Hook timeout/failure
Agent Session Failures
- Startup handshake failure
- Turn failed/cancelled
- Turn timeout
- User input requested (hard fail)
- Subprocess exit
- Stalled session (no activity)
Tracker Failures
- API transport errors
- Non-200 status
- GraphQL errors
- Malformed payloads
Observability Failures
- Snapshot timeout
- Dashboard render errors
- Log sink configuration failure

14.2 Recovery Behavior

Dispatch validation failures:
- Skip new dispatches.
- Keep service alive.
- Continue reconciliation where possible.
Worker failures:
- Convert to retries with exponential backoff.
Tracker candidate-fetch failures:
- Skip this tick.
- Try again on next tick.
Reconciliation state-refresh failures:
- Keep current workers.
- Retry on next tick.
Dashboard/log failures:
- Do not crash the orchestrator.

14.3 Partial State Recovery (Restart)

Current design is intentionally in-memory for scheduler state.

After restart:

No retry timers are restored from prior process memory.
No running sessions are assumed recoverable.
Service recovers by:
- startup terminal workspace cleanup
- fresh polling of active issues
- re-dispatching eligible work

14.4 Operator Intervention Points

Operators can control behavior by:

Editing WORKFLOW.md (prompt and most runtime settings).
WORKFLOW.md changes should be detected and re-applied automatically without restart.
Changing issue states in the tracker:
- terminal state -> running session is stopped and workspace cleaned when reconciled
- non-active state -> running session is stopped without cleanup
Restarting the service for process recovery or deployment (not as the normal path for applying workflow config changes).

15. Security and Operational Safety

15.1 Trust Boundary Assumption

Each implementation defines its own trust boundary.

Operational safety requirements:

Implementations should state clearly whether they are intended for trusted environments, more restrictive environments, or both.
Implementations should state clearly whether they rely on auto-approved actions, operator approvals, stricter sandboxing, or some combination of those controls.
Workspace isolation and path validation are important baseline controls, but they are not a substitute for whatever approval and sandbox policy an implementation chooses.

15.2 Filesystem Safety Requirements

Mandatory:

Workspace path must remain under configured workspace root.
Coding-agent cwd must be the per-issue workspace path for the current run.
Workspace directory names must use sanitized identifiers.

Recommended additional hardening for ports:

Run under a dedicated OS user.
Restrict workspace root permissions.
Mount workspace root on a dedicated volume if possible.

15.3 Secret Handling

Support $VAR indirection in workflow config.
Do not log API tokens or secret env values.
Validate presence of secrets without printing them.
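
A minimal sketch of `$VAR` indirection with log-safe presence checks follows. The helper names are hypothetical, and treating an empty resolved value as missing is an assumption; the exact `$VAR` syntax is whatever the configuration section of this specification defines.

```python
import os
from typing import Mapping, Optional


def resolve_secret(value: Optional[str], env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Resolve a `$VAR` reference against env; pass literal values through."""
    if value and value.startswith("$"):
        resolved = env.get(value[1:], "")
        return resolved or None  # assumption: empty counts as missing
    return value or None


def describe_secret(name: str, value: Optional[str]) -> str:
    """Log-safe presence check that never includes the secret itself."""
    return f"{name}={'present' if value else 'missing'}"
```

Validation code would log only the output of `describe_secret`, so a misconfigured key shows up as `missing` without any token material reaching a log sink.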

15.4 Hook Script Safety

Workspace hooks are arbitrary shell scripts from WORKFLOW.md.

Implications:

Hooks are fully trusted configuration.
Hooks run inside the workspace directory.
Hook output should be truncated in logs.
Hook timeouts are required to avoid hanging the orchestrator.

15.5 Harness Hardening Guidance

Running Codex agents against repositories, issue trackers, and other inputs that may contain sensitive data or externally controlled content can be dangerous. A permissive deployment can lead to data leaks, destructive mutations, or full machine compromise if the agent is induced to execute harmful commands or use overly powerful integrations.

Implementations should explicitly evaluate their own risk profile and harden the execution harness where appropriate. This specification intentionally does not mandate a single hardening posture, but ports should not assume that tracker data, repository contents, prompt inputs, or tool arguments are fully trustworthy just because they originate inside a normal workflow.

Possible hardening measures include:

Tightening Codex approval and sandbox settings described elsewhere in this specification instead of running with a maximally permissive configuration.
Adding external isolation layers such as OS/container/VM sandboxing, network restrictions, or separate credentials beyond the built-in Codex policy controls.
Filtering which Linear issues, projects, teams, labels, or other tracker sources are eligible for dispatch so untrusted or out-of-scope tasks do not automatically reach the agent.
Narrowing the optional linear_graphql tool so it can only read or mutate data inside the intended project scope, rather than exposing general workspace-wide tracker access.
Reducing the set of client-side tools, credentials, filesystem paths, and network destinations available to the agent to the minimum needed for the workflow.

The correct controls are deployment-specific, but implementations should document them clearly and treat harness hardening as part of the core safety model rather than an optional afterthought.

16. Reference Algorithms (Language-Agnostic)

16.1 Service Startup

function start_service():
  configure_logging()
  start_observability_outputs()
  start_workflow_watch(on_change=reload_and_reapply_workflow)

  state = {
    poll_interval_ms: get_config_poll_interval_ms(),
    max_concurrent_agents: get_config_max_concurrent_agents(),
    running: {},
    claimed: set(),
    retry_attempts: {},
    completed: set(),
    codex_totals: {input_tokens: 0, output_tokens: 0, total_tokens: 0, seconds_running: 0},
    codex_rate_limits: null
  }

  validation = validate_dispatch_config()
  if validation is not ok:
    log_validation_error(validation)
    fail_startup(validation)

  startup_terminal_workspace_cleanup()
  schedule_tick(delay_ms=0)

  event_loop(state)

16.2 Poll-and-Dispatch Tick

on_tick(state):
  state = reconcile_running_issues(state)

  validation = validate_dispatch_config()
  if validation is not ok:
    log_validation_error(validation)
    notify_observers()
    schedule_tick(state.poll_interval_ms)
    return state

  issues = tracker.fetch_candidate_issues()
  if issues failed:
    log_tracker_error()
    notify_observers()
    schedule_tick(state.poll_interval_ms)
    return state

  for issue in sort_for_dispatch(issues):
    if no_available_slots(state):
      break

    if should_dispatch(issue, state):
      state = dispatch_issue(issue, state, attempt=null)

  notify_observers()
  schedule_tick(state.poll_interval_ms)
  return state

16.3 Reconcile Active Runs

function reconcile_running_issues(state):
  state = reconcile_stalled_runs(state)

  running_ids = keys(state.running)
  if running_ids is empty:
    return state

  refreshed = tracker.fetch_issue_states_by_ids(running_ids)
  if refreshed failed:
    log_debug("keep workers running")
    return state

  for issue in refreshed:
    if issue.state in terminal_states:
      state = terminate_running_issue(state, issue.id, cleanup_workspace=true)
    else if issue.state in active_states:
      state.running[issue.id].issue = issue
    else:
      state = terminate_running_issue(state, issue.id, cleanup_workspace=false)

  return state

16.4 Dispatch One Issue

function dispatch_issue(issue, state, attempt):
  worker = spawn_worker(
    fn -> run_agent_attempt(issue, attempt, parent_orchestrator_pid) end
  )

  if worker spawn failed:
    return schedule_retry(state, issue.id, next_attempt(attempt), {
      identifier: issue.identifier,
      error: "failed to spawn agent"
    })

  state.running[issue.id] = {
    worker_handle,
    monitor_handle,
    identifier: issue.identifier,
    issue,
    session_id: null,
    codex_app_server_pid: null,
    last_codex_message: null,
    last_codex_event: null,
    last_codex_timestamp: null,
    codex_input_tokens: 0,
    codex_output_tokens: 0,
    codex_total_tokens: 0,
    last_reported_input_tokens: 0,
    last_reported_output_tokens: 0,
    last_reported_total_tokens: 0,
    retry_attempt: normalize_attempt(attempt),
    started_at: now_utc()
  }

  state.claimed.add(issue.id)
  state.retry_attempts.remove(issue.id)
  return state

16.5 Worker Attempt (Workspace + Prompt + Agent)

function run_agent_attempt(issue, attempt, orchestrator_channel):
  workspace = workspace_manager.create_for_issue(issue.identifier)
  if workspace failed:
    fail_worker("workspace error")

  if run_hook("before_run", workspace.path) failed:
    fail_worker("before_run hook error")

  session = app_server.start_session(workspace=workspace.path)
  if session failed:
    run_hook_best_effort("after_run", workspace.path)
    fail_worker("agent session startup error")

  max_turns = config.agent.max_turns
  turn_number = 1

  while true:
    prompt = build_turn_prompt(workflow_template, issue, attempt, turn_number, max_turns)
    if prompt failed:
      app_server.stop_session(session)
      run_hook_best_effort("after_run", workspace.path)
      fail_worker("prompt error")

    turn_result = app_server.run_turn(
      session=session,
      prompt=prompt,
      issue=issue,
      on_message=(msg) -> send(orchestrator_channel, {codex_update, issue.id, msg})
    )

    if turn_result failed:
      app_server.stop_session(session)
      run_hook_best_effort("after_run", workspace.path)
      fail_worker("agent turn error")

    refreshed_issue = tracker.fetch_issue_states_by_ids([issue.id])
    if refreshed_issue failed:
      app_server.stop_session(session)
      run_hook_best_effort("after_run", workspace.path)
      fail_worker("issue state refresh error")

    issue = refreshed_issue[0] or issue

    if issue.state is not active:
      break

    if turn_number >= max_turns:
      break

    turn_number = turn_number + 1

  app_server.stop_session(session)
  run_hook_best_effort("after_run", workspace.path)

  exit_normal()

16.6 Worker Exit and Retry Handling

on_worker_exit(issue_id, reason, state):
  running_entry = state.running.remove(issue_id)
  state = add_runtime_seconds_to_totals(state, running_entry)

  if reason == normal:
    state.completed.add(issue_id)  # bookkeeping only
    state = schedule_retry(state, issue_id, 1, {
      identifier: running_entry.identifier,
      delay_type: continuation
    })
  else:
    state = schedule_retry(state, issue_id, next_attempt_from(running_entry), {
      identifier: running_entry.identifier,
      error: format("worker exited: %reason")
    })

  notify_observers()
  return state

on_retry_timer(issue_id, state):
  retry_entry = state.retry_attempts.pop(issue_id)
  if retry_entry is missing:
    return state

  candidates = tracker.fetch_candidate_issues()
  if fetch failed:
    return schedule_retry(state, issue_id, retry_entry.attempt + 1, {
      identifier: retry_entry.identifier,
      error: "retry poll failed"
    })

  issue = find_by_id(candidates, issue_id)
  if issue is null:
    state.claimed.remove(issue_id)
    return state

  if available_slots(state) == 0:
    return schedule_retry(state, issue_id, retry_entry.attempt + 1, {
      identifier: issue.identifier,
      error: "no available orchestrator slots"
    })

  return dispatch_issue(issue, state, attempt=retry_entry.attempt)

17. Test and Validation Matrix

A conforming implementation should include tests that cover the behaviors defined in this specification.

Validation profiles:

Core Conformance: deterministic tests required for all conforming implementations.
Extension Conformance: required only for optional features that an implementation chooses to ship.
Real Integration Profile: environment-dependent smoke/integration checks recommended before production use.

Unless otherwise noted, Sections 17.1 through 17.7 are Core Conformance. Bullets that begin with If ... is implemented are Extension Conformance.

17.1 Workflow and Config Parsing

Workflow file path precedence:
- explicit runtime path is used when provided
- defaults to WORKFLOW.md in the current working directory when no explicit runtime path is provided
Workflow file changes are detected and trigger re-read/re-apply without restart
Invalid workflow reload keeps last known good effective configuration and emits an operator-visible error
Missing WORKFLOW.md returns typed error
Invalid YAML front matter returns typed error
Front matter non-map returns typed error
Config defaults apply when optional values are missing
tracker.kind validation enforces currently supported kind (linear)
tracker.api_key is read correctly from both a literal value and $VAR indirection
$VAR resolution works for tracker API key and path values
~ path expansion works
codex.command is preserved as a shell command string
Per-state concurrency override map normalizes state names and ignores invalid values
Prompt template renders issue and attempt
Prompt rendering fails on unknown variables (strict mode)
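
The $VAR resolution and ~ expansion checks above can be exercised with a small helper like the one below. This is a minimal sketch, not the specified resolution algorithm: the function names, the explicit `home` parameter, and the choice to return None for an unset variable are all assumptions made for illustration, as is the WS_ROOT variable name used in the usage note.

```python
from typing import Mapping, Optional


def resolve_env_ref(raw: str, env: Mapping[str, str]) -> Optional[str]:
    """Return a literal value as-is, or look up `$NAME` in env.

    Assumption: an unset variable yields None so the caller can treat
    the key as missing and raise its own typed error.
    """
    if raw.startswith("$") and len(raw) > 1:
        return env.get(raw[1:])
    return raw


def resolve_path(raw: str, env: Mapping[str, str], home: str) -> Optional[str]:
    """Apply `$VAR` resolution first, then expand a leading `~`.

    `home` is passed explicitly (hypothetical parameter) so tests stay
    deterministic and do not depend on the host environment.
    """
    value = resolve_env_ref(raw, env)
    if value is None:
        return None
    if value == "~" or value.startswith("~/"):
        return home + value[1:]
    return value
```

A test would typically pass a fixed mapping such as {"LINEAR_API_KEY": "secret", "WS_ROOT": "~/workspaces"} and assert on the resolved strings, rather than reading the real process environment.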

17.2 Workspace Manager and Safety

Deterministic workspace path per issue identifier
Missing workspace directory is created
Existing workspace directory is reused
Existing non-directory path at workspace location is handled safely (replace or fail per implementation policy)
Optional workspace population/synchronization errors are surfaced
Temporary artifacts (tmp, .elixir_ls) are removed during prep
after_create hook runs only on new workspace creation
before_run hook runs before each attempt; failures/timeouts abort the current attempt
after_run hook runs after each attempt; failures/timeouts are logged and ignored
before_remove hook runs on cleanup and failures/timeouts are ignored
Workspace path sanitization and root containment invariants are enforced before agent launch
Agent launch uses the per-issue workspace path as cwd and rejects out-of-root paths
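
The sanitization and root-containment checks above could be tested against a helper along these lines. It is a sketch under stated assumptions: the allowed character set ([A-Za-z0-9._-], everything else replaced with "_") and both function names are illustrative choices, not quoted from this section.

```python
import re
from pathlib import Path


def sanitize_identifier(identifier: str) -> str:
    """Replace characters outside an assumed safe set with '_'."""
    return re.sub(r"[^A-Za-z0-9._-]", "_", identifier)


def workspace_path(root: Path, identifier: str) -> Path:
    """Build the per-issue workspace path and enforce root containment.

    Raises ValueError when the resolved path is the root itself or
    escapes it (for example, an identifier of "." or "..").
    """
    resolved_root = root.resolve()
    candidate = (resolved_root / sanitize_identifier(identifier)).resolve()
    if resolved_root not in candidate.parents:
        raise ValueError(f"workspace path escapes root: {candidate}")
    return candidate
```

Because the same identifier always maps to the same sanitized name, the helper also gives the deterministic path required by the first check in this section. The containment check runs on the resolved path, so it still rejects identifiers such as ".." that survive sanitization.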

17.3 Issue Tracker Client

Candidate issue fetch uses active states and project slug
Linear query uses the specified project filter field (slugId)
Empty fetch_issues_by_states([]) returns empty without API call
Pagination preserves order across multiple pages
Blockers are normalized from inverse relations of type blocks
Labels are normalized to lowercase
Issue state refresh by ID returns minimal normalized issues
Issue state refresh query uses GraphQL ID typing ([ID!]) as specified in Section 11.2
Error mapping for request errors, non-200, GraphQL errors, malformed payloads

17.4 Orchestrator Dispatch, Reconciliation, and Retry

Dispatch sort order is priority then oldest creation time
Todo issue with non-terminal blockers is not eligible
Todo issue with terminal blockers is eligible
Active-state issue refresh updates running entry state
Non-active state stops running agent without workspace cleanup
Terminal state stops running agent and cleans workspace
Reconciliation with no running issues is a no-op
Normal worker exit schedules a short continuation retry (attempt 1)
Abnormal worker exit increments retries with 10s-based exponential backoff
Retry backoff cap uses configured agent.max_retry_backoff_ms
Retry queue entries include attempt, due time, identifier, and error
Stall detection kills stalled sessions and schedules retry
Slot exhaustion requeues retries with explicit error reason
If a snapshot API is implemented, it returns running rows, retry rows, token totals, and rate limits
If a snapshot API is implemented, timeout/unavailable cases are surfaced
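
The backoff checks above lend themselves to a pure function that is easy to test deterministically. The sketch below assumes the delay doubles per attempt starting from 10 s at attempt 1 and is capped by agent.max_retry_backoff_ms (default 5 minutes); the exact exponent convention and the function name are assumptions for illustration.

```python
def failure_backoff_ms(
    attempt: int,
    max_retry_backoff_ms: int = 300_000,  # 5 minutes, the documented default cap
    base_ms: int = 10_000,                # "10s-based" backoff
) -> int:
    """Return the retry delay for an abnormal worker exit.

    Assumed formula: min(base_ms * 2 ** (attempt - 1), max_retry_backoff_ms).
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    return min(base_ms * 2 ** (attempt - 1), max_retry_backoff_ms)
```

Under these assumptions, attempts 1 through 5 yield 10 s, 20 s, 40 s, 80 s, and 160 s, and attempt 6 onward is held at the 300 s cap. Continuation retries after a normal exit use a separate short delay and are not modeled here.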

17.5 Coding-Agent App-Server Client

Launch command uses workspace cwd and invokes bash -lc <codex.command>
Startup handshake sends initialize, initialized, thread/start, turn/start
initialize includes client identity/capabilities payload required by the targeted Codex app-server protocol
Policy-related startup payloads use the implementation's documented approval/sandbox settings
thread/start and turn/start parse nested IDs and emit session_started
Request/response read timeout is enforced
Turn timeout is enforced
Partial JSON lines are buffered until newline
Stdout and stderr are handled separately; protocol JSON is parsed from stdout only
Non-JSON stderr lines are logged but do not crash parsing
Command/file-change approvals are handled according to the implementation's documented policy
Unsupported dynamic tool calls are rejected without stalling the session
User input requests are handled according to the implementation's documented policy and do not stall indefinitely
Usage and rate-limit payloads are extracted from nested payload shapes
Compatible payload variants for approvals, user-input-required signals, and usage/rate-limit telemetry are accepted when they preserve the same logical meaning
If optional client-side tools are implemented, the startup handshake advertises the supported tool specs required for discovery by the targeted app-server version
If the optional linear_graphql client-side tool extension is implemented:
- the tool is advertised to the session
- valid query / variables inputs execute against configured Linear auth
- top-level GraphQL errors produce success=false while preserving the GraphQL body
- invalid arguments, missing auth, and transport failures return structured failure payloads
- unsupported tool names still fail without stalling the session
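
The line-buffering checks above (partial JSON lines held until a newline arrives, non-JSON lines not crashing the parser) can be illustrated with a small stdout reader. This is a sketch only: the class name is hypothetical, and a real client would log skipped lines rather than drop them silently.

```python
import json
from typing import Any, Dict, List


class JsonLineBuffer:
    """Accumulate stdout chunks and yield one parsed object per complete line."""

    def __init__(self) -> None:
        self._pending = b""

    def feed(self, chunk: bytes) -> List[Dict[str, Any]]:
        """Append a chunk and return every message completed by it."""
        self._pending += chunk
        # Everything before the last newline is complete; the tail stays buffered.
        *complete, self._pending = self._pending.split(b"\n")
        messages: List[Dict[str, Any]] = []
        for line in complete:
            if not line.strip():
                continue
            try:
                parsed = json.loads(line)
            except ValueError:
                # Malformed JSON or invalid bytes: skip (a real client would log).
                continue
            if isinstance(parsed, dict):
                messages.append(parsed)
        return messages
```

Feeding the reader chunks that split a message mid-object is the core test case: the first call should return only the messages that were already newline-terminated, and the remainder should surface once the rest of the line arrives.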

17.6 Observability

Validation failures are operator-visible
Structured logging includes issue/session context fields
Logging sink failures do not crash orchestration
Token/rate-limit aggregation remains correct across repeated agent updates
If a human-readable status surface is implemented, it is driven from orchestrator state and does not affect correctness
If humanized event summaries are implemented, they cover key wrapper/agent event classes without changing orchestrator behavior

17.7 CLI and Host Lifecycle

CLI accepts an optional positional workflow path argument (path-to-WORKFLOW.md)
CLI uses ./WORKFLOW.md when no workflow path argument is provided
CLI errors on nonexistent explicit workflow path or missing default ./WORKFLOW.md
CLI surfaces startup failure cleanly
CLI exits with success when application starts and shuts down normally
CLI exits nonzero when startup fails or the host process exits abnormally
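
The workflow path rules above (optional positional argument, ./WORKFLOW.md default, error on a missing file) can be sketched with argparse. The program name, the function name, and the use of FileNotFoundError as the error type are assumptions for illustration; an implementation would map the failure to its own typed error and a nonzero exit code.

```python
import argparse
from pathlib import Path
from typing import List


def resolve_workflow_path(argv: List[str], cwd: Path) -> Path:
    """Pick the explicit workflow path when given, else cwd/WORKFLOW.md."""
    parser = argparse.ArgumentParser(prog="symphony")  # hypothetical program name
    parser.add_argument("workflow_path", nargs="?", default=None)
    args = parser.parse_args(argv)

    # Joining with cwd leaves absolute paths unchanged and anchors relative ones.
    path = cwd / (args.workflow_path or "WORKFLOW.md")
    if not path.is_file():
        raise FileNotFoundError(f"workflow file not found: {path}")
    return path
```

Passing `cwd` explicitly keeps the function testable against a temporary directory without changing the process working directory.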

17.8 Real Integration Profile (Recommended)

These checks are recommended for production readiness and may be skipped in CI when credentials, network access, or external service permissions are unavailable.

A real tracker smoke test can be run with valid credentials supplied by LINEAR_API_KEY or a documented local bootstrap mechanism (for example ~/.linear_api_key).
Real integration tests should use isolated test identifiers/workspaces and clean up tracker artifacts when practical.
A skipped real-integration test should be reported as skipped, not silently treated as passed.
If a real-integration profile is explicitly enabled in CI or release validation, failures should fail that job.

18. Implementation Checklist (Definition of Done)

Use the same validation profiles as Section 17:

Section 18.1 = Core Conformance
Section 18.2 = Extension Conformance
Section 18.3 = Real Integration Profile

18.1 Required for Conformance

Workflow path selection supports explicit runtime path and cwd default
WORKFLOW.md loader with YAML front matter + prompt body split
Typed config layer with defaults and $VAR resolution
Dynamic WORKFLOW.md watch/reload/re-apply for config and prompt
Polling orchestrator with single-authority mutable state
Issue tracker client with candidate fetch + state refresh + terminal fetch
Workspace manager with sanitized per-issue workspaces
Workspace lifecycle hooks (after_create, before_run, after_run, before_remove)
Hook timeout config (hooks.timeout_ms, default 60000)
Coding-agent app-server subprocess client with JSON line protocol
Codex launch command config (codex.command, default codex app-server)
Strict prompt rendering with issue and attempt variables
Exponential retry queue with continuation retries after normal exit
Configurable retry backoff cap (agent.max_retry_backoff_ms, default 5m)
Reconciliation that stops runs on terminal/non-active tracker states
Workspace cleanup for terminal issues (startup sweep + active transition)
Structured logs with issue_id, issue_identifier, and session_id
Operator-visible observability (structured logs; optional snapshot/status surface)

18.2 Recommended Extensions (Not Required for Conformance)

Optional HTTP server honors CLI --port over server.port, uses a safe default bind host, and exposes the baseline endpoints/error semantics in Section 13.7 if shipped.
Optional linear_graphql client-side tool extension exposes raw Linear GraphQL access through the app-server session using configured Symphony auth.
TODO: Persist retry queue and session metadata across process restarts.
TODO: Make observability settings configurable in workflow front matter without prescribing UI implementation details.
TODO: Add first-class tracker write APIs (comments/state transitions) in the orchestrator instead of only via agent tools.
TODO: Add pluggable issue tracker adapters beyond Linear.

18.3 Operational Validation Before Production (Recommended)

Run the Real Integration Profile from Section 17.8 with valid credentials and network access.
Verify hook execution and workflow path resolution on the target host OS/shell environment.
If the optional HTTP server is shipped, verify the configured port behavior and loopback/default bind expectations on the target environment.
