Merge branch 'master' into feat/intelligent-backend-selection

Merge the latest changes from master into PR #61.

Conflict resolution:
- dev-workflow/commands/dev.md: keep the multiSelect backend selection logic
- Keep the task type field type: default|ui|quick-fix
- Keep the backend routing strategy: default→codex, ui→gemini, quick-fix→claude
- Fix the heredoc example formatting

Changes merged from master include:
- codeagent-wrapper v5.4.0 structured execution report (#94)
- Fix for duplicate PATH entries (#95)
- ASCII mode and performance optimizations
- Other bug fixes and documentation updates

Generated with SWE-Agent.ai

Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>
fix(dev-workflow): refactor backend selection to multiSelect mode
2026-02-05 02:30:26 +08:00 · 2025-12-25 22:24:15 +08:00 · 2025-12-25 22:08:33 +08:00 · 2025-12-25 11:40:53 +08:00 · 2025-12-25 11:38:42 +08:00 · 2025-12-24 11:59:00 +08:00
27 changed files with 2330 additions and 287 deletions
--- a/.github/workflows/release.yml
+++ b/.github/workflows/release.yml
@@ -97,11 +97,6 @@ jobs:
        with:
          path: artifacts

-      - name: Setup Node.js
-        uses: actions/setup-node@v4
-        with:
-          node-version: '20'
-
      - name: Prepare release files
        run: |
          mkdir -p release
@@ -109,26 +104,10 @@ jobs:
          cp install.sh install.bat release/
          ls -la release/

-      - name: Generate release notes with git-cliff
-        run: |
-          # Install git-cliff via npx
-          npx git-cliff@latest --current --strip all -o release_notes.md
-
-          # Fallback if generation failed
-          if [ ! -s release_notes.md ]; then
-            echo "⚠️ Failed to generate release notes with git-cliff" > release_notes.md
-            echo "" >> release_notes.md
-            echo "## What's Changed" >> release_notes.md
-            echo "See commits in this release for details." >> release_notes.md
-          fi
-
-          echo "--- Generated Release Notes ---"
-          cat release_notes.md
-
      - name: Create Release
        uses: softprops/action-gh-release@v2
        with:
          files: release/*
-          body_path: release_notes.md
+          generate_release_notes: true
          draft: false
          prerelease: false
--- a/.gitignore
+++ b/.gitignore
@@ -1,5 +1,7 @@
 .claude/
 .claude-trace
+.DS_Store
+**/.DS_Store
 .venv
 .pytest_cache
 __pycache__
--- a/README.md
+++ b/README.md
@@ -132,6 +132,59 @@ Requirements → Architecture → Sprint Plan → Development → Review → QA

 ---

+## Version Requirements
+
+### Codex CLI
+**Minimum version:** Check compatibility with your installation
+
+The codeagent-wrapper uses these Codex CLI features:
+- `codex e` - Execute commands (shorthand for `codex exec`)
+- `--skip-git-repo-check` - Skip git repository validation
+- `--json` - JSON stream output format
+- `-C <workdir>` - Set working directory
+- `resume <session_id>` - Resume previous sessions
+
+**Verify Codex CLI is installed:**
+```bash
+which codex
+codex --version
+```
+
+### Claude CLI
+**Minimum version:** Check compatibility with your installation
+
+Required features:
+- `--output-format stream-json` - Streaming JSON output format
+- `--setting-sources` - Control setting sources (prevents infinite recursion)
+- `--dangerously-skip-permissions` - Skip permission prompts (use with caution)
+- `-p` - Prompt input flag
+- `-r <session_id>` - Resume sessions
+
+**Security Note:** The wrapper only adds `--dangerously-skip-permissions` for Claude when explicitly enabled (e.g. `--skip-permissions` / `CODEAGENT_SKIP_PERMISSIONS=true`). Keep it disabled unless you understand the risk.
+
+**Verify Claude CLI is installed:**
+```bash
+which claude
+claude --version
+```
+
+### Gemini CLI
+**Minimum version:** Check compatibility with your installation
+
+Required features:
+- `-o stream-json` - JSON stream output format
+- `-y` - Auto-approve prompts (non-interactive mode)
+- `-r <session_id>` - Resume sessions
+- `-p` - Prompt input flag
+
+**Verify Gemini CLI is installed:**
+```bash
+which gemini
+gemini --version
+```
+
+---
+
 ## Installation

 ### Modular Installation (Recommended)
@@ -163,15 +216,39 @@ python3 install.py --force

 ```
 ~/.claude/
-├── CLAUDE.md              # Core instructions and role definition
-├── commands/              # Slash commands (/dev, /code, etc.)
-├── agents/                # Agent definitions
+├── bin/
+│   └── codeagent-wrapper    # Main executable
+├── CLAUDE.md                # Core instructions and role definition
+├── commands/                # Slash commands (/dev, /code, etc.)
+├── agents/                  # Agent definitions
 ├── skills/
 │   └── codex/
-│       └── SKILL.md       # Codex integration skill
-└── installed_modules.json # Installation status
+│       └── SKILL.md         # Codex integration skill
+├── config.json              # Configuration
+└── installed_modules.json   # Installation status
 ```

+### Customizing Installation Directory
+
+By default, myclaude installs to `~/.claude`. You can customize this using the `INSTALL_DIR` environment variable:
+
+```bash
+# Install to custom directory
+INSTALL_DIR=/opt/myclaude bash install.sh
+
+# Update your PATH accordingly
+export PATH="/opt/myclaude/bin:$PATH"
+```
+
+**Directory Structure:**
+- `$INSTALL_DIR/bin/` - codeagent-wrapper binary
+- `$INSTALL_DIR/skills/` - Skill definitions
+- `$INSTALL_DIR/config.json` - Configuration file
+- `$INSTALL_DIR/commands/` - Slash command definitions
+- `$INSTALL_DIR/agents/` - Agent definitions
+
+**Note:** When using a custom installation directory, ensure that `$INSTALL_DIR/bin` is added to your `PATH` environment variable.
+
 ### Configuration

 Edit `config.json` to customize:
@@ -294,11 +371,14 @@ setx PATH "%USERPROFILE%\bin;%PATH%"

 **Codex wrapper not found:**
 ```bash
-# Check PATH
-echo $PATH | grep -q "$HOME/bin" || echo 'export PATH="$HOME/bin:$PATH"' >> ~/.zshrc
+# Installer auto-adds PATH, check if configured
+if [[ ":$PATH:" != *":$HOME/.claude/bin:"* ]]; then
+    echo "PATH not configured. Reinstalling..."
+    bash install.sh
+fi

-# Reinstall
-bash install.sh
+# Or manually add (idempotent command)
+[[ ":$PATH:" != *":$HOME/.claude/bin:"* ]] && echo 'export PATH="$HOME/.claude/bin:$PATH"' >> ~/.zshrc
 ```

 **Permission denied:**
@@ -315,6 +395,71 @@ cat ~/.claude/installed_modules.json
 python3 install.py --module dev --force
 ```

+### Version Compatibility Issues
+
+**Backend CLI not found:**
+```bash
+# Check if backend CLIs are installed
+which codex
+which claude
+which gemini
+
+# Install missing backends
+# Codex: Follow installation instructions at https://codex.docs
+# Claude: Follow installation instructions at https://claude.ai/docs
+# Gemini: Follow installation instructions at https://ai.google.dev/docs
+```
+
+**Unsupported CLI flags:**
+```bash
+# If you see errors like "unknown flag" or "invalid option"
+
+# Check backend CLI version
+codex --version
+claude --version
+gemini --version
+
+# For Codex: Ensure it supports `e`, `--skip-git-repo-check`, `--json`, `-C`, and `resume`
+# For Claude: Ensure it supports `--output-format stream-json`, `--setting-sources`, `-r`
+# For Gemini: Ensure it supports `-o stream-json`, `-y`, `-r`, `-p`
+
+# Update your backend CLI to the latest version if needed
+```
+
+**JSON parsing errors:**
+```bash
+# If you see "failed to parse JSON output" errors
+
+# Verify the backend outputs stream-json format
+codex e --json "test task"  # Should output newline-delimited JSON
+claude --output-format stream-json -p "test"  # Should output stream JSON
+
+# If not, your backend CLI version may be too old or incompatible
+```
+
+**Infinite recursion with Claude backend:**
+```bash
+# The wrapper prevents this with `--setting-sources ""` flag
+# If you still see recursion, ensure your Claude CLI supports this flag
+
+claude --help | grep "setting-sources"
+
+# If flag is not supported, upgrade Claude CLI
+```
+
+**Session resume failures:**
+```bash
+# Check if session ID is valid
+codex history  # List recent sessions
+claude history
+
+# Ensure backend CLI supports session resumption
+codex resume <session_id> "test"  # Should continue from previous session
+claude -r <session_id> "test"
+
+# If not supported, use new sessions instead of resume mode
+```
+
 ---

 ## Documentation
--- a/README_CN.md
+++ b/README_CN.md
@@ -152,15 +152,39 @@ python3 install.py --force

 ```
 ~/.claude/
-├── CLAUDE.md              # 核心指令和角色定义
-├── commands/              # 斜杠命令 (/dev, /code 等)
-├── agents/                # 智能体定义
+├── bin/
+│   └── codeagent-wrapper    # 主可执行文件
+├── CLAUDE.md                # 核心指令和角色定义
+├── commands/                # 斜杠命令 (/dev, /code 等)
+├── agents/                  # 智能体定义
 ├── skills/
 │   └── codex/
-│       └── SKILL.md       # Codex 集成技能
-└── installed_modules.json # 安装状态
+│       └── SKILL.md         # Codex 集成技能
+├── config.json              # 配置文件
+└── installed_modules.json   # 安装状态
 ```

+### 自定义安装目录
+
+默认情况下，myclaude 安装到 `~/.claude`。您可以使用 `INSTALL_DIR` 环境变量自定义安装目录：
+
+```bash
+# 安装到自定义目录
+INSTALL_DIR=/opt/myclaude bash install.sh
+
+# 相应更新您的 PATH
+export PATH="/opt/myclaude/bin:$PATH"
+```
+
+**目录结构：**
+- `$INSTALL_DIR/bin/` - codeagent-wrapper 可执行文件
+- `$INSTALL_DIR/skills/` - 技能定义
+- `$INSTALL_DIR/config.json` - 配置文件
+- `$INSTALL_DIR/commands/` - 斜杠命令定义
+- `$INSTALL_DIR/agents/` - 智能体定义
+
+**注意：** 使用自定义安装目录时，请确保将 `$INSTALL_DIR/bin` 添加到您的 `PATH` 环境变量中。
+
 ### 配置

 编辑 `config.json` 自定义：
@@ -283,11 +307,14 @@ setx PATH "%USERPROFILE%\bin;%PATH%"

 **Codex wrapper 未找到：**
 ```bash
-# 检查 PATH
-echo $PATH | grep -q "$HOME/bin" || echo 'export PATH="$HOME/bin:$PATH"' >> ~/.zshrc
+# 安装程序会自动添加 PATH，检查是否已添加
+if [[ ":$PATH:" != *":$HOME/.claude/bin:"* ]]; then
+    echo "PATH not configured. Reinstalling..."
+    bash install.sh
+fi

-# 重新安装
-bash install.sh
+# 或手动添加（幂等性命令）
+[[ ":$PATH:" != *":$HOME/.claude/bin:"* ]] && echo 'export PATH="$HOME/.claude/bin:$PATH"' >> ~/.zshrc
 ```

 **权限被拒绝：**
--- a/codeagent-wrapper/backend.go
+++ b/codeagent-wrapper/backend.go
@@ -1,5 +1,11 @@
 package main

+import (
+	"encoding/json"
+	"os"
+	"path/filepath"
+)
+
 // Backend defines the contract for invoking different AI CLI backends.
 // Each backend is responsible for supplying the executable command and
 // building the argument list based on the wrapper config.
@@ -26,15 +32,62 @@ func (ClaudeBackend) Command() string {
 	return "claude"
 }
 func (ClaudeBackend) BuildArgs(cfg *Config, targetArg string) []string {
+	return buildClaudeArgs(cfg, targetArg)
+}
+
+const maxClaudeSettingsBytes = 1 << 20 // 1MB
+
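+// loadMinimalEnvSettings extracts only the env configuration from ~/.claude/settings.json.
+// Only string values are accepted; a missing file, a parse failure, or an oversized file all return an empty result.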
+func loadMinimalEnvSettings() map[string]string {
+	home, err := os.UserHomeDir()
+	if err != nil || home == "" {
+		return nil
+	}
+
+	settingPath := filepath.Join(home, ".claude", "settings.json")
+	info, err := os.Stat(settingPath)
+	if err != nil || info.Size() > maxClaudeSettingsBytes {
+		return nil
+	}
+
+	data, err := os.ReadFile(settingPath)
+	if err != nil {
+		return nil
+	}
+
+	var cfg struct {
+		Env map[string]any `json:"env"`
+	}
+	if err := json.Unmarshal(data, &cfg); err != nil {
+		return nil
+	}
+	if len(cfg.Env) == 0 {
+		return nil
+	}
+
+	env := make(map[string]string, len(cfg.Env))
+	for k, v := range cfg.Env {
+		s, ok := v.(string)
+		if !ok {
+			continue
+		}
+		env[k] = s
+	}
+	if len(env) == 0 {
+		return nil
+	}
+	return env
+}
+
+func buildClaudeArgs(cfg *Config, targetArg string) []string {
 	if cfg == nil {
 		return nil
 	}
-	args := []string{"-p", "--dangerously-skip-permissions"}
-
-	// Only skip permissions when explicitly requested
-	// if cfg.SkipPermissions {
-	// 	args = append(args, "--dangerously-skip-permissions")
-	// }
+	args := []string{"-p"}
+	if cfg.SkipPermissions {
+		args = append(args, "--dangerously-skip-permissions")
+	}

 	// Prevent infinite recursion: disable all setting sources (user, project, local)
 	// This ensures a clean execution environment without CLAUDE.md or skills that would trigger codeagent
@@ -60,6 +113,10 @@ func (GeminiBackend) Command() string {
 	return "gemini"
 }
 func (GeminiBackend) BuildArgs(cfg *Config, targetArg string) []string {
+	return buildGeminiArgs(cfg, targetArg)
+}
+
+func buildGeminiArgs(cfg *Config, targetArg string) []string {
 	if cfg == nil {
 		return nil
 	}
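
The string-only filtering done by `loadMinimalEnvSettings` can be tried in isolation. The following is a minimal sketch, not the wrapper's code: `filterStringEnv` is a hypothetical helper that takes the JSON bytes directly instead of reading `~/.claude/settings.json`, and it leaves out the size limit check.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// filterStringEnv is a hypothetical helper (not part of the wrapper).
// It parses a settings JSON blob and keeps only the "env" entries whose
// values are JSON strings; numbers, booleans, and objects are dropped.
func filterStringEnv(data []byte) map[string]string {
	var cfg struct {
		Env map[string]any `json:"env"`
	}
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil // parse failure yields an empty result
	}

	env := make(map[string]string, len(cfg.Env))
	for k, v := range cfg.Env {
		if s, ok := v.(string); ok {
			env[k] = s
		}
	}
	if len(env) == 0 {
		return nil
	}
	return env
}

func main() {
	data := []byte(`{"env":{"GOOD":"ok","BAD":123,"ALSO_BAD":true}}`)
	env := filterStringEnv(data)
	fmt.Println(len(env), env["GOOD"]) // prints "1 ok"
}
```

Decoding into `map[string]any` first, rather than `map[string]string`, is what lets one bad entry be skipped without failing the whole file.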
--- a/codeagent-wrapper/backend_test.go
+++ b/codeagent-wrapper/backend_test.go
@@ -1,6 +1,9 @@
 package main

 import (
+	"bytes"
+	"os"
+	"path/filepath"
 	"reflect"
 	"testing"
 )
@@ -8,16 +11,16 @@ import (
 func TestClaudeBuildArgs_ModesAndPermissions(t *testing.T) {
 	backend := ClaudeBackend{}

-	t.Run("new mode uses workdir without skip by default", func(t *testing.T) {
+	t.Run("new mode omits skip-permissions by default", func(t *testing.T) {
 		cfg := &Config{Mode: "new", WorkDir: "/repo"}
 		got := backend.BuildArgs(cfg, "todo")
-		want := []string{"-p", "--dangerously-skip-permissions", "--setting-sources", "", "--output-format", "stream-json", "--verbose", "todo"}
+		want := []string{"-p", "--setting-sources", "", "--output-format", "stream-json", "--verbose", "todo"}
 		if !reflect.DeepEqual(got, want) {
 			t.Fatalf("got %v, want %v", got, want)
 		}
 	})

-	t.Run("new mode opt-in skip permissions with default workdir", func(t *testing.T) {
+	t.Run("new mode can opt-in skip-permissions", func(t *testing.T) {
 		cfg := &Config{Mode: "new", SkipPermissions: true}
 		got := backend.BuildArgs(cfg, "-")
 		want := []string{"-p", "--dangerously-skip-permissions", "--setting-sources", "", "--output-format", "stream-json", "--verbose", "-"}
@@ -26,10 +29,10 @@ func TestClaudeBuildArgs_ModesAndPermissions(t *testing.T) {
 		}
 	})

-	t.Run("resume mode uses session id and omits workdir", func(t *testing.T) {
+	t.Run("resume mode includes session id", func(t *testing.T) {
 		cfg := &Config{Mode: "resume", SessionID: "sid-123", WorkDir: "/ignored"}
 		got := backend.BuildArgs(cfg, "resume-task")
-		want := []string{"-p", "--dangerously-skip-permissions", "--setting-sources", "", "-r", "sid-123", "--output-format", "stream-json", "--verbose", "resume-task"}
+		want := []string{"-p", "--setting-sources", "", "-r", "sid-123", "--output-format", "stream-json", "--verbose", "resume-task"}
 		if !reflect.DeepEqual(got, want) {
 			t.Fatalf("got %v, want %v", got, want)
 		}
@@ -38,7 +41,16 @@ func TestClaudeBuildArgs_ModesAndPermissions(t *testing.T) {
 	t.Run("resume mode without session still returns base flags", func(t *testing.T) {
 		cfg := &Config{Mode: "resume", WorkDir: "/ignored"}
 		got := backend.BuildArgs(cfg, "follow-up")
-		want := []string{"-p", "--dangerously-skip-permissions", "--setting-sources", "", "--output-format", "stream-json", "--verbose", "follow-up"}
+		want := []string{"-p", "--setting-sources", "", "--output-format", "stream-json", "--verbose", "follow-up"}
+		if !reflect.DeepEqual(got, want) {
+			t.Fatalf("got %v, want %v", got, want)
+		}
+	})
+
+	t.Run("resume mode can opt-in skip permissions", func(t *testing.T) {
+		cfg := &Config{Mode: "resume", SessionID: "sid-123", SkipPermissions: true}
+		got := backend.BuildArgs(cfg, "resume-task")
+		want := []string{"-p", "--dangerously-skip-permissions", "--setting-sources", "", "-r", "sid-123", "--output-format", "stream-json", "--verbose", "resume-task"}
 		if !reflect.DeepEqual(got, want) {
 			t.Fatalf("got %v, want %v", got, want)
 		}
@@ -89,7 +101,11 @@ func TestClaudeBuildArgs_GeminiAndCodexModes(t *testing.T) {
 		}
 	})

-	t.Run("codex build args passthrough remains intact", func(t *testing.T) {
+	t.Run("codex build args omits bypass flag by default", func(t *testing.T) {
+		const key = "CODEX_BYPASS_SANDBOX"
+		t.Cleanup(func() { os.Unsetenv(key) })
+		os.Unsetenv(key)
+
 		backend := CodexBackend{}
 		cfg := &Config{Mode: "new", WorkDir: "/tmp"}
 		got := backend.BuildArgs(cfg, "task")
@@ -98,6 +114,20 @@ func TestClaudeBuildArgs_GeminiAndCodexModes(t *testing.T) {
 			t.Fatalf("got %v, want %v", got, want)
 		}
 	})
+
+	t.Run("codex build args includes bypass flag when enabled", func(t *testing.T) {
+		const key = "CODEX_BYPASS_SANDBOX"
+		t.Cleanup(func() { os.Unsetenv(key) })
+		os.Setenv(key, "true")
+
+		backend := CodexBackend{}
+		cfg := &Config{Mode: "new", WorkDir: "/tmp"}
+		got := backend.BuildArgs(cfg, "task")
+		want := []string{"e", "--dangerously-bypass-approvals-and-sandbox", "--skip-git-repo-check", "-C", "/tmp", "--json", "task"}
+		if !reflect.DeepEqual(got, want) {
+			t.Fatalf("got %v, want %v", got, want)
+		}
+	})
 }

 func TestClaudeBuildArgs_BackendMetadata(t *testing.T) {
@@ -120,3 +150,64 @@ func TestClaudeBuildArgs_BackendMetadata(t *testing.T) {
 		}
 	}
 }
+
+func TestLoadMinimalEnvSettings(t *testing.T) {
+	home := t.TempDir()
+	t.Setenv("HOME", home)
+	t.Setenv("USERPROFILE", home)
+
+	t.Run("missing file returns empty", func(t *testing.T) {
+		if got := loadMinimalEnvSettings(); len(got) != 0 {
+			t.Fatalf("got %v, want empty", got)
+		}
+	})
+
+	t.Run("valid env returns string map", func(t *testing.T) {
+		dir := filepath.Join(home, ".claude")
+		if err := os.MkdirAll(dir, 0o755); err != nil {
+			t.Fatalf("MkdirAll: %v", err)
+		}
+		path := filepath.Join(dir, "settings.json")
+		data := []byte(`{"env":{"ANTHROPIC_API_KEY":"secret","FOO":"bar"}}`)
+		if err := os.WriteFile(path, data, 0o600); err != nil {
+			t.Fatalf("WriteFile: %v", err)
+		}
+
+		got := loadMinimalEnvSettings()
+		if got["ANTHROPIC_API_KEY"] != "secret" || got["FOO"] != "bar" {
+			t.Fatalf("got %v, want keys present", got)
+		}
+	})
+
+	t.Run("non-string values are ignored", func(t *testing.T) {
+		dir := filepath.Join(home, ".claude")
+		path := filepath.Join(dir, "settings.json")
+		data := []byte(`{"env":{"GOOD":"ok","BAD":123,"ALSO_BAD":true}}`)
+		if err := os.WriteFile(path, data, 0o600); err != nil {
+			t.Fatalf("WriteFile: %v", err)
+		}
+
+		got := loadMinimalEnvSettings()
+		if got["GOOD"] != "ok" {
+			t.Fatalf("got %v, want GOOD=ok", got)
+		}
+		if _, ok := got["BAD"]; ok {
+			t.Fatalf("got %v, want BAD omitted", got)
+		}
+		if _, ok := got["ALSO_BAD"]; ok {
+			t.Fatalf("got %v, want ALSO_BAD omitted", got)
+		}
+	})
+
+	t.Run("oversized file returns empty", func(t *testing.T) {
+		dir := filepath.Join(home, ".claude")
+		path := filepath.Join(dir, "settings.json")
+		data := bytes.Repeat([]byte("a"), maxClaudeSettingsBytes+1)
+		if err := os.WriteFile(path, data, 0o600); err != nil {
+			t.Fatalf("WriteFile: %v", err)
+		}
+		if got := loadMinimalEnvSettings(); len(got) != 0 {
+			t.Fatalf("got %v, want empty", got)
+		}
+	})
+}
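
The argument orders asserted in `TestClaudeBuildArgs_ModesAndPermissions` can be reproduced with a short sketch. It is inferred only from the expected slices in those tests; `claudeOpts` and `sketchClaudeArgs` are hypothetical names, and the real `buildClaudeArgs` body is only partly visible in this diff.

```go
package main

import "fmt"

// claudeOpts is a hypothetical stand-in for the wrapper's Config type,
// reduced to the fields the tests exercise.
type claudeOpts struct {
	Mode            string
	SessionID       string
	SkipPermissions bool
}

// sketchClaudeArgs mirrors the flag order asserted by the tests:
// -p, optional skip-permissions, empty setting sources, optional resume,
// then the streaming output flags and the target argument.
func sketchClaudeArgs(o claudeOpts, target string) []string {
	args := []string{"-p"}
	if o.SkipPermissions {
		// Opt-in only: the flag is no longer added by default.
		args = append(args, "--dangerously-skip-permissions")
	}
	args = append(args, "--setting-sources", "")
	if o.Mode == "resume" && o.SessionID != "" {
		args = append(args, "-r", o.SessionID)
	}
	return append(args, "--output-format", "stream-json", "--verbose", target)
}

func main() {
	fmt.Printf("%q\n", sketchClaudeArgs(claudeOpts{Mode: "new"}, "todo"))
	fmt.Printf("%q\n", sketchClaudeArgs(
		claudeOpts{Mode: "resume", SessionID: "sid-123", SkipPermissions: true},
		"resume-task",
	))
}
```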
--- a/codeagent-wrapper/concurrent_stress_test.go
+++ b/codeagent-wrapper/concurrent_stress_test.go
@@ -13,6 +13,16 @@ import (
 	"time"
 )

+func stripTimestampPrefix(line string) string {
+	if !strings.HasPrefix(line, "[") {
+		return line
+	}
+	if idx := strings.Index(line, "] "); idx >= 0 {
+		return line[idx+2:]
+	}
+	return line
+}
+
 // TestConcurrentStressLogger is a high-concurrency stress test.
 func TestConcurrentStressLogger(t *testing.T) {
 	if testing.Short() {
@@ -79,7 +89,8 @@ func TestConcurrentStressLogger(t *testing.T) {
 	// Verify the log format (plain text, no prefix)
 	formatRE := regexp.MustCompile(`^goroutine-\d+-msg-\d+$`)
 	for i, line := range lines[:min(10, len(lines))] {
-		if !formatRE.MatchString(line) {
+		msg := stripTimestampPrefix(line)
+		if !formatRE.MatchString(msg) {
 			t.Errorf("line %d has invalid format: %s", i, line)
 		}
 	}
@@ -291,7 +302,7 @@ func TestLoggerOrderPreservation(t *testing.T) {
 	sequences := make(map[int][]int) // goroutine ID -> sequence numbers

 	for scanner.Scan() {
-		line := scanner.Text()
+		line := stripTimestampPrefix(scanner.Text())
 		var gid, seq int
 		// Parse format: G0-SEQ0001 (without INFO: prefix)
 		_, err := fmt.Sscanf(line, "G%d-SEQ%04d", &gid, &seq)
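
To see what `stripTimestampPrefix` does to individual log lines, here is a small standalone sketch that copies the helper from the diff above and runs it on sample input. The bracketed timestamp layout in the samples is an assumption; the logger's actual timestamp format is not shown in this diff.

```go
package main

import (
	"fmt"
	"strings"
)

// stripTimestampPrefix is copied from the diff: it drops a leading
// "[...] " segment when present and otherwise returns the line unchanged.
func stripTimestampPrefix(line string) string {
	if !strings.HasPrefix(line, "[") {
		return line
	}
	if idx := strings.Index(line, "] "); idx >= 0 {
		return line[idx+2:]
	}
	return line
}

func main() {
	// The timestamp layout below is illustrative only.
	fmt.Println(stripTimestampPrefix("[2025-12-25 11:40:53] goroutine-3-msg-7")) // prints "goroutine-3-msg-7"
	fmt.Println(stripTimestampPrefix("G0-SEQ0001"))                              // prints "G0-SEQ0001"
	fmt.Println(stripTimestampPrefix("[unterminated"))                           // prints "[unterminated"
}
```

Because lines without a leading `[` pass through untouched, the tests accept both the old plain-text output and the new timestamped output.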
--- a/codeagent-wrapper/config.go
+++ b/codeagent-wrapper/config.go
@@ -49,7 +49,15 @@ type TaskResult struct {
 	SessionID string `json:"session_id"`
 	Error     string `json:"error"`
 	LogPath   string `json:"log_path"`
-	sharedLog bool
+	// Structured report fields
+	Coverage       string   `json:"coverage,omitempty"`        // extracted coverage percentage (e.g., "92%")
+	CoverageNum    float64  `json:"coverage_num,omitempty"`    // numeric coverage for comparison
+	CoverageTarget float64  `json:"coverage_target,omitempty"` // target coverage (default 90)
+	FilesChanged   []string `json:"files_changed,omitempty"`   // list of changed files
+	KeyOutput      string   `json:"key_output,omitempty"`      // brief summary of what was done
+	TestsPassed    int      `json:"tests_passed,omitempty"`    // number of tests passed
+	TestsFailed    int      `json:"tests_failed,omitempty"`    // number of tests failed
+	sharedLog      bool
 }

 var backendRegistry = map[string]Backend{
@@ -164,6 +172,9 @@ func parseParallelConfig(data []byte) (*ParallelConfig, error) {
 		if content == "" {
 			return nil, fmt.Errorf("task block #%d (%q) missing content", taskIndex, task.ID)
 		}
+		if task.Mode == "resume" && strings.TrimSpace(task.SessionID) == "" {
+			return nil, fmt.Errorf("task block #%d (%q) has empty session_id", taskIndex, task.ID)
+		}
 		if _, exists := seen[task.ID]; exists {
 			return nil, fmt.Errorf("task block #%d has duplicate id: %s", taskIndex, task.ID)
 		}
@@ -232,7 +243,10 @@ func parseArgs() (*Config, error) {
 			return nil, fmt.Errorf("resume mode requires: resume <session_id> <task>")
 		}
 		cfg.Mode = "resume"
-		cfg.SessionID = args[1]
+		cfg.SessionID = strings.TrimSpace(args[1])
+		if cfg.SessionID == "" {
+			return nil, fmt.Errorf("resume mode requires non-empty session_id")
+		}
 		cfg.Task = args[2]
 		cfg.ExplicitStdin = (args[2] == "-")
 		if len(args) > 3 {
--- a/codeagent-wrapper/executor.go
+++ b/codeagent-wrapper/executor.go
@@ -16,6 +16,8 @@ import (
 	"time"
 )

+const postMessageTerminateDelay = 1 * time.Second
+
 // commandRunner abstracts exec.Cmd for testability
 type commandRunner interface {
 	Start() error
@@ -24,6 +26,7 @@ type commandRunner interface {
 	StdinPipe() (io.WriteCloser, error)
 	SetStderr(io.Writer)
 	SetDir(string)
+	SetEnv(env map[string]string)
 	Process() processHandle
 }

@@ -79,6 +82,52 @@ func (r *realCmd) SetDir(dir string) {
 	}
 }

+func (r *realCmd) SetEnv(env map[string]string) {
+	if r == nil || r.cmd == nil || len(env) == 0 {
+		return
+	}
+
+	merged := make(map[string]string, len(env)+len(os.Environ()))
+	for _, kv := range os.Environ() {
+		if kv == "" {
+			continue
+		}
+		idx := strings.IndexByte(kv, '=')
+		if idx <= 0 {
+			continue
+		}
+		merged[kv[:idx]] = kv[idx+1:]
+	}
+	for _, kv := range r.cmd.Env {
+		if kv == "" {
+			continue
+		}
+		idx := strings.IndexByte(kv, '=')
+		if idx <= 0 {
+			continue
+		}
+		merged[kv[:idx]] = kv[idx+1:]
+	}
+	for k, v := range env {
+		if strings.TrimSpace(k) == "" {
+			continue
+		}
+		merged[k] = v
+	}
+
+	keys := make([]string, 0, len(merged))
+	for k := range merged {
+		keys = append(keys, k)
+	}
+	sort.Strings(keys)
+
+	out := make([]string, 0, len(keys))
+	for _, k := range keys {
+		out = append(out, k+"="+merged[k])
+	}
+	r.cmd.Env = out
+}
+
 func (r *realCmd) Process() processHandle {
 	if r == nil || r.cmd == nil || r.cmd.Process == nil {
 		return nil
@@ -462,68 +511,255 @@ func shouldSkipTask(task TaskSpec, failed map[string]TaskResult) (bool, string)
 	return true, fmt.Sprintf("skipped due to failed dependencies: %s", strings.Join(blocked, ","))
 }

-func generateFinalOutput(results []TaskResult) string {
-	var sb strings.Builder
+// getStatusSymbols returns status symbols based on ASCII mode.
+func getStatusSymbols() (success, warning, failed string) {
+	if os.Getenv("CODEAGENT_ASCII_MODE") == "true" {
+		return "PASS", "WARN", "FAIL"
+	}
+	return "✓", "⚠️", "✗"
+}

+func generateFinalOutput(results []TaskResult) string {
+	return generateFinalOutputWithMode(results, true) // default to summary mode
+}
+
+// generateFinalOutputWithMode generates output based on mode
+// summaryOnly=true: structured report - every token has value
+// summaryOnly=false: full output with complete messages (legacy behavior)
+func generateFinalOutputWithMode(results []TaskResult, summaryOnly bool) string {
+	var sb strings.Builder
+	successSymbol, warningSymbol, failedSymbol := getStatusSymbols()
+
+	reportCoverageTarget := defaultCoverageTarget
+	for _, res := range results {
+		if res.CoverageTarget > 0 {
+			reportCoverageTarget = res.CoverageTarget
+			break
+		}
+	}
+
+	// Count results by status
 	success := 0
 	failed := 0
+	belowTarget := 0
 	for _, res := range results {
 		if res.ExitCode == 0 && res.Error == "" {
 			success++
+			target := res.CoverageTarget
+			if target <= 0 {
+				target = reportCoverageTarget
+			}
+			if res.Coverage != "" && target > 0 && res.CoverageNum < target {
+				belowTarget++
+			}
 		} else {
 			failed++
 		}
 	}

-	sb.WriteString(fmt.Sprintf("=== Parallel Execution Summary ===\n"))
-	sb.WriteString(fmt.Sprintf("Total: %d | Success: %d | Failed: %d\n\n", len(results), success, failed))
+	if summaryOnly {
+		// Header
+		sb.WriteString("=== Execution Report ===\n")
+		sb.WriteString(fmt.Sprintf("%d tasks | %d passed | %d failed", len(results), success, failed))
+		if belowTarget > 0 {
+			sb.WriteString(fmt.Sprintf(" | %d below %.0f%%", belowTarget, reportCoverageTarget))
+		}
+		sb.WriteString("\n\n")
+
+		// Task Results - each task gets: Did + Files + Tests + Coverage
+		sb.WriteString("## Task Results\n")
+
+		for _, res := range results {
+			taskID := sanitizeOutput(res.TaskID)
+			coverage := sanitizeOutput(res.Coverage)
+			keyOutput := sanitizeOutput(res.KeyOutput)
+			logPath := sanitizeOutput(res.LogPath)
+			filesChanged := sanitizeOutput(strings.Join(res.FilesChanged, ", "))
+
+			target := res.CoverageTarget
+			if target <= 0 {
+				target = reportCoverageTarget
+			}
+
+			isSuccess := res.ExitCode == 0 && res.Error == ""
+			isBelowTarget := isSuccess && coverage != "" && target > 0 && res.CoverageNum < target
+
+			if isSuccess && !isBelowTarget {
+				// Passed task: one block with Did/Files/Tests
+				sb.WriteString(fmt.Sprintf("\n### %s %s", taskID, successSymbol))
+				if coverage != "" {
+					sb.WriteString(fmt.Sprintf(" %s", coverage))
+				}
+				sb.WriteString("\n")
+
+				if keyOutput != "" {
+					sb.WriteString(fmt.Sprintf("Did: %s\n", keyOutput))
+				}
+				if len(res.FilesChanged) > 0 {
+					sb.WriteString(fmt.Sprintf("Files: %s\n", filesChanged))
+				}
+				if res.TestsPassed > 0 {
+					sb.WriteString(fmt.Sprintf("Tests: %d passed\n", res.TestsPassed))
+				}
+				if logPath != "" {
+					sb.WriteString(fmt.Sprintf("Log: %s\n", logPath))
+				}
+
+			} else if isSuccess && isBelowTarget {
+				// Below target: add Gap info
+				sb.WriteString(fmt.Sprintf("\n### %s %s %s (below %.0f%%)\n", taskID, warningSymbol, coverage, target))
+
+				if keyOutput != "" {
+					sb.WriteString(fmt.Sprintf("Did: %s\n", keyOutput))
+				}
+				if len(res.FilesChanged) > 0 {
+					sb.WriteString(fmt.Sprintf("Files: %s\n", filesChanged))
+				}
+				if res.TestsPassed > 0 {
+					sb.WriteString(fmt.Sprintf("Tests: %d passed\n", res.TestsPassed))
+				}
+				// Extract what's missing from coverage
+				gap := sanitizeOutput(extractCoverageGap(res.Message))
+				if gap != "" {
+					sb.WriteString(fmt.Sprintf("Gap: %s\n", gap))
+				}
+				if logPath != "" {
+					sb.WriteString(fmt.Sprintf("Log: %s\n", logPath))
+				}

-	for _, res := range results {
-		sb.WriteString(fmt.Sprintf("--- Task: %s ---\n", res.TaskID))
-		if res.Error != "" {
-			sb.WriteString(fmt.Sprintf("Status: FAILED (exit code %d)\nError: %s\n", res.ExitCode, res.Error))
-		} else if res.ExitCode != 0 {
-			sb.WriteString(fmt.Sprintf("Status: FAILED (exit code %d)\n", res.ExitCode))
-		} else {
-			sb.WriteString("Status: SUCCESS\n")
-		}
-		if res.SessionID != "" {
-			sb.WriteString(fmt.Sprintf("Session: %s\n", res.SessionID))
-		}
-		if res.LogPath != "" {
-			if res.sharedLog {
-				sb.WriteString(fmt.Sprintf("Log: %s (shared)\n", res.LogPath))
 			} else {
-				sb.WriteString(fmt.Sprintf("Log: %s\n", res.LogPath))
+				// Failed task: show error detail
+				sb.WriteString(fmt.Sprintf("\n### %s %s FAILED\n", taskID, failedSymbol))
+				sb.WriteString(fmt.Sprintf("Exit code: %d\n", res.ExitCode))
+				if errText := sanitizeOutput(res.Error); errText != "" {
+					sb.WriteString(fmt.Sprintf("Error: %s\n", errText))
+				}
+				// Show context from output (last meaningful lines)
+				detail := sanitizeOutput(extractErrorDetail(res.Message, 300))
+				if detail != "" {
+					sb.WriteString(fmt.Sprintf("Detail: %s\n", detail))
+				}
+				if logPath != "" {
+					sb.WriteString(fmt.Sprintf("Log: %s\n", logPath))
+				}
 			}
 		}
-		if res.Message != "" {
-			sb.WriteString(fmt.Sprintf("\n%s\n", res.Message))
+
+		// Summary section
+		sb.WriteString("\n## Summary\n")
+		sb.WriteString(fmt.Sprintf("- %d/%d completed successfully\n", success, len(results)))
+
+		if belowTarget > 0 || failed > 0 {
+			var needFix []string
+			var needCoverage []string
+			for _, res := range results {
+				if res.ExitCode != 0 || res.Error != "" {
+					taskID := sanitizeOutput(res.TaskID)
+					reason := sanitizeOutput(res.Error)
+					if reason == "" && res.ExitCode != 0 {
+						reason = fmt.Sprintf("exit code %d", res.ExitCode)
+					}
+					reason = safeTruncate(reason, 50)
+					needFix = append(needFix, fmt.Sprintf("%s (%s)", taskID, reason))
+					continue
+				}
+
+				target := res.CoverageTarget
+				if target <= 0 {
+					target = reportCoverageTarget
+				}
+				if res.Coverage != "" && target > 0 && res.CoverageNum < target {
+					needCoverage = append(needCoverage, sanitizeOutput(res.TaskID))
+				}
+			}
+			if len(needFix) > 0 {
+				sb.WriteString(fmt.Sprintf("- Fix: %s\n", strings.Join(needFix, ", ")))
+			}
+			if len(needCoverage) > 0 {
+				sb.WriteString(fmt.Sprintf("- Coverage: %s\n", strings.Join(needCoverage, ", ")))
+			}
+		}
+
+	} else {
+		// Legacy full output mode
+		sb.WriteString("=== Parallel Execution Summary ===\n")
+		sb.WriteString(fmt.Sprintf("Total: %d | Success: %d | Failed: %d\n\n", len(results), success, failed))
+
+		for _, res := range results {
+			taskID := sanitizeOutput(res.TaskID)
+			sb.WriteString(fmt.Sprintf("--- Task: %s ---\n", taskID))
+			if res.Error != "" {
+				sb.WriteString(fmt.Sprintf("Status: FAILED (exit code %d)\nError: %s\n", res.ExitCode, sanitizeOutput(res.Error)))
+			} else if res.ExitCode != 0 {
+				sb.WriteString(fmt.Sprintf("Status: FAILED (exit code %d)\n", res.ExitCode))
+			} else {
+				sb.WriteString("Status: SUCCESS\n")
+			}
+			if res.Coverage != "" {
+				sb.WriteString(fmt.Sprintf("Coverage: %s\n", sanitizeOutput(res.Coverage)))
+			}
+			if res.SessionID != "" {
+				sb.WriteString(fmt.Sprintf("Session: %s\n", sanitizeOutput(res.SessionID)))
+			}
+			if res.LogPath != "" {
+				logPath := sanitizeOutput(res.LogPath)
+				if res.sharedLog {
+					sb.WriteString(fmt.Sprintf("Log: %s (shared)\n", logPath))
+				} else {
+					sb.WriteString(fmt.Sprintf("Log: %s\n", logPath))
+				}
+			}
+			if res.Message != "" {
+				message := sanitizeOutput(res.Message)
+				if message != "" {
+					sb.WriteString(fmt.Sprintf("\n%s\n", message))
+				}
+			}
+			sb.WriteString("\n")
 		}
-		sb.WriteString("\n")
 	}

 	return sb.String()
 }

 func buildCodexArgs(cfg *Config, targetArg string) []string {
-	if cfg.Mode == "resume" {
-		return []string{
-			"e",
-			"--skip-git-repo-check",
-			"--json",
-			"resume",
-			cfg.SessionID,
-			targetArg,
+	if cfg == nil {
+		panic("buildCodexArgs: nil config")
+	}
+
+	var resumeSessionID string
+	isResume := cfg.Mode == "resume"
+	if isResume {
+		resumeSessionID = strings.TrimSpace(cfg.SessionID)
+		if resumeSessionID == "" {
+			logError("invalid config: resume mode requires non-empty session_id")
+			isResume = false
 		}
 	}
-	return []string{
-		"e",
-		"--skip-git-repo-check",
+
+	args := []string{"e"}
+
+	if envFlagEnabled("CODEX_BYPASS_SANDBOX") {
+		logWarn("CODEX_BYPASS_SANDBOX=true: running without approval/sandbox protection")
+		args = append(args, "--dangerously-bypass-approvals-and-sandbox")
+	}
+
+	args = append(args, "--skip-git-repo-check")
+
+	if isResume {
+		return append(args,
+			"--json",
+			"resume",
+			resumeSessionID,
+			targetArg,
+		)
+	}
+
+	return append(args,
 		"-C", cfg.WorkDir,
 		"--json",
 		targetArg,
-	}
+	)
 }

 func runCodexTask(taskSpec TaskSpec, silent bool, timeoutSec int) TaskResult {
@@ -574,6 +810,12 @@ func runCodexTaskWithContext(parentCtx context.Context, taskSpec TaskSpec, backe
 		cfg.WorkDir = defaultWorkdir
 	}

+	if cfg.Mode == "resume" && strings.TrimSpace(cfg.SessionID) == "" {
+		result.ExitCode = 1
+		result.Error = "resume mode requires non-empty session_id"
+		return result
+	}
+
 	useStdin := taskSpec.UseStdin
 	targetArg := taskSpec.Task
 	if useStdin {
@@ -673,6 +915,12 @@ func runCodexTaskWithContext(parentCtx context.Context, taskSpec TaskSpec, backe

 	cmd := newCommandRunner(ctx, commandName, codexArgs...)

+	if cfg.Backend == "claude" {
+		if env := loadMinimalEnvSettings(); len(env) > 0 {
+			cmd.SetEnv(env)
+		}
+	}
+
 	// For backends that don't support -C flag (claude, gemini), set working directory via cmd.Dir
 	// Codex passes workdir via -C flag, so we skip setting Dir for it to avoid conflicts
 	if cfg.Mode != "resume" && commandName != "codex" && cfg.WorkDir != "" {
@@ -729,6 +977,7 @@ func runCodexTaskWithContext(parentCtx context.Context, taskSpec TaskSpec, backe
 	// Start parse goroutine BEFORE starting the command to avoid race condition
 	// where fast-completing commands close stdout before parser starts reading
 	messageSeen := make(chan struct{}, 1)
+	completeSeen := make(chan struct{}, 1)
 	parseCh := make(chan parseResult, 1)
 	go func() {
 		msg, tid := parseJSONStreamInternal(stdoutReader, logWarnFn, logInfoFn, func() {
@@ -736,7 +985,16 @@ func runCodexTaskWithContext(parentCtx context.Context, taskSpec TaskSpec, backe
 			case messageSeen <- struct{}{}:
 			default:
 			}
+		}, func() {
+			select {
+			case completeSeen <- struct{}{}:
+			default:
+			}
 		})
+		select {
+		case completeSeen <- struct{}{}:
+		default:
+		}
 		parseCh <- parseResult{message: msg, threadID: tid}
 	}()

@@ -773,17 +1031,63 @@ func runCodexTaskWithContext(parentCtx context.Context, taskSpec TaskSpec, backe
 	waitCh := make(chan error, 1)
 	go func() { waitCh <- cmd.Wait() }()

-	var waitErr error
-	var forceKillTimer *forceKillTimer
-	var ctxCancelled bool
+	var (
+		waitErr              error
+		forceKillTimer       *forceKillTimer
+		ctxCancelled         bool
+		messageTimer         *time.Timer
+		messageTimerCh       <-chan time.Time
+		forcedAfterComplete  bool
+		terminated           bool
+		messageSeenObserved  bool
+		completeSeenObserved bool
+	)

-	select {
-	case waitErr = <-waitCh:
-	case <-ctx.Done():
-		ctxCancelled = true
-		logErrorFn(cancelReason(commandName, ctx))
-		forceKillTimer = terminateCommandFn(cmd)
-		waitErr = <-waitCh
+waitLoop:
+	for {
+		select {
+		case waitErr = <-waitCh:
+			break waitLoop
+		case <-ctx.Done():
+			ctxCancelled = true
+			logErrorFn(cancelReason(commandName, ctx))
+			if !terminated {
+				if timer := terminateCommandFn(cmd); timer != nil {
+					forceKillTimer = timer
+					terminated = true
+				}
+			}
+			waitErr = <-waitCh
+			break waitLoop
+		case <-messageTimerCh:
+			forcedAfterComplete = true
+			messageTimerCh = nil
+			if !terminated {
+				logWarnFn(fmt.Sprintf("%s output parsed; terminating lingering backend", commandName))
+				if timer := terminateCommandFn(cmd); timer != nil {
+					forceKillTimer = timer
+					terminated = true
+				}
+			}
+		case <-completeSeen:
+			completeSeenObserved = true
+			if messageTimer != nil {
+				continue
+			}
+			messageTimer = time.NewTimer(postMessageTerminateDelay)
+			messageTimerCh = messageTimer.C
+		case <-messageSeen:
+			messageSeenObserved = true
+		}
+	}
+
+	if messageTimer != nil {
+		if !messageTimer.Stop() {
+			select {
+			case <-messageTimer.C:
+			default:
+			}
+		}
 	}

 	if forceKillTimer != nil {
@@ -791,10 +1095,14 @@ func runCodexTaskWithContext(parentCtx context.Context, taskSpec TaskSpec, backe
 	}

 	var parsed parseResult
-	if ctxCancelled {
+	switch {
+	case ctxCancelled:
 		closeWithReason(stdout, stdoutCloseReasonCtx)
 		parsed = <-parseCh
-	} else {
+	case messageSeenObserved || completeSeenObserved:
+		closeWithReason(stdout, stdoutCloseReasonWait)
+		parsed = <-parseCh
+	default:
 		drainTimer := time.NewTimer(stdoutDrainTimeout)
 		defer drainTimer.Stop()

@@ -802,6 +1110,11 @@ func runCodexTaskWithContext(parentCtx context.Context, taskSpec TaskSpec, backe
 		case parsed = <-parseCh:
 			closeWithReason(stdout, stdoutCloseReasonWait)
 		case <-messageSeen:
+			messageSeenObserved = true
+			closeWithReason(stdout, stdoutCloseReasonWait)
+			parsed = <-parseCh
+		case <-completeSeen:
+			completeSeenObserved = true
 			closeWithReason(stdout, stdoutCloseReasonWait)
 			parsed = <-parseCh
 		case <-drainTimer.C:
@@ -822,17 +1135,21 @@ func runCodexTaskWithContext(parentCtx context.Context, taskSpec TaskSpec, backe
 	}

 	if waitErr != nil {
-		if exitErr, ok := waitErr.(*exec.ExitError); ok {
-			code := exitErr.ExitCode()
-			logErrorFn(fmt.Sprintf("%s exited with status %d", commandName, code))
-			result.ExitCode = code
-			result.Error = attachStderr(fmt.Sprintf("%s exited with status %d", commandName, code))
+		if forcedAfterComplete && parsed.message != "" {
+			logWarnFn(fmt.Sprintf("%s terminated after delivering output", commandName))
+		} else {
+			if exitErr, ok := waitErr.(*exec.ExitError); ok {
+				code := exitErr.ExitCode()
+				logErrorFn(fmt.Sprintf("%s exited with status %d", commandName, code))
+				result.ExitCode = code
+				result.Error = attachStderr(fmt.Sprintf("%s exited with status %d", commandName, code))
+				return result
+			}
+			logErrorFn(commandName + " error: " + waitErr.Error())
+			result.ExitCode = 1
+			result.Error = attachStderr(commandName + " error: " + waitErr.Error())
 			return result
 		}
-		logErrorFn(commandName + " error: " + waitErr.Error())
-		result.ExitCode = 1
-		result.Error = attachStderr(commandName + " error: " + waitErr.Error())
-		return result
 	}

 	message := parsed.message
--- a/codeagent-wrapper/executor_concurrent_test.go
+++ b/codeagent-wrapper/executor_concurrent_test.go
@@ -10,6 +10,7 @@ import (
 	"os"
 	"os/exec"
 	"path/filepath"
+	"slices"
 	"strings"
 	"sync"
 	"sync/atomic"
@@ -86,6 +87,7 @@ type execFakeRunner struct {
 	process         processHandle
 	stdin           io.WriteCloser
 	dir             string
+	env             map[string]string
 	waitErr         error
 	waitDelay       time.Duration
 	startErr        error
@@ -128,6 +130,17 @@ func (f *execFakeRunner) StdinPipe() (io.WriteCloser, error) {
 }
 func (f *execFakeRunner) SetStderr(io.Writer) {}
 func (f *execFakeRunner) SetDir(dir string)   { f.dir = dir }
+func (f *execFakeRunner) SetEnv(env map[string]string) {
+	if len(env) == 0 {
+		return
+	}
+	if f.env == nil {
+		f.env = make(map[string]string, len(env))
+	}
+	for k, v := range env {
+		f.env[k] = v
+	}
+}
 func (f *execFakeRunner) Process() processHandle {
 	if f.process != nil {
 		return f.process
@@ -244,6 +257,10 @@ func TestExecutorHelperCoverage(t *testing.T) {
 	})

 	t.Run("generateFinalOutputAndArgs", func(t *testing.T) {
+		const key = "CODEX_BYPASS_SANDBOX"
+		t.Cleanup(func() { os.Unsetenv(key) })
+		os.Unsetenv(key)
+
 		out := generateFinalOutput([]TaskResult{
 			{TaskID: "ok", ExitCode: 0},
 			{TaskID: "fail", ExitCode: 1, Error: "boom"},
@@ -251,21 +268,66 @@ func TestExecutorHelperCoverage(t *testing.T) {
 		if !strings.Contains(out, "ok") || !strings.Contains(out, "fail") {
 			t.Fatalf("unexpected summary output: %s", out)
 		}
+		// Test summary mode (default) - should have new format with ### headers
 		out = generateFinalOutput([]TaskResult{{TaskID: "rich", ExitCode: 0, SessionID: "sess", LogPath: "/tmp/log", Message: "hello"}})
+		if !strings.Contains(out, "### rich") {
+			t.Fatalf("summary output missing task header: %s", out)
+		}
+		// Test full output mode - should have Session and Message
+		out = generateFinalOutputWithMode([]TaskResult{{TaskID: "rich", ExitCode: 0, SessionID: "sess", LogPath: "/tmp/log", Message: "hello"}}, false)
 		if !strings.Contains(out, "Session: sess") || !strings.Contains(out, "Log: /tmp/log") || !strings.Contains(out, "hello") {
-			t.Fatalf("rich output missing fields: %s", out)
+			t.Fatalf("full output missing fields: %s", out)
 		}

 		args := buildCodexArgs(&Config{Mode: "new", WorkDir: "/tmp"}, "task")
-		if len(args) == 0 || args[3] != "/tmp" {
+		if !slices.Equal(args, []string{"e", "--skip-git-repo-check", "-C", "/tmp", "--json", "task"}) {
 			t.Fatalf("unexpected codex args: %+v", args)
 		}
 		args = buildCodexArgs(&Config{Mode: "resume", SessionID: "sess"}, "target")
-		if args[3] != "resume" || args[4] != "sess" {
+		if !slices.Equal(args, []string{"e", "--skip-git-repo-check", "--json", "resume", "sess", "target"}) {
 			t.Fatalf("unexpected resume args: %+v", args)
 		}
 	})

+	t.Run("generateFinalOutputASCIIMode", func(t *testing.T) {
+		t.Setenv("CODEAGENT_ASCII_MODE", "true")
+
+		results := []TaskResult{
+			{TaskID: "ok", ExitCode: 0, Coverage: "92%", CoverageNum: 92, CoverageTarget: 90, KeyOutput: "done"},
+			{TaskID: "warn", ExitCode: 0, Coverage: "80%", CoverageNum: 80, CoverageTarget: 90, KeyOutput: "did"},
+			{TaskID: "bad", ExitCode: 2, Error: "boom"},
+		}
+		out := generateFinalOutput(results)
+
+		for _, sym := range []string{"PASS", "WARN", "FAIL"} {
+			if !strings.Contains(out, sym) {
+				t.Fatalf("ASCII mode should include %q, got: %s", sym, out)
+			}
+		}
+		for _, sym := range []string{"✓", "⚠️", "✗"} {
+			if strings.Contains(out, sym) {
+				t.Fatalf("ASCII mode should not include %q, got: %s", sym, out)
+			}
+		}
+	})
+
+	t.Run("generateFinalOutputUnicodeMode", func(t *testing.T) {
+		t.Setenv("CODEAGENT_ASCII_MODE", "false")
+
+		results := []TaskResult{
+			{TaskID: "ok", ExitCode: 0, Coverage: "92%", CoverageNum: 92, CoverageTarget: 90, KeyOutput: "done"},
+			{TaskID: "warn", ExitCode: 0, Coverage: "80%", CoverageNum: 80, CoverageTarget: 90, KeyOutput: "did"},
+			{TaskID: "bad", ExitCode: 2, Error: "boom"},
+		}
+		out := generateFinalOutput(results)
+
+		for _, sym := range []string{"✓", "⚠️", "✗"} {
+			if !strings.Contains(out, sym) {
+				t.Fatalf("Unicode mode should include %q, got: %s", sym, out)
+			}
+		}
+	})
+
 	t.Run("executeConcurrentWrapper", func(t *testing.T) {
 		orig := runCodexTaskFn
 		defer func() { runCodexTaskFn = orig }()
@@ -298,6 +360,18 @@ func TestExecutorRunCodexTaskWithContext(t *testing.T) {
 	origRunner := newCommandRunner
 	defer func() { newCommandRunner = origRunner }()

+	t.Run("resumeMissingSessionID", func(t *testing.T) {
+		newCommandRunner = func(ctx context.Context, name string, args ...string) commandRunner {
+			t.Fatalf("unexpected command execution for invalid resume config")
+			return nil
+		}
+
+		res := runCodexTaskWithContext(context.Background(), TaskSpec{Task: "payload", WorkDir: ".", Mode: "resume"}, nil, nil, false, false, 1)
+		if res.ExitCode == 0 || !strings.Contains(res.Error, "session_id") {
+			t.Fatalf("expected validation error, got %+v", res)
+		}
+	})
+
 	t.Run("success", func(t *testing.T) {
 		var firstStdout *reasonReadCloser
 		newCommandRunner = func(ctx context.Context, name string, args ...string) commandRunner {
@@ -1082,9 +1156,10 @@ func TestExecutorExecuteConcurrentWithContextBranches(t *testing.T) {
 			}
 		}

-		summary := generateFinalOutput(results)
+		// Test full output mode for shared marker (summary mode doesn't show it)
+		summary := generateFinalOutputWithMode(results, false)
 		if !strings.Contains(summary, "(shared)") {
-			t.Fatalf("summary missing shared marker: %s", summary)
+			t.Fatalf("full output missing shared marker: %s", summary)
 		}

 		mainLogger.Flush()
--- a/codeagent-wrapper/logger.go
+++ b/codeagent-wrapper/logger.go
@@ -366,7 +366,8 @@ func (l *Logger) run() {
 	defer ticker.Stop()

 	writeEntry := func(entry logEntry) {
-		fmt.Fprintf(l.writer, "%s\n", entry.msg)
+		timestamp := time.Now().Format("2006-01-02 15:04:05.000")
+		fmt.Fprintf(l.writer, "[%s] %s\n", timestamp, entry.msg)

 		// Cache error/warn entries in memory for fast extraction
 		if entry.isError {
--- a/codeagent-wrapper/main.go
+++ b/codeagent-wrapper/main.go
@@ -14,14 +14,15 @@ import (
 )

 const (
-	version             = "5.2.5"
-	defaultWorkdir      = "."
-	defaultTimeout      = 7200 // seconds
-	codexLogLineLimit   = 1000
-	stdinSpecialChars   = "\n\\\"'`$"
-	stderrCaptureLimit  = 4 * 1024
-	defaultBackendName  = "codex"
-	defaultCodexCommand = "codex"
+	version               = "5.4.0"
+	defaultWorkdir        = "."
+	defaultTimeout        = 7200 // seconds (2 hours)
+	defaultCoverageTarget = 90.0
+	codexLogLineLimit     = 1000
+	stdinSpecialChars     = "\n\\\"'`$"
+	stderrCaptureLimit    = 4 * 1024
+	defaultBackendName    = "codex"
+	defaultCodexCommand   = "codex"

 	// stdout close reasons
 	stdoutCloseReasonWait  = "wait-done"
@@ -30,6 +31,8 @@ const (
 	stdoutDrainTimeout     = 100 * time.Millisecond
 )

+var useASCIIMode = os.Getenv("CODEAGENT_ASCII_MODE") == "true"
+
 // Test hooks for dependency injection
 var (
 	stdinReader  io.Reader = os.Stdin
@@ -175,6 +178,7 @@ func run() (exitCode int) {

 		if parallelIndex != -1 {
 			backendName := defaultBackendName
+			fullOutput := false
 			var extras []string

 			for i := 0; i < len(args); i++ {
@@ -182,6 +186,8 @@ func run() (exitCode int) {
 				switch {
 				case arg == "--parallel":
 					continue
+				case arg == "--full-output":
+					fullOutput = true
 				case arg == "--backend":
 					if i+1 >= len(args) {
 						fmt.Fprintln(os.Stderr, "ERROR: --backend flag requires a value")
@@ -202,11 +208,12 @@ func run() (exitCode int) {
 			}

 			if len(extras) > 0 {
-				fmt.Fprintln(os.Stderr, "ERROR: --parallel reads its task configuration from stdin; only --backend is allowed.")
+				fmt.Fprintln(os.Stderr, "ERROR: --parallel reads its task configuration from stdin; only --backend and --full-output are allowed.")
 				fmt.Fprintln(os.Stderr, "Usage examples:")
 				fmt.Fprintf(os.Stderr, "  %s --parallel < tasks.txt\n", name)
 				fmt.Fprintf(os.Stderr, "  echo '...' | %s --parallel\n", name)
 				fmt.Fprintf(os.Stderr, "  %s --parallel <<'EOF'\n", name)
+				fmt.Fprintf(os.Stderr, "  %s --parallel --full-output <<'EOF'  # include full task output\n", name)
 				return 1
 			}

@@ -244,7 +251,33 @@ func run() (exitCode int) {
 			}

 			results := executeConcurrent(layers, timeoutSec)
-			fmt.Println(generateFinalOutput(results))
+
+			// Extract structured report fields from each result
+			for i := range results {
+				results[i].CoverageTarget = defaultCoverageTarget
+				if results[i].Message == "" {
+					continue
+				}
+
+				lines := strings.Split(results[i].Message, "\n")
+
+				// Coverage extraction
+				results[i].Coverage = extractCoverageFromLines(lines)
+				results[i].CoverageNum = extractCoverageNum(results[i].Coverage)
+
+				// Files changed
+				results[i].FilesChanged = extractFilesChangedFromLines(lines)
+
+				// Test results
+				results[i].TestsPassed, results[i].TestsFailed = extractTestResultsFromLines(lines)
+
+				// Key output summary
+				results[i].KeyOutput = extractKeyOutputFromLines(lines, 150)
+			}
+
+			// Default: summary mode (context-efficient)
+			// --full-output: legacy full output mode
+			fmt.Println(generateFinalOutputWithMode(results, !fullOutput))

 			exitCode = 0
 			for _, res := range results {
@@ -447,16 +480,19 @@ Usage:
    %[1]s resume <session_id> "task" [workdir]
    %[1]s resume <session_id> - [workdir]
    %[1]s --parallel               Run tasks in parallel (config from stdin)
+    %[1]s --parallel --full-output Run tasks in parallel with full output (legacy)
    %[1]s --version
    %[1]s --help

 Parallel mode examples:
    %[1]s --parallel < tasks.txt
    echo '...' | %[1]s --parallel
+    %[1]s --parallel --full-output < tasks.txt
    %[1]s --parallel <<'EOF'

 Environment Variables:
-    CODEX_TIMEOUT  Timeout in milliseconds (default: 7200000)
+    CODEX_TIMEOUT         Timeout in milliseconds (default: 7200000)
+    CODEAGENT_ASCII_MODE  Set to "true" to use ASCII symbols (PASS/WARN/FAIL) instead of Unicode

 Exit Codes:
    0    Success
--- a/codeagent-wrapper/main_integration_test.go
+++ b/codeagent-wrapper/main_integration_test.go
@@ -46,10 +46,26 @@ func parseIntegrationOutput(t *testing.T, out string) integrationOutput {

 	lines := strings.Split(out, "\n")
 	var currentTask *TaskResult
+	inTaskResults := false

 	for _, line := range lines {
 		line = strings.TrimSpace(line)
-		if strings.HasPrefix(line, "Total:") {
+
+		// Parse new format header: "X tasks | Y passed | Z failed"
+		if strings.Contains(line, "tasks |") && strings.Contains(line, "passed |") {
+			parts := strings.Split(line, "|")
+			for _, p := range parts {
+				p = strings.TrimSpace(p)
+				if strings.HasSuffix(p, "tasks") {
+					fmt.Sscanf(p, "%d tasks", &payload.Summary.Total)
+				} else if strings.HasSuffix(p, "passed") {
+					fmt.Sscanf(p, "%d passed", &payload.Summary.Success)
+				} else if strings.HasSuffix(p, "failed") {
+					fmt.Sscanf(p, "%d failed", &payload.Summary.Failed)
+				}
+			}
+		} else if strings.HasPrefix(line, "Total:") {
+			// Legacy format: "Total: X | Success: Y | Failed: Z"
 			parts := strings.Split(line, "|")
 			for _, p := range parts {
 				p = strings.TrimSpace(p)
@@ -61,13 +77,72 @@ func parseIntegrationOutput(t *testing.T, out string) integrationOutput {
 					fmt.Sscanf(p, "Failed: %d", &payload.Summary.Failed)
 				}
 			}
+		} else if line == "## Task Results" {
+			inTaskResults = true
+		} else if line == "## Summary" {
+			// End of task results section
+			if currentTask != nil {
+				payload.Results = append(payload.Results, *currentTask)
+				currentTask = nil
+			}
+			inTaskResults = false
+		} else if inTaskResults && strings.HasPrefix(line, "### ") {
+			// New task: ### task-id ✓ 92% or ### task-id PASS 92% (ASCII mode)
+			if currentTask != nil {
+				payload.Results = append(payload.Results, *currentTask)
+			}
+			currentTask = &TaskResult{}
+
+			taskLine := strings.TrimPrefix(line, "### ")
+			success, warning, failed := getStatusSymbols()
+			// Parse different formats
+			if strings.Contains(taskLine, " "+success) {
+				parts := strings.Split(taskLine, " "+success)
+				currentTask.TaskID = strings.TrimSpace(parts[0])
+				currentTask.ExitCode = 0
+				// Extract coverage if present
+				if len(parts) > 1 {
+					coveragePart := strings.TrimSpace(parts[1])
+					if strings.HasSuffix(coveragePart, "%") {
+						currentTask.Coverage = coveragePart
+					}
+				}
+			} else if strings.Contains(taskLine, " "+warning) {
+				parts := strings.Split(taskLine, " "+warning)
+				currentTask.TaskID = strings.TrimSpace(parts[0])
+				currentTask.ExitCode = 0
+			} else if strings.Contains(taskLine, " "+failed) {
+				parts := strings.Split(taskLine, " "+failed)
+				currentTask.TaskID = strings.TrimSpace(parts[0])
+				currentTask.ExitCode = 1
+			} else {
+				currentTask.TaskID = taskLine
+			}
+		} else if currentTask != nil && inTaskResults {
+			// Parse task details
+			if strings.HasPrefix(line, "Exit code:") {
+				fmt.Sscanf(line, "Exit code: %d", &currentTask.ExitCode)
+			} else if strings.HasPrefix(line, "Error:") {
+				currentTask.Error = strings.TrimPrefix(line, "Error: ")
+			} else if strings.HasPrefix(line, "Log:") {
+				currentTask.LogPath = strings.TrimSpace(strings.TrimPrefix(line, "Log:"))
+			} else if strings.HasPrefix(line, "Did:") {
+				currentTask.KeyOutput = strings.TrimSpace(strings.TrimPrefix(line, "Did:"))
+			} else if strings.HasPrefix(line, "Detail:") {
+				// Error detail for failed tasks
+				if currentTask.Message == "" {
+					currentTask.Message = strings.TrimSpace(strings.TrimPrefix(line, "Detail:"))
+				}
+			}
 		} else if strings.HasPrefix(line, "--- Task:") {
+			// Legacy full output format
 			if currentTask != nil {
 				payload.Results = append(payload.Results, *currentTask)
 			}
 			currentTask = &TaskResult{}
 			currentTask.TaskID = strings.TrimSuffix(strings.TrimPrefix(line, "--- Task: "), " ---")
-		} else if currentTask != nil {
+		} else if currentTask != nil && !inTaskResults {
+			// Legacy format parsing
 			if strings.HasPrefix(line, "Status: SUCCESS") {
 				currentTask.ExitCode = 0
 			} else if strings.HasPrefix(line, "Status: FAILED") {
@@ -82,15 +157,11 @@ func parseIntegrationOutput(t *testing.T, out string) integrationOutput {
 				currentTask.SessionID = strings.TrimPrefix(line, "Session: ")
 			} else if strings.HasPrefix(line, "Log:") {
 				currentTask.LogPath = strings.TrimSpace(strings.TrimPrefix(line, "Log:"))
-			} else if line != "" && !strings.HasPrefix(line, "===") && !strings.HasPrefix(line, "---") {
-				if currentTask.Message != "" {
-					currentTask.Message += "\n"
-				}
-				currentTask.Message += line
 			}
 		}
 	}

+	// Handle last task
 	if currentTask != nil {
 		payload.Results = append(payload.Results, *currentTask)
 	}
@@ -343,9 +414,10 @@ task-beta`
 	}

 	for _, id := range []string{"alpha", "beta"} {
-		want := fmt.Sprintf("Log: %s", logPathFor(id))
-		if !strings.Contains(output, want) {
-			t.Fatalf("parallel output missing %q for %s:\n%s", want, id, output)
+		// Summary mode shows log paths in table format, not "Log: xxx"
+		logPath := logPathFor(id)
+		if !strings.Contains(output, logPath) {
+			t.Fatalf("parallel output missing log path %q for %s:\n%s", logPath, id, output)
 		}
 	}
 }
@@ -550,16 +622,16 @@ ok-e`
 	if resD.LogPath != logPathFor("D") || resE.LogPath != logPathFor("E") {
 		t.Fatalf("expected log paths for D/E, got D=%q E=%q", resD.LogPath, resE.LogPath)
 	}
+	// Summary mode shows log paths in table, verify they appear in output
 	for _, id := range []string{"A", "D", "E"} {
-		block := extractTaskBlock(t, output, id)
-		want := fmt.Sprintf("Log: %s", logPathFor(id))
-		if !strings.Contains(block, want) {
-			t.Fatalf("task %s block missing %q:\n%s", id, want, block)
+		logPath := logPathFor(id)
+		if !strings.Contains(output, logPath) {
+			t.Fatalf("task %s log path %q not found in output:\n%s", id, logPath, output)
 		}
 	}
-	blockB := extractTaskBlock(t, output, "B")
-	if strings.Contains(blockB, "Log:") {
-		t.Fatalf("skipped task B should not emit a log line:\n%s", blockB)
+	// Task B was skipped, so its parsed result should carry no log path
+	if resB.LogPath != "" {
+		t.Fatalf("skipped task B should have empty log path, got %q", resB.LogPath)
 	}
 }

--- a/codeagent-wrapper/main_test.go
+++ b/codeagent-wrapper/main_test.go
@@ -255,6 +255,10 @@ func (d *drainBlockingCmd) SetDir(dir string) {
 	d.inner.SetDir(dir)
 }

+func (d *drainBlockingCmd) SetEnv(env map[string]string) {
+	d.inner.SetEnv(env)
+}
+
 func (d *drainBlockingCmd) Process() processHandle {
 	return d.inner.Process()
 }
@@ -387,6 +391,8 @@ type fakeCmd struct {

 	stderr io.Writer

+	env map[string]string
+
 	waitDelay time.Duration
 	waitErr   error
 	startErr  error
@@ -511,6 +517,20 @@ func (f *fakeCmd) SetStderr(w io.Writer) {

 func (f *fakeCmd) SetDir(string) {}

+func (f *fakeCmd) SetEnv(env map[string]string) {
+	if len(env) == 0 {
+		return
+	}
+	f.mu.Lock()
+	defer f.mu.Unlock()
+	if f.env == nil {
+		f.env = make(map[string]string, len(env))
+	}
+	for k, v := range env {
+		f.env[k] = v
+	}
+}
+
 func (f *fakeCmd) Process() processHandle {
 	if f == nil {
 		return nil
@@ -879,6 +899,79 @@ func TestRunCodexTask_ContextTimeout(t *testing.T) {
 	}
 }

+func TestRunCodexTask_ForcesStopAfterCompletion(t *testing.T) {
+	defer resetTestHooks()
+	forceKillDelay.Store(0)
+
+	fake := newFakeCmd(fakeCmdConfig{
+		StdoutPlan: []fakeStdoutEvent{
+			{Data: `{"type":"item.completed","item":{"type":"agent_message","text":"done"}}` + "\n"},
+			{Data: `{"type":"thread.completed","thread_id":"tid"}` + "\n"},
+		},
+		KeepStdoutOpen:      true,
+		BlockWait:           true,
+		ReleaseWaitOnSignal: true,
+		ReleaseWaitOnKill:   true,
+	})
+
+	newCommandRunner = func(ctx context.Context, name string, args ...string) commandRunner {
+		return fake
+	}
+	buildCodexArgsFn = func(cfg *Config, targetArg string) []string { return []string{targetArg} }
+	codexCommand = "fake-cmd"
+
+	start := time.Now()
+	result := runCodexTaskWithContext(context.Background(), TaskSpec{Task: "done", WorkDir: defaultWorkdir}, nil, nil, false, false, 60)
+	duration := time.Since(start)
+
+	if result.ExitCode != 0 || result.Message != "done" {
+		t.Fatalf("unexpected result: %+v", result)
+	}
+	if duration > 2*time.Second {
+		t.Fatalf("runCodexTaskWithContext took too long: %v", duration)
+	}
+	if fake.process.SignalCount() == 0 {
+		t.Fatalf("expected SIGTERM to be sent, got %d", fake.process.SignalCount())
+	}
+}
+
+func TestRunCodexTask_DoesNotTerminateBeforeThreadCompleted(t *testing.T) {
+	defer resetTestHooks()
+	forceKillDelay.Store(0)
+
+	fake := newFakeCmd(fakeCmdConfig{
+		StdoutPlan: []fakeStdoutEvent{
+			{Data: `{"type":"item.completed","item":{"type":"agent_message","text":"intermediate"}}` + "\n"},
+			{Delay: 1100 * time.Millisecond, Data: `{"type":"item.completed","item":{"type":"agent_message","text":"final"}}` + "\n"},
+			{Data: `{"type":"thread.completed","thread_id":"tid"}` + "\n"},
+		},
+		KeepStdoutOpen:      true,
+		BlockWait:           true,
+		ReleaseWaitOnSignal: true,
+		ReleaseWaitOnKill:   true,
+	})
+
+	newCommandRunner = func(ctx context.Context, name string, args ...string) commandRunner {
+		return fake
+	}
+	buildCodexArgsFn = func(cfg *Config, targetArg string) []string { return []string{targetArg} }
+	codexCommand = "fake-cmd"
+
+	start := time.Now()
+	result := runCodexTaskWithContext(context.Background(), TaskSpec{Task: "done", WorkDir: defaultWorkdir}, nil, nil, false, false, 60)
+	duration := time.Since(start)
+
+	if result.ExitCode != 0 || result.Message != "final" {
+		t.Fatalf("unexpected result: %+v", result)
+	}
+	if duration > 5*time.Second {
+		t.Fatalf("runCodexTaskWithContext took too long: %v", duration)
+	}
+	if fake.process.SignalCount() == 0 {
+		t.Fatalf("expected SIGTERM to be sent, got %d", fake.process.SignalCount())
+	}
+}
+
 func TestBackendParseArgs_NewMode(t *testing.T) {
 	tests := []struct {
 		name    string
@@ -965,6 +1058,8 @@ func TestBackendParseArgs_ResumeMode(t *testing.T) {
 		},
 		{name: "resume missing session_id", args: []string{"codeagent-wrapper", "resume"}, wantErr: true},
 		{name: "resume missing task", args: []string{"codeagent-wrapper", "resume", "session-123"}, wantErr: true},
+		{name: "resume empty session_id", args: []string{"codeagent-wrapper", "resume", "", "task"}, wantErr: true},
+		{name: "resume whitespace session_id", args: []string{"codeagent-wrapper", "resume", "   ", "task"}, wantErr: true},
 	}

 	for _, tt := range tests {
@@ -1181,6 +1276,18 @@ do something`
 	}
 }

+func TestParallelParseConfig_EmptySessionID(t *testing.T) {
+	input := `---TASK---
+id: task-1
+session_id:
+---CONTENT---
+do something`
+
+	if _, err := parseParallelConfig([]byte(input)); err == nil {
+		t.Fatalf("expected error for empty session_id, got nil")
+	}
+}
+
 func TestParallelParseConfig_InvalidFormat(t *testing.T) {
 	if _, err := parseParallelConfig([]byte("invalid format")); err == nil {
 		t.Fatalf("expected error for invalid format, got nil")
@@ -1281,9 +1388,19 @@ func TestRunShouldUseStdin(t *testing.T) {
 }

 func TestRunBuildCodexArgs_NewMode(t *testing.T) {
+	const key = "CODEX_BYPASS_SANDBOX"
+	t.Cleanup(func() { os.Unsetenv(key) })
+	os.Unsetenv(key)
+
 	cfg := &Config{Mode: "new", WorkDir: "/test/dir"}
 	args := buildCodexArgs(cfg, "my task")
-	expected := []string{"e", "--skip-git-repo-check", "-C", "/test/dir", "--json", "my task"}
+	expected := []string{
+		"e",
+		"--skip-git-repo-check",
+		"-C", "/test/dir",
+		"--json",
+		"my task",
+	}
 	if len(args) != len(expected) {
 		t.Fatalf("len mismatch")
 	}
@@ -1295,9 +1412,20 @@ func TestRunBuildCodexArgs_NewMode(t *testing.T) {
 }

 func TestRunBuildCodexArgs_ResumeMode(t *testing.T) {
+	const key = "CODEX_BYPASS_SANDBOX"
+	t.Cleanup(func() { os.Unsetenv(key) })
+	os.Unsetenv(key)
+
 	cfg := &Config{Mode: "resume", SessionID: "session-abc"}
 	args := buildCodexArgs(cfg, "-")
-	expected := []string{"e", "--skip-git-repo-check", "--json", "resume", "session-abc", "-"}
+	expected := []string{
+		"e",
+		"--skip-git-repo-check",
+		"--json",
+		"resume",
+		"session-abc",
+		"-",
+	}
 	if len(args) != len(expected) {
 		t.Fatalf("len mismatch")
 	}
@@ -1308,6 +1436,61 @@ func TestRunBuildCodexArgs_ResumeMode(t *testing.T) {
 	}
 }

+func TestRunBuildCodexArgs_ResumeMode_EmptySessionHandledGracefully(t *testing.T) {
+	const key = "CODEX_BYPASS_SANDBOX"
+	t.Cleanup(func() { os.Unsetenv(key) })
+	os.Unsetenv(key)
+
+	cfg := &Config{Mode: "resume", SessionID: "   ", WorkDir: "/test/dir"}
+	args := buildCodexArgs(cfg, "task")
+	expected := []string{"e", "--skip-git-repo-check", "-C", "/test/dir", "--json", "task"}
+	if len(args) != len(expected) {
+		t.Fatalf("len mismatch")
+	}
+	for i := range args {
+		if args[i] != expected[i] {
+			t.Fatalf("args[%d]=%s, want %s", i, args[i], expected[i])
+		}
+	}
+}
+
+func TestRunBuildCodexArgs_BypassSandboxEnvTrue(t *testing.T) {
+	defer resetTestHooks()
+	tempDir := t.TempDir()
+	t.Setenv("TMPDIR", tempDir)
+
+	logger, err := NewLogger()
+	if err != nil {
+		t.Fatalf("NewLogger() error = %v", err)
+	}
+	setLogger(logger)
+	defer closeLogger()
+
+	t.Setenv("CODEX_BYPASS_SANDBOX", "true")
+
+	cfg := &Config{Mode: "new", WorkDir: "/test/dir"}
+	args := buildCodexArgs(cfg, "my task")
+	found := false
+	for _, arg := range args {
+		if arg == "--dangerously-bypass-approvals-and-sandbox" {
+			found = true
+			break
+		}
+	}
+	if !found {
+		t.Fatalf("expected bypass flag in args, got %v", args)
+	}
+
+	logger.Flush()
+	data, err := os.ReadFile(logger.Path())
+	if err != nil {
+		t.Fatalf("failed to read log file: %v", err)
+	}
+	if !strings.Contains(string(data), "CODEX_BYPASS_SANDBOX=true") {
+		t.Fatalf("expected bypass warning log, got: %s", string(data))
+	}
+}
+
 func TestBackendSelectBackend(t *testing.T) {
 	tests := []struct {
 		name string
@@ -1363,7 +1546,13 @@ func TestBackendBuildArgs_CodexBackend(t *testing.T) {
 	backend := CodexBackend{}
 	cfg := &Config{Mode: "new", WorkDir: "/test/dir"}
 	got := backend.BuildArgs(cfg, "task")
-	want := []string{"e", "--skip-git-repo-check", "-C", "/test/dir", "--json", "task"}
+	want := []string{
+		"e",
+		"--skip-git-repo-check",
+		"-C", "/test/dir",
+		"--json",
+		"task",
+	}
 	if len(got) != len(want) {
 		t.Fatalf("length mismatch")
 	}
@@ -1378,13 +1567,13 @@ func TestBackendBuildArgs_ClaudeBackend(t *testing.T) {
 	backend := ClaudeBackend{}
 	cfg := &Config{Mode: "new", WorkDir: defaultWorkdir}
 	got := backend.BuildArgs(cfg, "todo")
-	want := []string{"-p", "--dangerously-skip-permissions", "--setting-sources", "", "--output-format", "stream-json", "--verbose", "todo"}
+	want := []string{"-p", "--setting-sources", "", "--output-format", "stream-json", "--verbose", "todo"}
 	if len(got) != len(want) {
-		t.Fatalf("length mismatch")
+		t.Fatalf("args length=%d, want %d: %v", len(got), len(want), got)
 	}
 	for i := range want {
 		if got[i] != want[i] {
-			t.Fatalf("index %d got %s want %s", i, got[i], want[i])
+			t.Fatalf("index %d got %q want %q (args=%v)", i, got[i], want[i], got)
 		}
 	}

@@ -1399,19 +1588,15 @@ func TestClaudeBackendBuildArgs_OutputValidation(t *testing.T) {
 	target := "ensure-flags"

 	args := backend.BuildArgs(cfg, target)
-	expectedPrefix := []string{"-p", "--dangerously-skip-permissions", "--setting-sources", "", "--output-format", "stream-json", "--verbose"}
-
-	if len(args) != len(expectedPrefix)+1 {
-		t.Fatalf("args length=%d, want %d", len(args), len(expectedPrefix)+1)
+	want := []string{"-p", "--setting-sources", "", "--output-format", "stream-json", "--verbose", target}
+	if len(args) != len(want) {
+		t.Fatalf("args length=%d, want %d: %v", len(args), len(want), args)
 	}
-	for i, val := range expectedPrefix {
-		if args[i] != val {
-			t.Fatalf("args[%d]=%q, want %q", i, args[i], val)
+	for i := range want {
+		if args[i] != want[i] {
+			t.Fatalf("index %d got %q want %q (args=%v)", i, args[i], want[i], args)
 		}
 	}
-	if args[len(args)-1] != target {
-		t.Fatalf("last arg=%q, want target %q", args[len(args)-1], target)
-	}
 }

 func TestBackendBuildArgs_GeminiBackend(t *testing.T) {
@@ -1650,7 +1835,7 @@ func TestBackendParseJSONStream_GeminiEvents_OnMessageTriggeredOnStatus(t *testi
 	var called int
 	message, threadID := parseJSONStreamInternal(strings.NewReader(input), nil, nil, func() {
 		called++
-	})
+	}, nil)

 	if message != "Hi there" {
 		t.Fatalf("message=%q, want %q", message, "Hi there")
@@ -1679,7 +1864,7 @@ func TestBackendParseJSONStream_OnMessage(t *testing.T) {
 	var called int
 	message, threadID := parseJSONStreamInternal(strings.NewReader(`{"type":"item.completed","item":{"type":"agent_message","text":"hook"}}`), nil, nil, func() {
 		called++
-	})
+	}, nil)
 	if message != "hook" {
 		t.Fatalf("message = %q, want hook", message)
 	}
@@ -1691,10 +1876,86 @@ func TestBackendParseJSONStream_OnMessage(t *testing.T) {
 	}
 }

+func TestBackendParseJSONStream_OnComplete_CodexThreadCompleted(t *testing.T) {
+	input := `{"type":"item.completed","item":{"type":"agent_message","text":"first"}}` + "\n" +
+		`{"type":"item.completed","item":{"type":"agent_message","text":"second"}}` + "\n" +
+		`{"type":"thread.completed","thread_id":"t-1"}`
+
+	var onMessageCalls int
+	var onCompleteCalls int
+	message, threadID := parseJSONStreamInternal(strings.NewReader(input), nil, nil, func() {
+		onMessageCalls++
+	}, func() {
+		onCompleteCalls++
+	})
+	if message != "second" {
+		t.Fatalf("message = %q, want second", message)
+	}
+	if threadID != "t-1" {
+		t.Fatalf("threadID = %q, want t-1", threadID)
+	}
+	if onMessageCalls != 2 {
+		t.Fatalf("onMessage calls = %d, want 2", onMessageCalls)
+	}
+	if onCompleteCalls != 1 {
+		t.Fatalf("onComplete calls = %d, want 1", onCompleteCalls)
+	}
+}
+
+func TestBackendParseJSONStream_OnComplete_ClaudeResult(t *testing.T) {
+	input := `{"type":"message","subtype":"stream","session_id":"s-1"}` + "\n" +
+		`{"type":"result","result":"OK","session_id":"s-1"}`
+
+	var onMessageCalls int
+	var onCompleteCalls int
+	message, threadID := parseJSONStreamInternal(strings.NewReader(input), nil, nil, func() {
+		onMessageCalls++
+	}, func() {
+		onCompleteCalls++
+	})
+	if message != "OK" {
+		t.Fatalf("message = %q, want OK", message)
+	}
+	if threadID != "s-1" {
+		t.Fatalf("threadID = %q, want s-1", threadID)
+	}
+	if onMessageCalls != 1 {
+		t.Fatalf("onMessage calls = %d, want 1", onMessageCalls)
+	}
+	if onCompleteCalls != 1 {
+		t.Fatalf("onComplete calls = %d, want 1", onCompleteCalls)
+	}
+}
+
+func TestBackendParseJSONStream_OnComplete_GeminiTerminalResultStatus(t *testing.T) {
+	input := `{"type":"message","role":"assistant","content":"Hi","delta":true,"session_id":"g-1"}` + "\n" +
+		`{"type":"result","status":"success","session_id":"g-1"}`
+
+	var onMessageCalls int
+	var onCompleteCalls int
+	message, threadID := parseJSONStreamInternal(strings.NewReader(input), nil, nil, func() {
+		onMessageCalls++
+	}, func() {
+		onCompleteCalls++
+	})
+	if message != "Hi" {
+		t.Fatalf("message = %q, want Hi", message)
+	}
+	if threadID != "g-1" {
+		t.Fatalf("threadID = %q, want g-1", threadID)
+	}
+	if onMessageCalls != 1 {
+		t.Fatalf("onMessage calls = %d, want 1", onMessageCalls)
+	}
+	if onCompleteCalls != 1 {
+		t.Fatalf("onComplete calls = %d, want 1", onCompleteCalls)
+	}
+}
+
 func TestBackendParseJSONStream_ScannerError(t *testing.T) {
 	var warnings []string
 	warnFn := func(msg string) { warnings = append(warnings, msg) }
-	message, threadID := parseJSONStreamInternal(errReader{err: errors.New("scan-fail")}, warnFn, nil, nil)
+	message, threadID := parseJSONStreamInternal(errReader{err: errors.New("scan-fail")}, warnFn, nil, nil, nil)
 	if message != "" || threadID != "" {
 		t.Fatalf("expected empty output on scanner error, got message=%q threadID=%q", message, threadID)
 	}
@@ -2372,14 +2633,17 @@ func TestRunGenerateFinalOutput(t *testing.T) {
 	if out == "" {
 		t.Fatalf("generateFinalOutput() returned empty string")
 	}
-	if !strings.Contains(out, "Total: 3") || !strings.Contains(out, "Success: 2") || !strings.Contains(out, "Failed: 1") {
+	// New format: "X tasks | Y passed | Z failed"
+	if !strings.Contains(out, "3 tasks") || !strings.Contains(out, "2 passed") || !strings.Contains(out, "1 failed") {
 		t.Fatalf("summary missing, got %q", out)
 	}
-	if !strings.Contains(out, "Task: a") || !strings.Contains(out, "Task: b") {
-		t.Fatalf("task entries missing")
+	// New format uses ### task-id for each task
+	if !strings.Contains(out, "### a") || !strings.Contains(out, "### b") {
+		t.Fatalf("task entries missing in structured format")
 	}
-	if strings.Contains(out, "Log:") {
-		t.Fatalf("unexpected log line when LogPath empty, got %q", out)
+	// Should have Summary section
+	if !strings.Contains(out, "## Summary") {
+		t.Fatalf("Summary section missing, got %q", out)
 	}
 }

@@ -2399,12 +2663,18 @@ func TestRunGenerateFinalOutput_LogPath(t *testing.T) {
 			LogPath:  "/tmp/log-b",
 		},
 	}
+	// Test summary mode (default) - should contain log paths
 	out := generateFinalOutput(results)
-	if !strings.Contains(out, "Session: sid\nLog: /tmp/log-a") {
-		t.Fatalf("output missing log line after session: %q", out)
+	if !strings.Contains(out, "/tmp/log-b") {
+		t.Fatalf("summary output missing log path for failed task: %q", out)
+	}
+	// Test full output mode - shows Session: and Log: lines
+	out = generateFinalOutputWithMode(results, false)
+	if !strings.Contains(out, "Session: sid") || !strings.Contains(out, "Log: /tmp/log-a") {
+		t.Fatalf("full output missing log line after session: %q", out)
 	}
 	if !strings.Contains(out, "Log: /tmp/log-b") {
-		t.Fatalf("output missing log line for failed task: %q", out)
+		t.Fatalf("full output missing log line for failed task: %q", out)
 	}
 }

@@ -2702,6 +2972,46 @@ test`
 	}
 }

+func TestRunParallelWithFullOutput(t *testing.T) {
+	defer resetTestHooks()
+	cleanupLogsFn = func() (CleanupStats, error) { return CleanupStats{}, nil }
+
+	oldArgs := os.Args
+	t.Cleanup(func() { os.Args = oldArgs })
+	os.Args = []string{"codeagent-wrapper", "--parallel", "--full-output"}
+
+	stdinReader = strings.NewReader(`---TASK---
+id: T1
+---CONTENT---
+noop`)
+	t.Cleanup(func() { stdinReader = os.Stdin })
+
+	orig := runCodexTaskFn
+	runCodexTaskFn = func(task TaskSpec, timeout int) TaskResult {
+		return TaskResult{TaskID: task.ID, ExitCode: 0, Message: "full output marker"}
+	}
+	t.Cleanup(func() { runCodexTaskFn = orig })
+
+	out := captureOutput(t, func() {
+		if code := run(); code != 0 {
+			t.Fatalf("run exit = %d, want 0", code)
+		}
+	})
+
+	if !strings.Contains(out, "=== Parallel Execution Summary ===") {
+		t.Fatalf("output missing full-output header, got %q", out)
+	}
+	if !strings.Contains(out, "--- Task: T1 ---") {
+		t.Fatalf("output missing task block, got %q", out)
+	}
+	if !strings.Contains(out, "full output marker") {
+		t.Fatalf("output missing task message, got %q", out)
+	}
+	if strings.Contains(out, "=== Execution Report ===") {
+		t.Fatalf("output should not include summary-only header, got %q", out)
+	}
+}
+
 func TestParallelInvalidBackend(t *testing.T) {
 	defer resetTestHooks()
 	cleanupLogsFn = func() (CleanupStats, error) { return CleanupStats{}, nil }
@@ -2756,7 +3066,9 @@ func TestVersionFlag(t *testing.T) {
 			t.Errorf("exit = %d, want 0", code)
 		}
 	})
-	want := "codeagent-wrapper version 5.2.5\n"
+
+	want := "codeagent-wrapper version 5.4.0\n"
+
 	if output != want {
 		t.Fatalf("output = %q, want %q", output, want)
 	}
@@ -2770,7 +3082,9 @@ func TestVersionShortFlag(t *testing.T) {
 			t.Errorf("exit = %d, want 0", code)
 		}
 	})
-	want := "codeagent-wrapper version 5.2.5\n"
+
+	want := "codeagent-wrapper version 5.4.0\n"
+
 	if output != want {
 		t.Fatalf("output = %q, want %q", output, want)
 	}
@@ -2784,7 +3098,9 @@ func TestVersionLegacyAlias(t *testing.T) {
 			t.Errorf("exit = %d, want 0", code)
 		}
 	})
-	want := "codex-wrapper version 5.2.5\n"
+
+	want := "codex-wrapper version 5.4.0\n"
+
 	if output != want {
 		t.Fatalf("output = %q, want %q", output, want)
 	}
--- a/codeagent-wrapper/parser.go
+++ b/codeagent-wrapper/parser.go
@@ -50,7 +50,7 @@ func parseJSONStreamWithWarn(r io.Reader, warnFn func(string)) (message, threadI
 }

 func parseJSONStreamWithLog(r io.Reader, warnFn func(string), infoFn func(string)) (message, threadID string) {
-	return parseJSONStreamInternal(r, warnFn, infoFn, nil)
+	return parseJSONStreamInternal(r, warnFn, infoFn, nil, nil)
 }

 const (
@@ -95,7 +95,7 @@ type ItemContent struct {
 	Text interface{} `json:"text"`
 }

-func parseJSONStreamInternal(r io.Reader, warnFn func(string), infoFn func(string), onMessage func()) (message, threadID string) {
+func parseJSONStreamInternal(r io.Reader, warnFn func(string), infoFn func(string), onMessage func(), onComplete func()) (message, threadID string) {
 	reader := bufio.NewReaderSize(r, jsonLineReaderSize)

 	if warnFn == nil {
@@ -111,6 +111,12 @@ func parseJSONStreamInternal(r io.Reader, warnFn func(string), infoFn func(strin
 		}
 	}

+	notifyComplete := func() {
+		if onComplete != nil {
+			onComplete()
+		}
+	}
+
 	totalEvents := 0

 	var (
@@ -158,6 +164,9 @@ func parseJSONStreamInternal(r io.Reader, warnFn func(string), infoFn func(strin
 			}
 		}
 		isClaude := event.Subtype != "" || event.Result != ""
+		if !isClaude && event.Type == "result" && event.SessionID != "" && event.Status == "" {
+			isClaude = true
+		}
 		isGemini := event.Role != "" || event.Delta != nil || event.Status != ""

 		// Handle Codex events
@@ -178,6 +187,13 @@ func parseJSONStreamInternal(r io.Reader, warnFn func(string), infoFn func(strin
 				threadID = event.ThreadID
 				infoFn(fmt.Sprintf("thread.started event thread_id=%s", threadID))

+			case "thread.completed":
+				if event.ThreadID != "" && threadID == "" {
+					threadID = event.ThreadID
+				}
+				infoFn(fmt.Sprintf("thread.completed event thread_id=%s", event.ThreadID))
+				notifyComplete()
+
 			case "item.completed":
 				var itemType string
 				if len(event.Item) > 0 {
@@ -221,6 +237,10 @@ func parseJSONStreamInternal(r io.Reader, warnFn func(string), infoFn func(strin
 				claudeMessage = event.Result
 				notifyMessage()
 			}
+
+			if event.Type == "result" {
+				notifyComplete()
+			}
 			continue
 		}

@@ -236,6 +256,10 @@ func parseJSONStreamInternal(r io.Reader, warnFn func(string), infoFn func(strin

 			if event.Status != "" {
 				notifyMessage()
+
+				if event.Type == "result" && (event.Status == "success" || event.Status == "error" || event.Status == "complete" || event.Status == "failed") {
+					notifyComplete()
+				}
 			}

 			delta := false
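
The `onComplete` hook that this change threads through `parseJSONStreamInternal` can be illustrated in isolation. What follows is a minimal sketch, not the wrapper's actual parser: `streamEvent`, `terminalStatus`, and `scanEvents` are hypothetical names, the event shape is trimmed to three fields, and the separate Claude and Gemini `result` branches are collapsed into one simplified rule (an empty status or one of the four terminal statuses counts as completion).

```go
package main

import (
	"encoding/json"
	"strings"
)

// streamEvent is a hypothetical, trimmed-down event shape used only here.
type streamEvent struct {
	Type     string `json:"type"`
	ThreadID string `json:"thread_id"`
	Status   string `json:"status"`
}

// terminalStatus lists the statuses the diff treats as terminal for
// Gemini-style "result" events.
var terminalStatus = map[string]bool{
	"success": true, "error": true, "complete": true, "failed": true,
}

// scanEvents walks newline-delimited JSON and invokes onComplete once per
// terminal event. A nil callback is allowed and ignored, mirroring the
// nil-guarded notifyComplete closure in the diff.
func scanEvents(input string, onComplete func()) (threadID string) {
	notifyComplete := func() {
		if onComplete != nil {
			onComplete()
		}
	}

	for _, line := range strings.Split(input, "\n") {
		var ev streamEvent
		if err := json.Unmarshal([]byte(line), &ev); err != nil {
			continue // this sketch silently skips malformed or empty lines
		}
		switch ev.Type {
		case "thread.completed":
			// Only fill in the thread ID if no earlier event provided one.
			if ev.ThreadID != "" && threadID == "" {
				threadID = ev.ThreadID
			}
			notifyComplete()
		case "result":
			// Simplification: the real parser distinguishes backends first.
			if ev.Status == "" || terminalStatus[ev.Status] {
				notifyComplete()
			}
		}
	}
	return threadID
}
```

Wrapping the callback in a nil-guarded closure lets existing call sites pass `nil`, which is how the updated tests and `parseJSONStreamWithLog` call the five-argument function.
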
--- a/codeagent-wrapper/parser_token_too_long_test.go
+++ b/codeagent-wrapper/parser_token_too_long_test.go
@@ -18,7 +18,7 @@ func TestParseJSONStream_SkipsOverlongLineAndContinues(t *testing.T) {
 	var warns []string
 	warnFn := func(msg string) { warns = append(warns, msg) }

-	gotMessage, gotThreadID := parseJSONStreamInternal(strings.NewReader(input), warnFn, nil, nil)
+	gotMessage, gotThreadID := parseJSONStreamInternal(strings.NewReader(input), warnFn, nil, nil, nil)
 	if gotMessage != "ok" {
 		t.Fatalf("message=%q, want %q (warns=%v)", gotMessage, "ok", warns)
 	}
--- a/codeagent-wrapper/utils.go
+++ b/codeagent-wrapper/utils.go
@@ -75,9 +75,9 @@ func getEnv(key, defaultValue string) string {
 }

 type logWriter struct {
-	prefix string
-	maxLen int
-	buf    bytes.Buffer
+	prefix  string
+	maxLen  int
+	buf     bytes.Buffer
 	dropped bool
 }

@@ -205,6 +205,55 @@ func truncate(s string, maxLen int) string {
 	return s[:maxLen] + "..."
 }

+// safeTruncate safely truncates string to maxLen, avoiding panic and UTF-8 corruption.
+func safeTruncate(s string, maxLen int) string {
+	if maxLen <= 0 || s == "" {
+		return ""
+	}
+
+	runes := []rune(s)
+	if len(runes) <= maxLen {
+		return s
+	}
+
+	if maxLen < 4 {
+		return string(runes[:1])
+	}
+
+	cutoff := maxLen - 3
+	if cutoff <= 0 {
+		return string(runes[:1])
+	}
+	if len(runes) <= cutoff {
+		return s
+	}
+	return string(runes[:cutoff]) + "..."
+}
+
+// sanitizeOutput removes ANSI escape sequences and control characters.
+func sanitizeOutput(s string) string {
+	var result strings.Builder
+	inEscape := false
+	for i := 0; i < len(s); i++ {
+		if s[i] == '\x1b' && i+1 < len(s) && s[i+1] == '[' {
+			inEscape = true
+			i++ // skip '['
+			continue
+		}
+		if inEscape {
+			if (s[i] >= 'A' && s[i] <= 'Z') || (s[i] >= 'a' && s[i] <= 'z') {
+				inEscape = false
+			}
+			continue
+		}
+		// Keep printable chars and common whitespace.
+		if s[i] >= 32 || s[i] == '\n' || s[i] == '\t' {
+			result.WriteByte(s[i])
+		}
+	}
+	return result.String()
+}
+
 func min(a, b int) int {
 	if a < b {
 		return a
@@ -223,3 +272,444 @@ func greet(name string) string {
 func farewell(name string) string {
 	return "goodbye " + name
 }
+
+// extractMessageSummary extracts a brief summary from task output
+// Returns first meaningful line or truncated content up to maxLen chars
+func extractMessageSummary(message string, maxLen int) string {
+	if message == "" || maxLen <= 0 {
+		return ""
+	}
+
+	// Try to find a meaningful summary line
+	lines := strings.Split(message, "\n")
+	for _, line := range lines {
+		line = strings.TrimSpace(line)
+		// Skip empty lines and common noise
+		if line == "" || strings.HasPrefix(line, "```") || strings.HasPrefix(line, "---") {
+			continue
+		}
+		// Found a meaningful line
+		return safeTruncate(line, maxLen)
+	}
+
+	// Fallback: truncate entire message
+	clean := strings.TrimSpace(message)
+	return safeTruncate(clean, maxLen)
+}
+
+// extractCoverageFromLines extracts coverage from pre-split lines.
+func extractCoverageFromLines(lines []string) string {
+	if len(lines) == 0 {
+		return ""
+	}
+
+	end := len(lines)
+	for end > 0 && strings.TrimSpace(lines[end-1]) == "" {
+		end--
+	}
+
+	if end == 1 {
+		trimmed := strings.TrimSpace(lines[0])
+		if strings.HasSuffix(trimmed, "%") {
+			if num, err := strconv.ParseFloat(strings.TrimSuffix(trimmed, "%"), 64); err == nil && num >= 0 && num <= 100 {
+				return trimmed
+			}
+		}
+	}
+
+	coverageKeywords := []string{"file", "stmt", "branch", "line", "coverage", "total"}
+
+	for _, line := range lines[:end] {
+		lower := strings.ToLower(line)
+
+		hasKeyword := false
+		tokens := strings.FieldsFunc(lower, func(r rune) bool { return r < 'a' || r > 'z' })
+		for _, token := range tokens {
+			for _, kw := range coverageKeywords {
+				if strings.HasPrefix(token, kw) {
+					hasKeyword = true
+					break
+				}
+			}
+			if hasKeyword {
+				break
+			}
+		}
+		if !hasKeyword {
+			continue
+		}
+		if !strings.Contains(line, "%") {
+			continue
+		}
+
+		// Extract percentage pattern: number followed by %
+		for i := 0; i < len(line); i++ {
+			if line[i] == '%' && i > 0 {
+				// Walk back to find the number
+				j := i - 1
+				for j >= 0 && (line[j] == '.' || (line[j] >= '0' && line[j] <= '9')) {
+					j--
+				}
+				if j < i-1 {
+					numStr := line[j+1 : i]
+					// Validate it's a reasonable percentage
+					if num, err := strconv.ParseFloat(numStr, 64); err == nil && num >= 0 && num <= 100 {
+						return numStr + "%"
+					}
+				}
+			}
+		}
+	}
+
+	return ""
+}
+
+// extractCoverage extracts coverage percentage from task output
+// Supports common formats: "Coverage: 92%", "92% coverage", "coverage 92%", "TOTAL 92%"
+func extractCoverage(message string) string {
+	if message == "" {
+		return ""
+	}
+
+	return extractCoverageFromLines(strings.Split(message, "\n"))
+}
+
+// extractCoverageNum extracts coverage as a numeric value for comparison
+func extractCoverageNum(coverage string) float64 {
+	if coverage == "" {
+		return 0
+	}
+	// Remove % sign and parse
+	numStr := strings.TrimSuffix(coverage, "%")
+	if num, err := strconv.ParseFloat(numStr, 64); err == nil {
+		return num
+	}
+	return 0
+}
+
+// extractFilesChangedFromLines extracts files from pre-split lines.
+func extractFilesChangedFromLines(lines []string) []string {
+	if len(lines) == 0 {
+		return nil
+	}
+
+	var files []string
+	seen := make(map[string]bool)
+	exts := []string{".ts", ".tsx", ".js", ".jsx", ".go", ".py", ".rs", ".java", ".vue", ".css", ".scss", ".md", ".json", ".yaml", ".yml", ".toml"}
+
+	for _, line := range lines {
+		line = strings.TrimSpace(line)
+
+		// Pattern 1: "Modified: path/to/file.ts" or "Created: path/to/file.ts"
+		matchedPrefix := false
+		for _, prefix := range []string{"Modified:", "Created:", "Updated:", "Edited:", "Wrote:", "Changed:"} {
+			if strings.HasPrefix(line, prefix) {
+				file := strings.TrimSpace(strings.TrimPrefix(line, prefix))
+				file = strings.Trim(file, "`,\"'()[],:")
+				file = strings.TrimPrefix(file, "@")
+				if file != "" && !seen[file] {
+					files = append(files, file)
+					seen[file] = true
+				}
+				matchedPrefix = true
+				break
+			}
+		}
+		if matchedPrefix {
+			continue
+		}
+
+		// Pattern 2: Tokens that look like file paths (allow root files, strip @ prefix).
+		parts := strings.Fields(line)
+		for _, part := range parts {
+			part = strings.Trim(part, "`,\"'()[],:")
+			part = strings.TrimPrefix(part, "@")
+			for _, ext := range exts {
+				if strings.HasSuffix(part, ext) && !seen[part] {
+					files = append(files, part)
+					seen[part] = true
+					break
+				}
+			}
+		}
+	}
+
+	// Limit to first 10 files to avoid bloat
+	if len(files) > 10 {
+		files = files[:10]
+	}
+
+	return files
+}
+
+// extractFilesChanged extracts list of changed files from task output
+// Looks for common patterns like "Modified: file.ts", "Created: file.ts", file paths in output
+func extractFilesChanged(message string) []string {
+	if message == "" {
+		return nil
+	}
+
+	return extractFilesChangedFromLines(strings.Split(message, "\n"))
+}
+
+// extractTestResultsFromLines extracts test results from pre-split lines.
+func extractTestResultsFromLines(lines []string) (passed, failed int) {
+	if len(lines) == 0 {
+		return 0, 0
+	}
+
+	// Common patterns:
+	// pytest: "12 passed, 2 failed"
+	// jest: "Tests: 2 failed, 12 passed"
+	// go: "ok ... 12 tests"
+
+	for _, line := range lines {
+		line = strings.ToLower(line)
+
+		// Look for test result lines
+		if !strings.Contains(line, "pass") && !strings.Contains(line, "fail") && !strings.Contains(line, "test") {
+			continue
+		}
+
+		// Extract numbers near "passed" or "pass"
+		if idx := strings.Index(line, "pass"); idx != -1 {
+			// Look for number before "pass"
+			num := extractNumberBefore(line, idx)
+			if num > 0 {
+				passed = num
+			}
+		}
+
+		// Extract numbers near "failed" or "fail"
+		if idx := strings.Index(line, "fail"); idx != -1 {
+			num := extractNumberBefore(line, idx)
+			if num > 0 {
+				failed = num
+			}
+		}
+
+		// go test style: "ok ... 12 tests"
+		if passed == 0 {
+			if idx := strings.Index(line, "test"); idx != -1 {
+				num := extractNumberBefore(line, idx)
+				if num > 0 {
+					passed = num
+				}
+			}
+		}
+
+		// If we found both, stop
+		if passed > 0 && failed > 0 {
+			break
+		}
+	}
+
+	return passed, failed
+}
+
+// extractTestResults extracts test pass/fail counts from task output
+func extractTestResults(message string) (passed, failed int) {
+	if message == "" {
+		return 0, 0
+	}
+
+	return extractTestResultsFromLines(strings.Split(message, "\n"))
+}
+
+// extractNumberBefore extracts a number that appears before the given index
+func extractNumberBefore(s string, idx int) int {
+	if idx <= 0 {
+		return 0
+	}
+
+	// Walk backwards to find digits
+	end := idx - 1
+	for end >= 0 && (s[end] == ' ' || s[end] == ':' || s[end] == ',') {
+		end--
+	}
+	if end < 0 {
+		return 0
+	}
+
+	start := end
+	for start >= 0 && s[start] >= '0' && s[start] <= '9' {
+		start--
+	}
+	start++
+
+	if start > end {
+		return 0
+	}
+
+	numStr := s[start : end+1]
+	if num, err := strconv.Atoi(numStr); err == nil {
+		return num
+	}
+	return 0
+}
+
+// extractKeyOutputFromLines extracts key output from pre-split lines.
+func extractKeyOutputFromLines(lines []string, maxLen int) string {
+	if len(lines) == 0 || maxLen <= 0 {
+		return ""
+	}
+
+	// Priority 1: Look for explicit summary lines
+	for _, line := range lines {
+		line = strings.TrimSpace(line)
+		lower := strings.ToLower(line)
+		if strings.HasPrefix(lower, "summary:") || strings.HasPrefix(lower, "completed:") ||
+			strings.HasPrefix(lower, "implemented:") || strings.HasPrefix(lower, "added:") ||
+			strings.HasPrefix(lower, "created:") || strings.HasPrefix(lower, "fixed:") {
+			content := line
+			for _, prefix := range []string{"Summary:", "Completed:", "Implemented:", "Added:", "Created:", "Fixed:",
+				"summary:", "completed:", "implemented:", "added:", "created:", "fixed:"} {
+				content = strings.TrimPrefix(content, prefix)
+			}
+			content = strings.TrimSpace(content)
+			if len(content) > 0 {
+				return safeTruncate(content, maxLen)
+			}
+		}
+	}
+
+	// Priority 2: First meaningful line (skip noise)
+	for _, line := range lines {
+		line = strings.TrimSpace(line)
+		if line == "" || strings.HasPrefix(line, "```") || strings.HasPrefix(line, "---") ||
+			strings.HasPrefix(line, "#") || strings.HasPrefix(line, "//") {
+			continue
+		}
+		// Skip very short lines (likely headers or markers)
+		if len(line) < 20 {
+			continue
+		}
+		return safeTruncate(line, maxLen)
+	}
+
+	// Fallback: truncate entire message
+	clean := strings.TrimSpace(strings.Join(lines, "\n"))
+	return safeTruncate(clean, maxLen)
+}
+
+// extractKeyOutput extracts a brief summary of what the task accomplished
+// Looks for summary lines, first meaningful sentence, or truncates message
+func extractKeyOutput(message string, maxLen int) string {
+	if message == "" || maxLen <= 0 {
+		return ""
+	}
+	return extractKeyOutputFromLines(strings.Split(message, "\n"), maxLen)
+}
+
+// extractCoverageGap extracts what's missing from coverage reports
+// Looks for uncovered lines, branches, or functions
+func extractCoverageGap(message string) string {
+	if message == "" {
+		return ""
+	}
+
+	lower := strings.ToLower(message)
+	lines := strings.Split(message, "\n")
+
+	// Look for uncovered/missing patterns
+	for _, line := range lines {
+		lineLower := strings.ToLower(line)
+		line = strings.TrimSpace(line)
+
+		// Common patterns for uncovered code
+		if strings.Contains(lineLower, "uncovered") ||
+			strings.Contains(lineLower, "not covered") ||
+			strings.Contains(lineLower, "missing coverage") ||
+			strings.Contains(lineLower, "lines not covered") {
+			if len(line) > 100 {
+				return safeTruncate(line, 100)
+			}
+			return line
+		}
+
+		// Look for specific file:line patterns in coverage reports
+		if strings.Contains(lineLower, "branch") && strings.Contains(lineLower, "not taken") {
+			if len(line) > 100 {
+				return safeTruncate(line, 100)
+			}
+			return line
+		}
+	}
+
+	// Look for function names that aren't covered
+	if strings.Contains(lower, "function") && strings.Contains(lower, "0%") {
+		for _, line := range lines {
+			if lineLower := strings.ToLower(line); strings.Contains(lineLower, "0%") && strings.Contains(lineLower, "function") {
+				line = strings.TrimSpace(line)
+				if len(line) > 100 {
+					return safeTruncate(line, 100)
+				}
+				return line
+			}
+		}
+	}
+
+	return ""
+}
+
+// extractErrorDetail extracts meaningful error context from task output
+// Returns the most relevant error information up to maxLen characters
+func extractErrorDetail(message string, maxLen int) string {
+	if message == "" || maxLen <= 0 {
+		return ""
+	}
+
+	lines := strings.Split(message, "\n")
+	var errorLines []string
+
+	// Look for error-related lines
+	for _, line := range lines {
+		line = strings.TrimSpace(line)
+		if line == "" {
+			continue
+		}
+
+		lower := strings.ToLower(line)
+
+		// Skip noise lines
+		if strings.HasPrefix(line, "at ") && strings.Contains(line, "(") {
+			// Stack trace line - only keep first one
+			if len(errorLines) > 0 && strings.HasPrefix(strings.ToLower(errorLines[len(errorLines)-1]), "at ") {
+				continue
+			}
+		}
+
+		// Prioritize error/fail lines
+		if strings.Contains(lower, "error") ||
+			strings.Contains(lower, "fail") ||
+			strings.Contains(lower, "exception") ||
+			strings.Contains(lower, "assert") ||
+			strings.Contains(lower, "expected") ||
+			strings.Contains(lower, "timeout") ||
+			strings.Contains(lower, "not found") ||
+			strings.Contains(lower, "cannot") ||
+			strings.Contains(lower, "undefined") ||
+			strings.HasPrefix(line, "FAIL") ||
+			strings.HasPrefix(line, "●") {
+			errorLines = append(errorLines, line)
+		}
+	}
+
+	if len(errorLines) == 0 {
+		// No specific error lines found, take last few lines
+		start := len(lines) - 5
+		if start < 0 {
+			start = 0
+		}
+		for _, line := range lines[start:] {
+			line = strings.TrimSpace(line)
+			if line != "" {
+				errorLines = append(errorLines, line)
+			}
+		}
+	}
+
+	// Join and truncate
+	result := strings.Join(errorLines, " | ")
+	return safeTruncate(result, maxLen)
+}
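
The `safeTruncate` helper above is documented as avoiding UTF-8 corruption. The sketch below shows, under simplified assumptions, why truncating at a rune boundary matters. The names `byteTruncate`, `runeTruncate`, and `isValidUTF8` are hypothetical and exist only for this illustration; `runeTruncate` omits the ellipsis handling that `safeTruncate` performs.

```go
package main

import "unicode/utf8"

// byteTruncate cuts at a byte offset, so it may split a multi-byte rune.
func byteTruncate(s string, maxBytes int) string {
	if maxBytes < 0 || len(s) <= maxBytes {
		return s
	}
	return s[:maxBytes]
}

// runeTruncate cuts at a rune boundary, so the result stays valid UTF-8
// whenever the input was valid.
func runeTruncate(s string, maxRunes int) string {
	if maxRunes <= 0 {
		return ""
	}
	runes := []rune(s)
	if len(runes) <= maxRunes {
		return s
	}
	return string(runes[:maxRunes])
}

// isValidUTF8 reports whether s contains only well-formed UTF-8 sequences.
func isValidUTF8(s string) bool {
	return utf8.ValidString(s)
}
```

Each of the characters in `"你好世界"` occupies three bytes, so `byteTruncate("你好世界", 4)` ends in the middle of the second character, while `runeTruncate("你好世界", 2)` returns `"你好"` intact.
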
--- a/codeagent-wrapper/utils_test.go
+++ b/codeagent-wrapper/utils_test.go
@@ -0,0 +1,143 @@
+package main
+
+import (
+	"fmt"
+	"reflect"
+	"strings"
+	"testing"
+)
+
+func TestExtractCoverage(t *testing.T) {
+	tests := []struct {
+		name string
+		in   string
+		want string
+	}{
+		{"bare int", "92%", "92%"},
+		{"bare float", "92.5%", "92.5%"},
+		{"coverage prefix", "coverage: 92%", "92%"},
+		{"total prefix", "TOTAL 92%", "92%"},
+		{"all files", "All files 92%", "92%"},
+		{"empty", "", ""},
+		{"no number", "coverage: N/A", ""},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			if got := extractCoverage(tt.in); got != tt.want {
+				t.Fatalf("extractCoverage(%q) = %q, want %q", tt.in, got, tt.want)
+			}
+		})
+	}
+}
+
+func TestExtractTestResults(t *testing.T) {
+	tests := []struct {
+		name       string
+		in         string
+		wantPassed int
+		wantFailed int
+	}{
+		{"pytest one line", "12 passed, 2 failed", 12, 2},
+		{"pytest split lines", "12 passed\n2 failed", 12, 2},
+		{"jest format", "Tests: 2 failed, 12 passed, 14 total", 12, 2},
+		{"go test style count", "ok\texample.com/foo\t0.12s\t12 tests", 12, 0},
+		{"zero counts", "0 passed, 0 failed", 0, 0},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			passed, failed := extractTestResults(tt.in)
+			if passed != tt.wantPassed || failed != tt.wantFailed {
+				t.Fatalf("extractTestResults(%q) = (%d, %d), want (%d, %d)", tt.in, passed, failed, tt.wantPassed, tt.wantFailed)
+			}
+		})
+	}
+}
+
+func TestExtractFilesChanged(t *testing.T) {
+	tests := []struct {
+		name string
+		in   string
+		want []string
+	}{
+		{"root file", "Modified: main.go\n", []string{"main.go"}},
+		{"path file", "Created: codeagent-wrapper/utils.go\n", []string{"codeagent-wrapper/utils.go"}},
+		{"at prefix", "Updated: @codeagent-wrapper/main.go\n", []string{"codeagent-wrapper/main.go"}},
+		{"token scan", "Files: @main.go, @codeagent-wrapper/utils.go\n", []string{"main.go", "codeagent-wrapper/utils.go"}},
+		{"space path", "Modified: dir/with space/file.go\n", []string{"dir/with space/file.go"}},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			if got := extractFilesChanged(tt.in); !reflect.DeepEqual(got, tt.want) {
+				t.Fatalf("extractFilesChanged(%q) = %#v, want %#v", tt.in, got, tt.want)
+			}
+		})
+	}
+
+	t.Run("limits to first 10", func(t *testing.T) {
+		var b strings.Builder
+		for i := 0; i < 12; i++ {
+			fmt.Fprintf(&b, "Modified: file%d.go\n", i)
+		}
+		got := extractFilesChanged(b.String())
+		if len(got) != 10 {
+			t.Fatalf("len(files)=%d, want 10: %#v", len(got), got)
+		}
+		for i := 0; i < 10; i++ {
+			want := fmt.Sprintf("file%d.go", i)
+			if got[i] != want {
+				t.Fatalf("files[%d]=%q, want %q", i, got[i], want)
+			}
+		}
+	})
+}
+
+func TestSafeTruncate(t *testing.T) {
+	tests := []struct {
+		name   string
+		in     string
+		maxLen int
+		want   string
+	}{
+		{"empty", "", 4, ""},
+		{"zero maxLen", "hello", 0, ""},
+		{"one rune", "你好", 1, "你"},
+		{"two runes no truncate", "你好", 2, "你好"},
+		{"three runes no truncate", "你好", 3, "你好"},
+		{"two runes truncates long", "你好世界", 2, "你"},
+		{"three runes truncates long", "你好世界", 3, "你"},
+		{"four with ellipsis", "你好世界啊", 4, "你..."},
+		{"emoji", "🙂🙂🙂🙂🙂", 4, "🙂..."},
+		{"no truncate", "你好世界", 4, "你好世界"},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			if got := safeTruncate(tt.in, tt.maxLen); got != tt.want {
+				t.Fatalf("safeTruncate(%q, %d) = %q, want %q", tt.in, tt.maxLen, got, tt.want)
+			}
+		})
+	}
+}
+
+func TestSanitizeOutput(t *testing.T) {
+	tests := []struct {
+		name string
+		in   string
+		want string
+	}{
+		{"ansi", "\x1b[31mred\x1b[0m", "red"},
+		{"control chars", "a\x07b\r\nc\t", "ab\nc\t"},
+		{"normal", "hello\nworld\t!", "hello\nworld\t!"},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			if got := sanitizeOutput(tt.in); got != tt.want {
+				t.Fatalf("sanitizeOutput(%q) = %q, want %q", tt.in, got, tt.want)
+			}
+		})
+	}
+}
--- a/dev-workflow/README.md
+++ b/dev-workflow/README.md
@@ -9,42 +9,56 @@ A freshly designed lightweight development workflow with no legacy baggage, focu
 ```
 /dev trigger
  ↓
+AskUserQuestion (backend selection)
+  ↓
 AskUserQuestion (requirements clarification)
  ↓
-codeagent analysis (plan mode + UI auto-detection)
+codeagent analysis (plan mode + task typing + UI auto-detection)
  ↓
 dev-plan-generator (create dev doc)
  ↓
-codeagent concurrent development (2–5 tasks, backend split)
+codeagent concurrent development (2–5 tasks, backend routing)
  ↓
 codeagent testing & verification (≥90% coverage)
  ↓
 Done (generate summary)
 ```

-## The 6 Steps
+## Step 0 + The 6 Steps
+
+### 0. Select Allowed Backends (FIRST ACTION)
+- Use **AskUserQuestion** with multiSelect to ask which backends are allowed for this run
+- Options (user can select multiple):
+  - `codex` - Stable, high quality, best cost-performance (default for most tasks)
+  - `claude` - Fast, lightweight (for quick fixes and config changes)
+  - `gemini` - UI/UX specialist (for frontend styling and components)
+- If the user selects ONLY `codex`, ALL subsequent tasks must use `codex` (including UI/quick-fix)

 ### 1. Clarify Requirements
 - Use **AskUserQuestion** to ask the user directly
 - No scoring system, no complex logic
 - 2–3 rounds of Q&A until the requirement is clear

-### 2. codeagent Analysis & UI Detection
+### 2. codeagent Analysis + Task Typing + UI Detection
 - Call codeagent to analyze the request in plan mode style
 - Extract: core functions, technical points, task list (2–5 items)
+- For each task, assign exactly one type: `default` / `ui` / `quick-fix`
 - UI auto-detection: needs UI work when task involves style assets (.css, .scss, styled-components, CSS modules, tailwindcss) OR frontend component files (.tsx, .jsx, .vue); output yes/no plus evidence

 ### 3. Generate Dev Doc
 - Call the **dev-plan-generator** agent
 - Produce a single `dev-plan.md`
 - Append a dedicated UI task when Step 2 marks `needs_ui: true`
-- Include: task breakdown, file scope, dependencies, test commands
+- Include: task breakdown, `type`, file scope, dependencies, test commands

 ### 4. Concurrent Development
 - Work from the task list in dev-plan.md
-- Use codeagent per task with explicit backend selection:
-  - Backend/API/DB tasks → `--backend codex` (default)
-  - UI/style/component tasks → `--backend gemini` (enforced)
+- Route backend per task type (with user constraints + fallback):
+  - `default` → `codex`
+  - `ui` → `gemini` (enforced when allowed)
+  - `quick-fix` → `claude`
+  - Missing `type` → treat as `default`
+  - If the preferred backend is not allowed, fallback to an allowed backend by priority: `codex` → `claude` → `gemini`
 - Independent tasks → run in parallel
 - Conflicting tasks → run serially

@@ -65,7 +79,7 @@ Done (generate summary)
 /dev "Implement user login with email + password"
 ```

-**No options**, fixed workflow, works out of the box.
+No CLI flags required; workflow starts with an interactive backend selection.

 ## Output Structure

@@ -80,14 +94,14 @@ Only one file—minimal and clear.

 ### Tools
 - **AskUserQuestion**: interactive requirement clarification
- **codeagent skill**: analysis, development, testing; supports `--backend` for codex (default) or gemini (UI)
+- **codeagent skill**: analysis, development, testing; supports `--backend` for `codex` / `claude` / `gemini`
 - **dev-plan-generator agent**: generate dev doc (subagent via Task tool, saves context)

-## UI Auto-Detection & Backend Routing
+## Backend Selection & Routing
+- **Step 0**: user selects allowed backends; if only `codex` is selected, all tasks use codex
 - **UI detection standard**: style files (.css, .scss, styled-components, CSS modules, tailwindcss) OR frontend component code (.tsx, .jsx, .vue) trigger `needs_ui: true`
- **Flow impact**: Step 2 auto-detects UI work; Step 3 appends a separate UI task in `dev-plan.md` when detected
- **Backend split**: backend/API tasks use codex backend (default); UI tasks force gemini backend
- **Implementation**: Orchestrator invokes codeagent skill with appropriate backend parameter per task type
+- **Task type field**: each task in `dev-plan.md` must have `type: default|ui|quick-fix`
+- **Routing**: `default`→codex, `ui`→gemini, `quick-fix`→claude; if disallowed, fallback to an allowed backend by priority: codex→claude→gemini

 ## Key Features

@@ -102,9 +116,9 @@ Only one file—minimal and clear.
 - Steps are straightforward

 ### ✅ Concurrency
- 2–5 tasks in parallel
+- Tasks split based on natural functional boundaries
 - Auto-detect dependencies and conflicts
- codeagent executes independently
+- codeagent executes independently with optimal backend

 ### ✅ Quality Assurance
 - Enforces 90% coverage
@@ -117,6 +131,10 @@ Only one file—minimal and clear.
 # Trigger
 /dev "Add user login feature"

+# Step 0: Select backends
+Q: Which backends are allowed? (multiSelect)
+A: Selected: codex, claude
+
 # Step 1: Clarify requirements
 Q: What login methods are supported?
 A: Email + password
@@ -126,18 +144,18 @@ A: Yes, use JWT token
 # Step 2: codeagent analysis
 Output:
 - Core: email/password login + JWT auth
- Task 1: Backend API
- Task 2: Password hashing
- Task 3: Frontend form
+- Task 1: Backend API (type=default)
+- Task 2: Password hashing (type=default)
+- Task 3: Frontend form (type=ui)
 UI detection: needs_ui = true (tailwindcss classes in frontend form)

 # Step 3: Generate doc
-dev-plan.md generated with backend + UI tasks ✓
+dev-plan.md generated with typed tasks ✓

-# Step 4-5: Concurrent development (backend codex, UI gemini)
+# Step 4-5: Concurrent development (routing + fallback)
 [task-1] Backend API (codex) → tests → 92% ✓
 [task-2] Password hashing (codex) → tests → 95% ✓
-[task-3] Frontend form (gemini) → tests → 91% ✓
+[task-3] Frontend form (fallback to codex; gemini not allowed) → tests → 91% ✓
 ```

 ## Directory Structure
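
The routing rules in the README diff above (preferred backend per task `type`, constrained by the backends chosen in Step 0, with a `codex` → `claude` → `gemini` fallback) can be read as a small pure function. The following is only a sketch under stated assumptions: the name `route_backend`, the error raised for an empty selection, and the treatment of an unrecognized `type` as `default` are illustrative and not part of this change.

```python
from typing import Iterable, Optional

# Preferred backend per task type, as stated in the routing rules.
PREFERRED = {"default": "codex", "ui": "gemini", "quick-fix": "claude"}
# Fallback priority when the preferred backend is not allowed.
FALLBACK_PRIORITY = ("codex", "claude", "gemini")


def route_backend(task_type: Optional[str], allowed_backends: Iterable[str]) -> str:
    """Pick a backend for one task (hypothetical helper, not the wrapper's API)."""
    allowed = set(allowed_backends)
    if not allowed:
        # Assumption: an empty selection is an error rather than a silent default.
        raise ValueError("allowed_backends must not be empty")

    # A missing type is treated as `default`; treating an unknown type
    # the same way is an assumption of this sketch.
    preferred = PREFERRED.get(task_type or "default", PREFERRED["default"])
    if preferred in allowed:
        return preferred

    # Preferred backend not allowed: take the first allowed one by priority.
    for backend in FALLBACK_PRIORITY:
        if backend in allowed:
            return backend
    raise ValueError(f"no known backend in {sorted(allowed)}")


if __name__ == "__main__":
    # Mirrors the README example: gemini not allowed, so the UI task falls back.
    print(route_backend("ui", ["codex", "claude"]))
```

With this shape the "only `codex` selected" rule needs no special case: every preferred backend other than `codex` is disallowed, so the fallback loop returns `codex`.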
--- a/dev-workflow/agents/dev-plan-generator.md
+++ b/dev-workflow/agents/dev-plan-generator.md
@@ -12,7 +12,7 @@ You are a specialized Development Plan Document Generator. Your sole responsibil

 You receive context from an orchestrator including:
 - Feature requirements description
- codeagent analysis results (feature highlights, task decomposition, UI detection flag)
+- codeagent analysis results (feature highlights, task decomposition, UI detection flag, and task typing hints)
 - Feature name (in kebab-case format)

 Your output is a single file: `./.claude/specs/{feature_name}/dev-plan.md`
@@ -29,6 +29,7 @@ Your output is a single file: `./.claude/specs/{feature_name}/dev-plan.md`

 ### Task 1: [Task Name]
 - **ID**: task-1
+- **type**: default|ui|quick-fix
 - **Description**: [What needs to be done]
 - **File Scope**: [Directories or files involved, e.g., src/auth/**, tests/auth/]
 - **Dependencies**: [None or depends on task-x]
@@ -38,7 +39,7 @@ Your output is a single file: `./.claude/specs/{feature_name}/dev-plan.md`
 ### Task 2: [Task Name]
 ...

-(2-5 tasks)
+(Tasks based on natural functional boundaries, typically 2-5)

 ## Acceptance Criteria
 - [ ] Feature point 1
@@ -53,9 +54,13 @@ Your output is a single file: `./.claude/specs/{feature_name}/dev-plan.md`

 ## Generation Rules You Must Enforce

-1. **Task Count**: Generate 2-5 tasks (no more, no less unless the feature is extremely simple or complex)
+1. **Task Count**: Generate tasks based on natural functional boundaries (no artificial limits)
+   - Typical range: 2-5 tasks
+   - Quality over quantity: prefer fewer well-scoped tasks over excessive fragmentation
+   - Each task should be independently completable by one agent
 2. **Task Requirements**: Each task MUST include:
   - Clear ID (task-1, task-2, etc.)
+   - A single task type field: `type: default|ui|quick-fix`
   - Specific description of what needs to be done
   - Explicit file scope (directories or files affected)
   - Dependency declaration ("None" or "depends on task-x")
@@ -67,18 +72,23 @@ Your output is a single file: `./.claude/specs/{feature_name}/dev-plan.md`

 ## Your Workflow

-1. **Analyze Input**: Review the requirements description and codeagent analysis results (including `needs_ui` flag if present)
+1. **Analyze Input**: Review the requirements description and codeagent analysis results (including `needs_ui` and any task typing hints)
 2. **Identify Tasks**: Break down the feature into 2-5 logical, independent tasks
 3. **Determine Dependencies**: Map out which tasks depend on others (minimize dependencies)
-4. **Specify Testing**: For each task, define the exact test command and coverage requirements
-5. **Define Acceptance**: List concrete, measurable acceptance criteria including the 90% coverage requirement
-6. **Document Technical Points**: Note key technical decisions and constraints
-7. **Write File**: Use the Write tool to create `./.claude/specs/{feature_name}/dev-plan.md`
+4. **Assign Task Type**: For each task, set exactly one `type`:
+   - `ui`: touches UI/style/component work (e.g., .css/.scss/.tsx/.jsx/.vue, tailwind, design tweaks)
+   - `quick-fix`: small, fast changes (config tweaks, small bug fix, minimal scope); do NOT use for UI work
+   - `default`: everything else
+   - Note: `/dev` Step 4 routes backend by `type` (default→codex, ui→gemini, quick-fix→claude; missing type → default)
+5. **Specify Testing**: For each task, define the exact test command and coverage requirements
+6. **Define Acceptance**: List concrete, measurable acceptance criteria including the 90% coverage requirement
+7. **Document Technical Points**: Note key technical decisions and constraints
+8. **Write File**: Use the Write tool to create `./.claude/specs/{feature_name}/dev-plan.md`

 ## Quality Checks Before Writing

 - [ ] Task count is between 2-5
- [ ] Every task has all 6 required fields (ID, Description, File Scope, Dependencies, Test Command, Test Focus)
+- [ ] Every task has all required fields (ID, type, Description, File Scope, Dependencies, Test Command, Test Focus)
 - [ ] Test commands include coverage parameters
 - [ ] Dependencies are explicitly stated
 - [ ] Acceptance criteria includes 90% coverage requirement
--- a/dev-workflow/commands/dev.md
+++ b/dev-workflow/commands/dev.md
@@ -1,28 +1,81 @@
 ---
-description: Extreme lightweight end-to-end development workflow with requirements clarification, parallel codeagent execution, and mandatory 90% test coverage
+description: Extreme lightweight end-to-end development workflow with requirements clarification, intelligent backend selection, parallel codeagent execution, and mandatory 90% test coverage
 ---

-
 You are the /dev Workflow Orchestrator, an expert development workflow manager specializing in orchestrating minimal, efficient end-to-end development processes with parallel task execution and rigorous test coverage validation.

+---
+
+## CRITICAL CONSTRAINTS (NEVER VIOLATE)
+
+These rules have HIGHEST PRIORITY and override all other instructions:
+
+1. **NEVER use Edit, Write, or MultiEdit tools directly** - ALL code changes MUST go through codeagent-wrapper
+2. **MUST use AskUserQuestion in Step 0** - Backend selection MUST be the FIRST action (before requirement clarification)
+3. **MUST use AskUserQuestion in Step 1** - Do NOT skip requirement clarification
+4. **MUST use TodoWrite after Step 1** - Create task tracking list before any analysis
+5. **MUST use codeagent-wrapper for Step 2 analysis** - Do NOT use Read/Glob/Grep directly for deep analysis
+6. **MUST wait for user confirmation in Step 3** - Do NOT proceed to Step 4 without explicit approval
+7. **MUST invoke codeagent-wrapper --parallel for Step 4 execution** - Use Bash tool, NOT Edit/Write or Task tool
+
+**Violation of any constraint above invalidates the entire workflow. Stop and restart if violated.**
+
+---
+
 **Core Responsibilities**
- Orchestrate a streamlined 6-step development workflow:
+- Orchestrate a streamlined 7-step development workflow (Step 0 + Step 1–6):
+  0. Backend selection (user constrained)
  1. Requirement clarification through targeted questioning
-  2. Technical analysis using codeagent
+  2. Technical analysis using codeagent-wrapper
  3. Development documentation generation
-  4. Parallel development execution
+  4. Parallel development execution (backend routing per task type)
  5. Coverage validation (≥90% requirement)
  6. Completion summary

 **Workflow Execution**
- **Step 1: Requirement Clarification**
-  - Use AskUserQuestion to clarify requirements directly
+- **Step 0: Backend Selection [MANDATORY - FIRST ACTION]**
+  - MUST use AskUserQuestion tool as the FIRST action with multiSelect enabled
+  - Ask which backends are allowed for this /dev run
+  - Options (user can select multiple):
+    - `codex` - Stable, high quality, best cost-performance (default for most tasks)
+    - `claude` - Fast, lightweight (for quick fixes and config changes)
+    - `gemini` - UI/UX specialist (for frontend styling and components)
+  - Store the selected backends as `allowed_backends` set for routing in Step 4
+  - Special rule: if user selects ONLY `codex`, then ALL subsequent tasks (including UI/quick-fix) MUST use `codex` (no exceptions)
+
+- **Step 1: Requirement Clarification [MANDATORY - DO NOT SKIP]**
+  - MUST use AskUserQuestion tool
  - Focus questions on functional boundaries, inputs/outputs, constraints, testing, and required unit-test coverage levels
  - Iterate 2-3 rounds until clear; rely on judgment; keep questions concise
+  - After clarification complete: MUST use TodoWrite to create task tracking list with workflow steps

- **Step 2: codeagent Deep Analysis (Plan Mode Style)**
+- **Step 2: codeagent-wrapper Deep Analysis (Plan Mode Style) [USE CODEAGENT-WRAPPER ONLY]**

-  Use codeagent Skill to perform deep analysis. codeagent should operate in "plan mode" style and must include UI detection:
+  MUST use Bash tool to invoke `codeagent-wrapper` for deep analysis. Do NOT use Read/Glob/Grep tools directly - delegate all exploration to codeagent-wrapper.
+
+  **How to invoke for analysis**:
+  ```bash
+  # analysis_backend selection:
+  # - prefer codex if it is in allowed_backends
+  # - otherwise pick the first backend in allowed_backends
+  codeagent-wrapper --backend {analysis_backend} - <<'EOF'
+  Analyze the codebase for implementing [feature name].
+
+  Requirements:
+  - [requirement 1]
+  - [requirement 2]
+
+  Deliverables:
+  1. Explore codebase structure and existing patterns
+  2. Evaluate implementation options with trade-offs
+  3. Make architectural decisions
+  4. Break down into 2-5 parallelizable tasks with dependencies and file scope
+  5. Classify each task with a single `type`: `default` / `ui` / `quick-fix`
+  6. Determine if UI work is needed (check for .css/.tsx/.vue files)
+
+  Output the analysis following the structure below.
+  EOF
+  ```

  **When Deep Analysis is Needed** (any condition triggers):
  - Multiple valid approaches exist (e.g., Redis vs in-memory vs file-based caching)
@@ -34,12 +87,12 @@ You are the /dev Workflow Orchestrator, an expert development workflow manager s
  - During analysis, output whether the task needs UI work (yes/no) and the evidence
  - UI criteria: presence of style assets (.css, .scss, styled-components, CSS modules, tailwindcss) OR frontend component files (.tsx, .jsx, .vue)

-  **What codeagent Does in Analysis Mode**:
+  **What the AI backend does in Analysis Mode** (when invoked via codeagent-wrapper):
  1. **Explore Codebase**: Use Glob, Grep, Read to understand structure, patterns, architecture
  2. **Identify Existing Patterns**: Find how similar features are implemented, reuse conventions
  3. **Evaluate Options**: When multiple approaches exist, list trade-offs (complexity, performance, security, maintainability)
  4. **Make Architectural Decisions**: Choose patterns, APIs, data models with justification
-  5. **Design Task Breakdown**: Produce 2-5 parallelizable tasks with file scope and dependencies
+  5. **Design Task Breakdown**: Produce parallelizable tasks based on natural functional boundaries with file scope and dependencies

  **Analysis Output Structure**:
  ```
@@ -56,7 +109,7 @@ You are the /dev Workflow Orchestrator, an expert development workflow manager s
  [API design, data models, architecture choices made]

  ## Task Breakdown
-  [2-5 tasks with: ID, description, file scope, dependencies, test command]
+  [2-5 tasks with: ID, description, file scope, dependencies, test command, type(default|ui|quick-fix)]

  ## UI Determination
  needs_ui: [true/false]
@@ -70,39 +123,62 @@ You are the /dev Workflow Orchestrator, an expert development workflow manager s

 - **Step 3: Generate Development Documentation**
  - invoke agent dev-plan-generator
-  - When creating `dev-plan.md`, append a dedicated UI task if Step 2 marked `needs_ui: true`
+  - When creating `dev-plan.md`, ensure every task has `type: default|ui|quick-fix`
+  - Append a dedicated UI task if Step 2 marked `needs_ui: true` but no UI task exists
  - Output a brief summary of dev-plan.md:
    - Number of tasks and their IDs
+    - Task type for each task
    - File scope for each task
    - Dependencies between tasks
    - Test commands
  - Use AskUserQuestion to confirm with user:
-    - Question: "Proceed with this development plan?" (if UI work is detected, state that UI tasks will use the gemini backend)
+    - Question: "Proceed with this development plan?" (state backend routing rules and any forced fallback due to allowed_backends)
    - Options: "Confirm and execute" / "Need adjustments"
  - If user chooses "Need adjustments", return to Step 1 or Step 2 based on feedback

- **Step 4: Parallel Development Execution**
-  - For each task in `dev-plan.md`, invoke codeagent skill with task brief in HEREDOC format:
+- **Step 4: Parallel Development Execution [CODEAGENT-WRAPPER ONLY - NO DIRECT EDITS]**
+  - MUST use Bash tool to invoke `codeagent-wrapper --parallel` for ALL code changes
+  - NEVER use Edit, Write, MultiEdit, or Task tools to modify code directly
+  - Backend routing (must be deterministic and enforceable):
+    - Task field: `type: default|ui|quick-fix` (missing → treat as `default`)
+    - Preferred backend by type:
+      - `default` → `codex`
+      - `ui` → `gemini` (enforced when allowed)
+      - `quick-fix` → `claude`
+    - If the user selected only `codex`: all tasks MUST use `codex`
+    - Otherwise, if preferred backend is not in `allowed_backends`, fallback to the first available backend by priority: `codex` → `claude` → `gemini`
+  - Build ONE `--parallel` config that includes all tasks in `dev-plan.md` and submit it once via Bash tool:
    ```bash
-    # Backend task (use codex backend - default)
-    codeagent-wrapper --backend codex - <<'EOF'
-    Task: [task-id]
+    # One shot submission - wrapper handles topology + concurrency
+    codeagent-wrapper --parallel <<'EOF'
+    ---TASK---
+    id: [task-id-1]
+    backend: [routed-backend-from-type-and-allowed_backends]
+    workdir: .
+    dependencies: [optional, comma-separated ids]
+    ---CONTENT---
+    Task: [task-id-1]
    Reference: @.claude/specs/{feature_name}/dev-plan.md
    Scope: [task file scope]
    Test: [test command]
    Deliverables: code + unit tests + coverage ≥90% + coverage summary
-    EOF

-    # UI task (use gemini backend - enforced)
-    codeagent-wrapper --backend gemini - <<'EOF'
-    Task: [task-id]
+    ---TASK---
+    id: [task-id-2]
+    backend: [routed-backend-from-type-and-allowed_backends]
+    workdir: .
+    dependencies: [optional, comma-separated ids]
+    ---CONTENT---
+    Task: [task-id-2]
    Reference: @.claude/specs/{feature_name}/dev-plan.md
    Scope: [task file scope]
    Test: [test command]
    Deliverables: code + unit tests + coverage ≥90% + coverage summary
    EOF
    ```
+  - **Note**: Use `workdir: .` (current directory) for all tasks unless specific subdirectory is required
  - Execute independent tasks concurrently; serialize conflicting ones; track coverage reports
+  - Backend is routed deterministically based on task `type`, no manual intervention needed

 - **Step 5: Coverage Validation**
  - Validate each task’s coverage:
@@ -113,13 +189,19 @@ You are the /dev Workflow Orchestrator, an expert development workflow manager s
  - Provide completed task list, coverage per task, key file changes

 **Error Handling**
- codeagent failure: retry once, then log and continue
- Insufficient coverage: request more tests (max 2 rounds)
- Dependency conflicts: serialize automatically
+- **codeagent-wrapper failure**: Retry once with same input; if still fails, log error and ask user for guidance
+- **Insufficient coverage (<90%)**: Request more tests from the failed task (max 2 rounds); if still fails, report to user
+- **Dependency conflicts**:
+  - Circular dependencies: codeagent-wrapper will detect and fail with error; revise task breakdown to remove cycles
+  - Missing dependencies: Ensure all task IDs referenced in `dependencies` field exist
+- **Parallel execution timeout**: Individual tasks timeout after 2 hours (configurable via CODEX_TIMEOUT); failed tasks can be retried individually
+- **Backend unavailable**: If a routed backend is unavailable, fallback to another backend in `allowed_backends` (priority: codex → claude → gemini); if none works, fail with a clear error message

 **Quality Standards**
 - Code coverage ≥90%
- 2-5 genuinely parallelizable tasks
+- Tasks based on natural functional boundaries (typically 2-5)
+- Each task has exactly one `type: default|ui|quick-fix`
+- Backend routed by `type`: `default`→codex, `ui`→gemini, `quick-fix`→claude (with allowed_backends fallback)
 - Documentation must be minimal yet actionable
 - No verbose implementations; only essential code

--- a/docs/CODEAGENT-WRAPPER.md
+++ b/docs/CODEAGENT-WRAPPER.md
@@ -105,6 +105,7 @@ EOF
 Execute multiple tasks concurrently with dependency management:

 ```bash
+# Default: summary output (context-efficient, recommended)
 codeagent-wrapper --parallel <<'EOF'
 ---TASK---
 id: backend_1701234567
@@ -125,6 +126,47 @@ dependencies: backend_1701234567, frontend_1701234568
 ---CONTENT---
 add integration tests for user management flow
 EOF
+
+# Full output mode (for debugging, includes complete task messages)
+codeagent-wrapper --parallel --full-output <<'EOF'
+...
+EOF
+```
+
+**Output Modes:**
+- **Summary (default)**: Structured report with extracted `Did/Files/Tests/Coverage`, plus a short action summary.
+- **Full (`--full-output`)**: Complete task messages included. Use only for debugging.
+
+**Summary Output Example:**
+```
+=== Execution Report ===
+3 tasks | 2 passed | 1 failed | 1 below 90%
+
+## Task Results
+
+### backend_api ✓ 92%
+Did: Implemented /api/users CRUD endpoints
+Files: backend/users.go, backend/router.go
+Tests: 12 passed
+Log: /tmp/codeagent-xxx.log
+
+### frontend_form ⚠️ 88% (below 90%)
+Did: Created login form with validation
+Files: frontend/LoginForm.tsx
+Tests: 8 passed
+Gap: lines not covered: frontend/LoginForm.tsx:42-47
+Log: /tmp/codeagent-yyy.log
+
+### integration_tests ✗ FAILED
+Exit code: 1
+Error: Assertion failed at line 45
+Detail: Expected status 200 but got 401
+Log: /tmp/codeagent-zzz.log
+
+## Summary
+- 2/3 completed successfully
+- Fix: integration_tests (Assertion failed at line 45)
+- Coverage: frontend_form
 ```

 **Parallel Task Format:**
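
The `---TASK---` / `---CONTENT---` layout used in the examples above could also be generated programmatically before being piped to `codeagent-wrapper --parallel`. This is a minimal sketch, assuming only the fields that appear in this diff (`id`, `backend`, `workdir`, `dependencies`); the `Task` dataclass and `build_parallel_config` are hypothetical names, and whether the wrapper requires blank lines between task blocks is not stated here.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class Task:
    """One entry of a --parallel config (hypothetical container type)."""
    id: str
    backend: str
    content: str
    workdir: str = "."
    dependencies: List[str] = field(default_factory=list)


def build_parallel_config(tasks: Iterable[Task]) -> str:
    """Render tasks in the ---TASK--- / ---CONTENT--- layout shown above."""
    blocks = []
    for task in tasks:
        lines = [
            "---TASK---",
            f"id: {task.id}",
            f"backend: {task.backend}",
            f"workdir: {task.workdir}",
        ]
        # The dependencies line is optional; emit it only when non-empty.
        if task.dependencies:
            lines.append("dependencies: " + ", ".join(task.dependencies))
        lines.append("---CONTENT---")
        lines.append(task.content.strip())
        blocks.append("\n".join(lines))
    return "\n".join(blocks) + "\n"


if __name__ == "__main__":
    config = build_parallel_config([
        Task("task-1", "codex", "implement the API"),
        Task("task-2", "claude", "add integration tests", dependencies=["task-1"]),
    ])
    # The text could then be passed on stdin, e.g. with
    # subprocess.run(["codeagent-wrapper", "--parallel"], input=config, text=True)
    print(config, end="")
```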
--- a/go.work
+++ b/go.work
@@ -0,0 +1,5 @@
+go 1.21
+
+use (
+	./codeagent-wrapper
+)
--- a/install.bat
+++ b/install.bat
@@ -46,17 +46,23 @@ echo.
 echo codeagent-wrapper installed successfully at:
 echo   %DEST%

-rem Automatically ensure %USERPROFILE%\bin is in the USER (HKCU) PATH
+rem Ensure %USERPROFILE%\bin is in PATH without duplicating entries
 rem 1) Read current user PATH from registry (REG_SZ or REG_EXPAND_SZ)
 set "USER_PATH_RAW="
-set "USER_PATH_TYPE="
 for /f "tokens=1,2,*" %%A in ('reg query "HKCU\Environment" /v Path 2^>nul ^| findstr /I /R "^ *Path  *REG_"') do (
-    set "USER_PATH_TYPE=%%B"
    set "USER_PATH_RAW=%%C"
 )
 rem Trim leading spaces from USER_PATH_RAW
 for /f "tokens=* delims= " %%D in ("!USER_PATH_RAW!") do set "USER_PATH_RAW=%%D"

+rem 2) Read current system PATH from registry (REG_SZ or REG_EXPAND_SZ)
+set "SYS_PATH_RAW="
+for /f "tokens=1,2,*" %%A in ('reg query "HKLM\System\CurrentControlSet\Control\Session Manager\Environment" /v Path 2^>nul ^| findstr /I /R "^ *Path  *REG_"') do (
+    set "SYS_PATH_RAW=%%C"
+)
+rem Trim leading spaces from SYS_PATH_RAW
+for /f "tokens=* delims= " %%D in ("!SYS_PATH_RAW!") do set "SYS_PATH_RAW=%%D"
+
 rem Normalize DEST_DIR by removing a trailing backslash if present
 if "!DEST_DIR:~-1!"=="\" set "DEST_DIR=!DEST_DIR:~0,-1!"

@@ -67,42 +73,63 @@ set "SEARCH_EXP2=;!DEST_DIR!\;"
 set "SEARCH_LIT=;!PCT!USERPROFILE!PCT!\bin;"
 set "SEARCH_LIT2=;!PCT!USERPROFILE!PCT!\bin\;"

-rem Prepare user PATH variants for containment tests
-set "CHECK_RAW=;!USER_PATH_RAW!;"
-set "USER_PATH_EXP=!USER_PATH_RAW!"
-if defined USER_PATH_EXP call set "USER_PATH_EXP=%%USER_PATH_EXP%%"
-set "CHECK_EXP=;!USER_PATH_EXP!;"
+rem Prepare PATH variants for containment tests (strip quotes to avoid false negatives)
+set "USER_PATH_RAW_CLEAN=!USER_PATH_RAW:"=!"
+set "SYS_PATH_RAW_CLEAN=!SYS_PATH_RAW:"=!"

-rem Check if already present in user PATH (literal or expanded, with/without trailing backslash)
+set "CHECK_USER_RAW=;!USER_PATH_RAW_CLEAN!;"
+set "USER_PATH_EXP=!USER_PATH_RAW_CLEAN!"
+if defined USER_PATH_EXP call set "USER_PATH_EXP=%%USER_PATH_EXP%%"
+set "USER_PATH_EXP_CLEAN=!USER_PATH_EXP:"=!"
+set "CHECK_USER_EXP=;!USER_PATH_EXP_CLEAN!;"
+
+set "CHECK_SYS_RAW=;!SYS_PATH_RAW_CLEAN!;"
+set "SYS_PATH_EXP=!SYS_PATH_RAW_CLEAN!"
+if defined SYS_PATH_EXP call set "SYS_PATH_EXP=%%SYS_PATH_EXP%%"
+set "SYS_PATH_EXP_CLEAN=!SYS_PATH_EXP:"=!"
+set "CHECK_SYS_EXP=;!SYS_PATH_EXP_CLEAN!;"
+
+rem Check if already present (literal or expanded, with/without trailing backslash)
 set "ALREADY_IN_USERPATH=0"
-echo !CHECK_RAW! | findstr /I /C:"!SEARCH_LIT!" /C:"!SEARCH_LIT2!" >nul && set "ALREADY_IN_USERPATH=1"
+echo(!CHECK_USER_RAW! | findstr /I /C:"!SEARCH_LIT!" /C:"!SEARCH_LIT2!" >nul && set "ALREADY_IN_USERPATH=1"
 if "!ALREADY_IN_USERPATH!"=="0" (
-    echo !CHECK_EXP! | findstr /I /C:"!SEARCH_EXP!" /C:"!SEARCH_EXP2!" >nul && set "ALREADY_IN_USERPATH=1"
+    echo(!CHECK_USER_EXP! | findstr /I /C:"!SEARCH_EXP!" /C:"!SEARCH_EXP2!" >nul && set "ALREADY_IN_USERPATH=1"
+)
+
+set "ALREADY_IN_SYSPATH=0"
+echo(!CHECK_SYS_RAW! | findstr /I /C:"!SEARCH_LIT!" /C:"!SEARCH_LIT2!" >nul && set "ALREADY_IN_SYSPATH=1"
+if "!ALREADY_IN_SYSPATH!"=="0" (
+    echo(!CHECK_SYS_EXP! | findstr /I /C:"!SEARCH_EXP!" /C:"!SEARCH_EXP2!" >nul && set "ALREADY_IN_SYSPATH=1"
 )

 if "!ALREADY_IN_USERPATH!"=="1" (
    echo User PATH already includes %%USERPROFILE%%\bin.
 ) else (
-    rem Not present: append to user PATH using setx without duplicating system PATH
-    if defined USER_PATH_RAW (
-        set "USER_PATH_NEW=!USER_PATH_RAW!"
-        if not "!USER_PATH_NEW:~-1!"==";" set "USER_PATH_NEW=!USER_PATH_NEW!;"
-        set "USER_PATH_NEW=!USER_PATH_NEW!!PCT!USERPROFILE!PCT!\bin"
+    if "!ALREADY_IN_SYSPATH!"=="1" (
+        echo System PATH already includes %%USERPROFILE%%\bin; skipping user PATH update.
    ) else (
-        set "USER_PATH_NEW=!PCT!USERPROFILE!PCT!\bin"
-    )
-    rem Persist update to HKCU\Environment\Path (user scope)
-    setx PATH "!USER_PATH_NEW!" >nul
-    if errorlevel 1 (
-        echo WARNING: Failed to append %%USERPROFILE%%\bin to your user PATH.
-    ) else (
-        echo Added %%USERPROFILE%%\bin to your user PATH.
+        rem Not present: append to user PATH
+        if defined USER_PATH_RAW (
+            set "USER_PATH_NEW=!USER_PATH_RAW!"
+            if not "!USER_PATH_NEW:~-1!"==";" set "USER_PATH_NEW=!USER_PATH_NEW!;"
+            set "USER_PATH_NEW=!USER_PATH_NEW!!PCT!USERPROFILE!PCT!\bin"
+        ) else (
+            set "USER_PATH_NEW=!PCT!USERPROFILE!PCT!\bin"
+        )
+        rem Persist update to HKCU\Environment\Path (user scope)
+        setx Path "!USER_PATH_NEW!" >nul
+        if errorlevel 1 (
+            echo WARNING: Failed to append %%USERPROFILE%%\bin to your user PATH.
+        ) else (
+            echo Added %%USERPROFILE%%\bin to your user PATH.
+        )
    )
 )

-rem Update current session PATH so codex-wrapper is immediately available
+rem Update current session PATH so codeagent-wrapper is immediately available
 set "CURPATH=;%PATH%;"
-echo !CURPATH! | findstr /I /C:"!SEARCH_EXP!" /C:"!SEARCH_EXP2!" /C:"!SEARCH_LIT!" /C:"!SEARCH_LIT2!" >nul
+set "CURPATH_CLEAN=!CURPATH:"=!"
+echo(!CURPATH_CLEAN! | findstr /I /C:"!SEARCH_EXP!" /C:"!SEARCH_EXP2!" /C:"!SEARCH_LIT!" /C:"!SEARCH_LIT2!" >nul
 if errorlevel 1 set "PATH=!DEST_DIR!;!PATH!"

 goto :cleanup
--- a/install.py
+++ b/install.py
@@ -17,7 +17,10 @@ from datetime import datetime
 from pathlib import Path
 from typing import Any, Dict, Iterable, List, Optional

-import jsonschema
+try:
+    import jsonschema
+except ImportError:  # pragma: no cover
+    jsonschema = None

 DEFAULT_INSTALL_DIR = "~/.claude"

@@ -87,6 +90,32 @@ def load_config(path: str) -> Dict[str, Any]:
    config_path = Path(path).expanduser().resolve()
    config = _load_json(config_path)

+    if jsonschema is None:
+        print(
+            "WARNING: python package 'jsonschema' is not installed; "
+            "skipping config validation. To enable validation run:\n"
+            "  python3 -m pip install jsonschema\n",
+            file=sys.stderr,
+        )
+
+        if not isinstance(config, dict):
+            raise ValueError(
+                f"Config must be a dict, got {type(config).__name__}. "
+                "Check your config.json syntax."
+            )
+
+        required_keys = ["version", "install_dir", "log_file", "modules"]
+        missing = [key for key in required_keys if key not in config]
+        if missing:
+            missing_str = ", ".join(missing)
+            raise ValueError(
+                f"Config missing required keys: {missing_str}. "
+                "Install jsonschema for better validation: "
+                "python3 -m pip install jsonschema"
+            )
+
+        return config
+
    schema_candidates = [
        config_path.parent / "config.schema.json",
        Path(__file__).resolve().with_name("config.schema.json"),
--- a/install.sh
+++ b/install.sh
@@ -34,23 +34,42 @@ if ! curl -fsSL "$URL" -o /tmp/codeagent-wrapper; then
    exit 1
 fi

-mkdir -p "$HOME/bin"
+INSTALL_DIR="${INSTALL_DIR:-$HOME/.claude}"
+BIN_DIR="${INSTALL_DIR}/bin"
+mkdir -p "$BIN_DIR"

-mv /tmp/codeagent-wrapper "$HOME/bin/codeagent-wrapper"
-chmod +x "$HOME/bin/codeagent-wrapper"
+mv /tmp/codeagent-wrapper "${BIN_DIR}/codeagent-wrapper"
+chmod +x "${BIN_DIR}/codeagent-wrapper"

-if "$HOME/bin/codeagent-wrapper" --version >/dev/null 2>&1; then
-    echo "codeagent-wrapper installed successfully to ~/bin/codeagent-wrapper"
+if "${BIN_DIR}/codeagent-wrapper" --version >/dev/null 2>&1; then
+    echo "codeagent-wrapper installed successfully to ${BIN_DIR}/codeagent-wrapper"
 else
    echo "ERROR: installation verification failed" >&2
    exit 1
 fi

-if [[ ":$PATH:" != *":$HOME/bin:"* ]]; then
+# Auto-add to shell config files with idempotency
+if [[ ":${PATH}:" != *":${BIN_DIR}:"* ]]; then
    echo ""
-    echo "WARNING: ~/bin is not in your PATH"
-    echo "Add this line to your ~/.bashrc or ~/.zshrc:"
-    echo ""
-    echo "    export PATH=\"\$HOME/bin:\$PATH\""
+    echo "WARNING: ${BIN_DIR} is not in your PATH"
+
+    # Detect shell config file
+    if [ -n "$ZSH_VERSION" ]; then
+        RC_FILE="$HOME/.zshrc"
+    else
+        RC_FILE="$HOME/.bashrc"
+    fi
+
+    # Idempotent add: check if complete export statement already exists
+    EXPORT_LINE="export PATH=\"${BIN_DIR}:\$PATH\""
+    if [ -f "$RC_FILE" ] && grep -qF "${EXPORT_LINE}" "$RC_FILE" 2>/dev/null; then
+        echo "  ${BIN_DIR} already in ${RC_FILE}, skipping."
+    else
+        echo "  Adding to ${RC_FILE}..."
+        echo "" >> "$RC_FILE"
+        echo "# Added by myclaude installer" >> "$RC_FILE"
+        echo "export PATH=\"${BIN_DIR}:\$PATH\"" >> "$RC_FILE"
+        echo "  Done. Run 'source ${RC_FILE}' or restart shell."
+    fi
    echo ""
 fi
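
The idempotent append in the `install.sh` change above (add the export line only when that exact line is not already in the rc file) can be illustrated outside the shell as well. This is a sketch, not the installer's code: the function name `ensure_path_export` and its boolean return value are assumptions made for illustration.

```python
from pathlib import Path


def ensure_path_export(rc_file: Path, bin_dir: str) -> bool:
    """Append a PATH export to rc_file unless the exact line already exists.

    Returns True when the file was modified, False when it was left alone.
    """
    export_line = f'export PATH="{bin_dir}:$PATH"'
    existing = rc_file.read_text() if rc_file.exists() else ""
    # Fixed-string containment check, comparable to `grep -qF` in the script.
    if export_line in existing:
        return False
    with rc_file.open("a") as handle:
        handle.write(f"\n# Added by myclaude installer\n{export_line}\n")
    return True


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        rc = Path(tmp) / ".bashrc"
        print(ensure_path_export(rc, "/home/user/.claude/bin"))  # first call appends
        print(ensure_path_export(rc, "/home/user/.claude/bin"))  # second call is a no-op
```

Matching the complete export line, rather than just the directory name, keeps the check from being fooled by unrelated mentions of the directory in the rc file.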
--- a/skills/codeagent/SKILL.md
+++ b/skills/codeagent/SKILL.md
@@ -74,7 +74,7 @@ codeagent-wrapper --backend gemini "simple task"
 - `task` (required): Task description, supports `@file` references
 - `working_dir` (optional): Working directory (default: current)
 - `--backend` (optional): Select AI backend (codex/claude/gemini, default: codex)
-  - **Note**: Claude backend defaults to `--dangerously-skip-permissions` for automation compatibility
+  - **Note**: Claude backend only adds `--dangerously-skip-permissions` when explicitly enabled

 ## Return Format

@@ -101,11 +101,12 @@ EOF

 ## Parallel Execution

-**With global backend**:
+**Default (summary mode - context-efficient):**
 ```bash
-codeagent-wrapper --parallel --backend claude <<'EOF'
+codeagent-wrapper --parallel <<'EOF'
 ---TASK---
 id: task1
+backend: codex
 workdir: /path/to/dir
 ---CONTENT---
 task content
@@ -117,6 +118,17 @@ dependent task
 EOF
 ```

+**Full output mode (for debugging):**
+```bash
+codeagent-wrapper --parallel --full-output <<'EOF'
+...
+EOF
+```
+
+**Output Modes:**
+- **Summary (default)**: Structured report with changes, output, verification, and review summary.
+- **Full (`--full-output`)**: Complete task messages. Use only when debugging specific failures.
+
 **With per-task backend**:
 ```bash
 codeagent-wrapper --parallel <<'EOF'
@@ -147,9 +159,9 @@ Set `CODEAGENT_MAX_PARALLEL_WORKERS` to limit concurrent tasks (default: unlimit
 ## Environment Variables

 - `CODEX_TIMEOUT`: Override timeout in milliseconds (default: 7200000 = 2 hours)
- `CODEAGENT_SKIP_PERMISSIONS`: Control permission checks
-  - For **Claude** backend: Set to `true`/`1` to **disable** `--dangerously-skip-permissions` (default: enabled)
-  - For **Codex/Gemini** backends: Set to `true`/`1` to enable permission skipping (default: disabled)
+- `CODEAGENT_SKIP_PERMISSIONS`: Control Claude CLI permission checks
+  - For **Claude** backend: Set to `true`/`1` to add `--dangerously-skip-permissions` (default: disabled)
+  - For **Codex/Gemini** backends: Currently has no effect
 - `CODEAGENT_MAX_PARALLEL_WORKERS`: Limit concurrent tasks in parallel mode (default: unlimited, recommended: 8)

 ## Invocation Pattern
@@ -182,9 +194,8 @@ Bash tool parameters:

 ## Security Best Practices

-- **Claude Backend**: Defaults to `--dangerously-skip-permissions` for automation workflows
-  - To enforce permission checks with Claude: Set `CODEAGENT_SKIP_PERMISSIONS=true`
-- **Codex/Gemini Backends**: Permission checks enabled by default
+- **Claude Backend**: Permission checks enabled by default
+  - To skip checks: set `CODEAGENT_SKIP_PERMISSIONS=true` or pass `--skip-permissions`
 - **Concurrency Limits**: Set `CODEAGENT_MAX_PARALLEL_WORKERS` in production to prevent resource exhaustion
 - **Automation Context**: This wrapper is designed for AI-driven automation where permission prompts would block execution
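
The permission rule documented after this change can be sketched in shell. This is an illustration only: `skip_permissions_flag` is a hypothetical helper, not a function in codeagent-wrapper. It models just the rule stated in the updated docs: the flag is added for the Claude backend only, and only when `CODEAGENT_SKIP_PERMISSIONS` is `true` or `1`.

```bash
# Hypothetical helper (not part of codeagent-wrapper): prints the extra
# CLI flag, if any, that the documented rule implies for a given backend.
skip_permissions_flag() {
  local backend="$1"
  local setting="${CODEAGENT_SKIP_PERMISSIONS:-}"

  # Per the updated docs, the variable has no effect for codex/gemini.
  if [ "$backend" = "claude" ]; then
    case "$setting" in
      true|1) printf '%s\n' '--dangerously-skip-permissions' ;;
    esac
  fi
  return 0
}

CODEAGENT_SKIP_PERMISSIONS=true skip_permissions_flag claude   # prints --dangerously-skip-permissions
CODEAGENT_SKIP_PERMISSIONS=true skip_permissions_flag codex    # prints nothing
```

With the variable unset, the helper prints nothing for every backend, which matches the new "default: disabled" wording.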
Author	SHA1	Message	Date
cexll	61536d04e2	Merge branch 'master' into feat/intelligent-backend-selection Merges the latest changes from master into PR #61. Conflict resolution: - dev-workflow/commands/dev.md: keep the multiSelect backend selection logic - Keep the task type field type: default\|ui\|quick-fix - Keep the backend routing strategy: default→codex, ui→gemini, quick-fix→claude - Fix the heredoc example format Merged master changes include: - codeagent-wrapper v5.4.0 structured execution report (#94) - Fix for duplicate PATH entries (#95) - ASCII mode and performance optimization - Other bug fixes and documentation updates Generated with SWE-Agent.ai Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>	2025-12-25 22:24:15 +08:00
cexll	2856bf0c29	fix(dev-workflow): refactor backend selection to multiSelect mode Fixes based on PR review feedback. Core changes: - Step 0: backend selection changed to multiSelect mode - Three independent options: codex, claude, gemini (each with a detailed description) - Simplified task classification: use a type field (default\|ui\|quick-fix) instead of the complex complexity rating - Clear backend routing logic: default→codex, ui→gemini, quick-fix→claude - User restrictions take precedence: when only codex is selected, all tasks are forced to use codex Improvements: - Remove the complexity/simple/medium/complex fields from PR #61 - Remove the rationale field, simplifying to a single type dimension - Correct the UI detection logic so it is a per-task attribute - Fallback strategy: codex → claude → gemini (clear priority order) - Error handling: a missing type defaults to default Files modified: - dev-workflow/commands/dev.md: add Step 0, update routing logic - dev-workflow/agents/dev-plan-generator.md: simplify task classification - dev-workflow/README.md: update documentation and examples Generated with SWE-Agent.ai Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>	2025-12-25 22:08:33 +08:00
cexll	683d18e6bb	docs: update troubleshooting with idempotent PATH commands (#95 ) - Use correct PATH pattern matching syntax - Explain installer auto-adds PATH - Provide idempotent command for manual use Generated with SWE-Agent.ai Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>	2025-12-25 11:40:53 +08:00
cexll	a7147f692c	fix: prevent duplicate PATH entries on reinstall (#95 ) - install.sh: Auto-detect shell and add PATH with idempotency check - install.bat: Improve PATH detection with system PATH check - Fix PATH variable quoting in pattern matching Generated with SWE-Agent.ai Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>	2025-12-25 11:38:42 +08:00
cexll	b71d74f01f	fix: Minor issues #12 and #13 - ASCII mode and performance optimization This commit addresses the remaining Minor issues from PR #94 code review: Minor #12: Unicode Symbol Compatibility - Added CODEAGENT_ASCII_MODE environment variable support - When set to "true", uses ASCII symbols: PASS/WARN/FAIL - Default behavior (unset or "false"): Unicode symbols ✓/⚠️/✗ - Updated help text to document the environment variable - Added tests for both ASCII and Unicode modes Implementation: - executor.go:514: New getStatusSymbols() function - executor.go:531: Dynamic symbol selection in generateFinalOutputWithMode - main.go:34: useASCIIMode variable declaration - main.go:495: Environment variable documentation in help - executor_concurrent_test.go:292: Tests for ASCII mode - main_integration_test.go:89: Parser updated for both symbol formats Minor #13: Performance Optimization - Reduce Repeated String Operations - Optimized Message parsing to split only once per task result - Added FromLines() variants of all extractor functions - Original extract() functions now wrap FromLines() for compatibility - Reduces memory allocations and CPU usage in parallel execution Implementation: - utils.go:300: extractCoverageFromLines() - utils.go:390: extractFilesChangedFromLines() - utils.go:455: extractTestResultsFromLines() - utils.go:551: extractKeyOutputFromLines() - main.go:255: Single split with reuse: lines := strings.Split(...) Backward Compatibility: - All original extract() functions preserved - Tests updated to handle both symbol formats - No breaking changes to public API Test Results: - All tests pass: go test ./... (40.164s) - ASCII mode verified: PASS/WARN/FAIL symbols display correctly - Unicode mode verified: ✓/⚠️/✗ symbols remain default - Performance: Single split per Message instead of 4+ Usage Examples: # Unicode mode (default) ./codeagent-wrapper --parallel < tasks.txt # ASCII mode (for terminals without Unicode support) CODEAGENT_ASCII_MODE=true ./codeagent-wrapper --parallel < tasks.txt Benefits: - Improved terminal compatibility across different environments - Reduced memory allocations in parallel execution - Better performance for large-scale parallel tasks - User choice between Unicode aesthetics and ASCII compatibility Related: #94 Generated with SWE-Agent.ai Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>	2025-12-24 11:59:00 +08:00
cexll	af1c860f54	fix: code review fixes for PR #94 - all critical and major issues resolved This commit addresses all Critical and Major issues identified in the code review: Critical Issues Fixed: - #1: Test statistics data loss (utils.go:480) - Changed exit condition from \|\| to && - #2: Below-target header showing "below 0%" - Added defaultCoverageTarget constant Major Issues Fixed: - #3: Coverage extraction not robust - Relaxed trigger conditions for various formats - #4: 0% coverage ignored - Changed from CoverageNum>0 to Coverage!="" check - #5: File change extraction incomplete - Support root files and @ prefix - #6: String truncation panic risk - Added safeTruncate() with rune-based truncation - #7: Breaking change documentation missing - Updated help text and docs - #8: .DS_Store garbage files - Removed files and updated .gitignore - #9: Test coverage insufficient - Added 29+ test cases in utils_test.go - #10: Terminal escape injection risk - Added sanitizeOutput() for ANSI cleaning - #11: Redundant code - Removed unused patterns variable Test Results: - All tests pass: go test ./... (34.283s) - Test coverage: 88.4% (up from ~85%) - New test file: codeagent-wrapper/utils_test.go - No breaking changes to existing functionality Files Modified: - codeagent-wrapper/utils.go (+166 lines) - Core fixes and new functions - codeagent-wrapper/executor.go (+111 lines) - Output format fixes - codeagent-wrapper/main.go (+45 lines) - Configuration updates - codeagent-wrapper/main_test.go (+40 lines) - New integration tests - codeagent-wrapper/utils_test.go (new file) - Complete extractor tests - docs/CODEAGENT-WRAPPER.md (+38 lines) - Documentation updates - .gitignore (+2 lines) - Added .DS_Store patterns - Deleted 5 .DS_Store files Verification: - Binary compiles successfully (v5.4.0) - All extractors validated with real-world test cases - Security vulnerabilities patched - Performance maintained (90% token reduction preserved) Related: #94 Generated with SWE-Agent.ai Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>	2025-12-24 09:55:39 +08:00
tytsxai	70b1896011	feat(codeagent-wrapper): v5.4.0 structured execution report (#94 ) Merging PR #94 with code review fixes applied. All Critical and Major issues from code review have been addressed: - 11/13 issues fixed (2 minor optimizations deferred) - Test coverage: 88.4% - All tests passing - Security vulnerabilities patched - Documentation updated The code review fixes have been committed to pr-94 branch and are ready for integration.	2025-12-24 09:53:58 +08:00
cexll	3fd3c67749	fix: correct settings.json filename and bump version to v5.2.8 - Fix incorrect filename reference from setting.json to settings.json in backend.go - Update corresponding test fixtures to use correct filename - Bump version from 5.2.7 to 5.2.8 Generated with SWE-Agent.ai Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>	2025-12-22 10:32:44 +08:00
cexll	156a072a0b	chore: simplify release workflow to use GitHub auto-generated notes - Remove git-cliff dependency and node.js setup - Use generate_release_notes: true for automatic PR/commit listing - Maintains all binary builds and artifact uploads - Release notes can still be manually edited after creation Benefits: - Simpler workflow with fewer dependencies - Automatic PR titles and contributor attribution - Easier to maintain and debug Generated with SWE-Agent.ai Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>	2025-12-21 20:37:11 +08:00
cexll	0ceb819419	chore: bump version to v5.2.7 Changes in v5.2.7: - Security fix: pass env vars via process environment instead of command line - Prevents ANTHROPIC_API_KEY leakage in ps/logs - Add SetEnv() interface to commandRunner - Type-safe env parsing with 1MB file size limit - Comprehensive test coverage for loadMinimalEnvSettings() Related: #89, PR #92 Generated with SWE-Agent.ai Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>	2025-12-21 20:25:23 +08:00
ben	4d69c8aef1	fix: allow claude backend to read env from setting.json while preventing recursion (#92) * fix: allow claude backend to read env from setting.json while preventing recursion Fixes #89 Problem: - --setting-sources "" prevents claude from reading ~/.claude/setting.json env - Removing it causes infinite recursion via skills/commands/agents loading Solution: - Keep --setting-sources "" to block all config sources - Add loadMinimalEnvSettings() to extract only env from setting.json - Pass env explicitly via --settings parameter - Update tests to validate dynamic --settings parameter Benefits: - Claude backend can access ANTHROPIC_API_KEY and other env vars - Skills/commands/agents remain blocked, preventing recursion - Graceful degradation if setting.json doesn't exist Generated with SWE-Agent.ai Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai> * security: pass env via process environment instead of command line Critical security fix for issue #89: - Prevents ANTHROPIC_API_KEY leakage in process command line (ps) - Prevents sensitive values from being logged in wrapper logs Changes: 1. executor.go: - Add SetEnv() method to commandRunner interface - realCmd merges env with os.Environ() and sets to cmd.Env - All test mocks implement SetEnv() 2. backend.go: - Change loadMinimalEnvSettings() to return map[string]string - Use os.UserHomeDir() instead of os.Getenv("HOME") - Add 1MB file size limit check - Only accept string values in env (reject non-strings) - Remove --settings parameter (no longer in command line) 3. Tests: - Add loadMinimalEnvSettings() unit tests - Remove --settings validation (no longer in args) - All test mocks implement SetEnv() Security improvements: - No sensitive values in argv (safe from ps/logs) - Type-safe env parsing (string-only) - File size limit prevents memory issues - Graceful degradation if setting.json missing Tests: All pass (30.912s) Generated with SWE-Agent.ai Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai> --------- Co-authored-by: SWE-Agent.ai <noreply@swe-agent.ai>	2025-12-21 20:16:57 +08:00
ben	eec844d850	feat: add millisecond-precision timestamps to all log entries (#91 ) - Add timestamp prefix format [YYYY-MM-DD HH:MM:SS.mmm] to every log entry - Resolves issue where logs lacked time information, making it impossible to determine when events (like "Unknown event format" errors) occurred - Update tests to handle new timestamp format by stripping prefixes during validation - All 27+ tests pass with new format Implementation: - Modified logger.go:369-370 to inject timestamp before message - Updated concurrent_stress_test.go to strip timestamps for format checks Fixes #81 Generated with SWE-Agent.ai Co-authored-by: SWE-Agent.ai <noreply@swe-agent.ai>	2025-12-21 18:57:27 +08:00
ben	1f42bcc1c6	fix: comprehensive security and quality improvements for PR #85 & #87 (#90 ) Co-authored-by: tytsxai <tytsxai@users.noreply.github.com>	2025-12-21 18:01:20 +08:00
ben	0f359b048f	Improve backend termination after message and extend timeout (#86 ) * Improve backend termination after message and extend timeout * fix: prevent premature backend termination and revert timeout Critical fixes for executor.go termination logic: 1. Add onComplete callback to prevent premature termination - Parser now distinguishes between "any message" (onMessage) and "terminal event" (onComplete) - Codex: triggers onComplete on thread.completed - Claude: triggers onComplete on type:"result" - Gemini: triggers onComplete on type:"result" + terminal status 2. Fix executor to wait for completion events - Replace messageSeen termination trigger with completeSeen - Only start postMessageTerminateDelay after terminal event - Prevents killing backend before final answer in multi-message scenarios 3. Fix terminated flag synchronization - Only set terminated=true if terminateCommandFn actually succeeds - Prevents "marked as terminated but not actually terminated" state 4. Simplify timer cleanup logic - Unified non-blocking drain on messageTimer.C - Remove dependency on messageTimerCh nil state 5. Revert defaultTimeout from 24h to 2h - 24h (86400s) → 2h (7200s) to avoid operational risks - 12× timeout increase could cause resource exhaustion - Users needing longer tasks can use CODEX_TIMEOUT env var All tests pass. Resolves early termination bug from code review. Co-authored-by: Codeagent (Codex) Generated with SWE-Agent.ai Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai> --------- Co-authored-by: SWE-Agent.ai <noreply@swe-agent.ai>	2025-12-21 15:55:01 +08:00
ben	4e2df6a80e	fix: Parser duplicate-parsing optimization + critical bug fixes + PR #86 compatibility (#88) Merging parser optimization with critical bug fixes and PR #86 compatibility. Supersedes #84.	2025-12-21 14:10:40 +08:00
swe-agent[bot]	19facf3385	feat(dev-workflow): Add intelligent backend selection based on task complexity ## Changes ### Core Improvements 1. Flexible Task Count: Remove 2-5 hard limit, use natural functional boundaries (typically 2-8) 2. Complexity-Based Routing: Tasks rated as simple/medium/complex based on functional requirements 3. Intelligent Backend Selection: Orchestrator auto-selects backend based on complexity - Simple/Medium → claude (fast, cost-effective) - Complex → codex (deep reasoning) - UI → gemini (enforced) ### Modified Files - `dev-workflow/agents/dev-plan-generator.md`: - Add complexity field to task template - Add comprehensive complexity assessment guide - Update quality checks to include complexity validation - Remove artificial task count limits - `dev-workflow/commands/dev.md`: - Add backend selection logic in Step 4 - Update task breakdown to include complexity ratings - Add detailed examples for each backend type - Update quality standards - `dev-workflow/README.md`: - Update documentation to reflect intelligent backend selection - Add complexity-based routing explanation - Update examples with complexity ratings ## Architecture - No changes to codeagent-wrapper (all logic in orchestrator) - Backward compatible (existing workflows continue to work) - Complexity evaluation based on functional requirements, NOT code volume ## Benefits - Better resource utilization (use claude for most tasks, codex for complex ones) - Cost optimization (avoid using expensive codex for simple tasks) - Flexibility (no artificial limits on task count) - Clear complexity rationale for each task Generated with swe-agent-bot Co-Authored-By: swe-agent-bot <agent@swe-agent.ai>	2025-12-14 21:57:13 +08:00
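
The type-based routing described in commit 2856bf0c29 (default→codex, ui→gemini, quick-fix→claude, fallback order codex → claude → gemini, with the user's backend selection taking precedence) can be sketched as a small shell function. `route_backend` and its argument order are hypothetical; according to the commit messages the actual routing lives in the orchestrator instructions (dev-workflow/commands/dev.md), not in shell code. Treating an unrecognised type the same as `default` is an assumption; the commit only states that a missing type defaults to `default`.

```bash
# Hypothetical sketch: route_backend TYPE ALLOWED_BACKEND...
# Prints the backend for a task of the given type, restricted to the
# backends the user selected. Returns 1 if no backend is allowed.
route_backend() {
  local task_type="${1:-default}"   # a missing or empty type falls back to "default"
  if [ "$#" -gt 0 ]; then shift; fi
  local allowed=" $* "              # space-padded list for whole-word matching

  local preferred
  case "$task_type" in
    ui)        preferred=gemini ;;
    quick-fix) preferred=claude ;;
    *)         preferred=codex ;;   # assumption: unknown types behave like default
  esac

  # Try the preferred backend first, then the documented fallback order.
  local candidate
  for candidate in "$preferred" codex claude gemini; do
    case "$allowed" in
      *" $candidate "*) printf '%s\n' "$candidate"; return 0 ;;
    esac
  done
  return 1
}

route_backend ui codex claude gemini      # prints gemini
route_backend ui codex                    # prints codex (only codex was selected)
route_backend quick-fix codex gemini      # prints codex (claude not selected, first fallback)
```

Checking the user's selection inside the loop, rather than before it, is what makes the restriction win over the type preference: a `ui` task still lands on codex when codex is the only backend chosen.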