Compare commits

..

16 Commits

Author SHA1 Message Date
cexll
61536d04e2 Merge branch 'master' into feat/intelligent-backend-selection
合并 master 分支的最新改动到 PR #61。

冲突解决:
- dev-workflow/commands/dev.md: 保留 multiSelect backend 选择逻辑
- 保留任务类型字段 type: default|ui|quick-fix
- 保留 Backend 路由策略:default→codex, ui→gemini, quick-fix→claude
- 修复 heredoc 示例格式

合并的 master 改动包括:
- codeagent-wrapper v5.4.0 structured execution report (#94)
- 修复 PATH 重复条目问题 (#95)
- ASCII 模式和性能优化
- 其他 bug 修复和文档更新

Generated with SWE-Agent.ai

Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>
2025-12-25 22:24:15 +08:00
cexll
2856bf0c29 fix(dev-workflow): refactor backend selection to multiSelect mode
根据 PR review 反馈进行修复:

核心改动:
- Step 0: backend 选择改为 multiSelect 多选模式
- 三个独立选项:codex、claude、gemini(每个带详细说明)
- 简化任务分类:使用 type 字段(default|ui|quick-fix)替代复杂的 complexity 评级
- Backend 路由逻辑清晰:default→codex, ui→gemini, quick-fix→claude
- 用户限制优先:仅选 codex 时强制所有任务使用 codex

改进点:
- 移除 PR#61 的 complexity/simple/medium/complex 字段
- 移除 rationale 字段,简化为单一 type 维度
- 修正 UI 判定逻辑,改为每任务属性
- Fallback 策略:codex → claude → gemini(优先级清晰)
- 错误处理:type 缺失默认为 default

文件修改:
- dev-workflow/commands/dev.md: 添加 Step 0,更新路由逻辑
- dev-workflow/agents/dev-plan-generator.md: 简化任务分类
- dev-workflow/README.md: 更新文档和示例

Generated with SWE-Agent.ai

Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>
2025-12-25 22:08:33 +08:00
cexll
683d18e6bb docs: update troubleshooting with idempotent PATH commands (#95)
- Use correct PATH pattern matching syntax
- Explain installer auto-adds PATH
- Provide idempotent command for manual use

Generated with SWE-Agent.ai

Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>
2025-12-25 11:40:53 +08:00
cexll
a7147f692c fix: prevent duplicate PATH entries on reinstall (#95)
- install.sh: Auto-detect shell and add PATH with idempotency check
- install.bat: Improve PATH detection with system PATH check
- Fix PATH variable quoting in pattern matching

Generated with SWE-Agent.ai

Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>
2025-12-25 11:38:42 +08:00
cexll
b71d74f01f fix: Minor issues #12 and #13 - ASCII mode and performance optimization
This commit addresses the remaining Minor issues from PR #94 code review:

Minor #12: Unicode Symbol Compatibility
- Added CODEAGENT_ASCII_MODE environment variable support
- When set to "true", uses ASCII symbols: PASS/WARN/FAIL
- Default behavior (unset or "false"): Unicode symbols ✓/⚠️/✗
- Updated help text to document the environment variable
- Added tests for both ASCII and Unicode modes

Implementation:
- executor.go:514: New getStatusSymbols() function
- executor.go:531: Dynamic symbol selection in generateFinalOutputWithMode
- main.go:34: useASCIIMode variable declaration
- main.go:495: Environment variable documentation in help
- executor_concurrent_test.go:292: Tests for ASCII mode
- main_integration_test.go:89: Parser updated for both symbol formats

Minor #13: Performance Optimization - Reduce Repeated String Operations
- Optimized Message parsing to split only once per task result
- Added *FromLines() variants of all extractor functions
- Original extract*() functions now wrap *FromLines() for compatibility
- Reduces memory allocations and CPU usage in parallel execution

Implementation:
- utils.go:300: extractCoverageFromLines()
- utils.go:390: extractFilesChangedFromLines()
- utils.go:455: extractTestResultsFromLines()
- utils.go:551: extractKeyOutputFromLines()
- main.go:255: Single split with reuse: lines := strings.Split(...)

Backward Compatibility:
- All original extract*() functions preserved
- Tests updated to handle both symbol formats
- No breaking changes to public API

Test Results:
- All tests pass: go test ./... (40.164s)
- ASCII mode verified: PASS/WARN/FAIL symbols display correctly
- Unicode mode verified: ✓/⚠️/✗ symbols remain default
- Performance: Single split per Message instead of 4+

Usage Examples:
  # Unicode mode (default)
  ./codeagent-wrapper --parallel < tasks.txt

  # ASCII mode (for terminals without Unicode support)
  CODEAGENT_ASCII_MODE=true ./codeagent-wrapper --parallel < tasks.txt

Benefits:
- Improved terminal compatibility across different environments
- Reduced memory allocations in parallel execution
- Better performance for large-scale parallel tasks
- User choice between Unicode aesthetics and ASCII compatibility

Related: #94

Generated with SWE-Agent.ai

Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>
2025-12-24 11:59:00 +08:00
cexll
af1c860f54 fix: code review fixes for PR #94 - all critical and major issues resolved
This commit addresses all Critical and Major issues identified in the code review:

Critical Issues Fixed:
- #1: Test statistics data loss (utils.go:480) - Changed exit condition from || to &&
- #2: Below-target header showing "below 0%" - Added defaultCoverageTarget constant

Major Issues Fixed:
- #3: Coverage extraction not robust - Relaxed trigger conditions for various formats
- #4: 0% coverage ignored - Changed from CoverageNum>0 to Coverage!="" check
- #5: File change extraction incomplete - Support root files and @ prefix
- #6: String truncation panic risk - Added safeTruncate() with rune-based truncation
- #7: Breaking change documentation missing - Updated help text and docs
- #8: .DS_Store garbage files - Removed files and updated .gitignore
- #9: Test coverage insufficient - Added 29+ test cases in utils_test.go
- #10: Terminal escape injection risk - Added sanitizeOutput() for ANSI cleaning
- #11: Redundant code - Removed unused patterns variable

Test Results:
- All tests pass: go test ./... (34.283s)
- Test coverage: 88.4% (up from ~85%)
- New test file: codeagent-wrapper/utils_test.go
- No breaking changes to existing functionality

Files Modified:
- codeagent-wrapper/utils.go (+166 lines) - Core fixes and new functions
- codeagent-wrapper/executor.go (+111 lines) - Output format fixes
- codeagent-wrapper/main.go (+45 lines) - Configuration updates
- codeagent-wrapper/main_test.go (+40 lines) - New integration tests
- codeagent-wrapper/utils_test.go (new file) - Complete extractor tests
- docs/CODEAGENT-WRAPPER.md (+38 lines) - Documentation updates
- .gitignore (+2 lines) - Added .DS_Store patterns
- Deleted 5 .DS_Store files

Verification:
- Binary compiles successfully (v5.4.0)
- All extractors validated with real-world test cases
- Security vulnerabilities patched
- Performance maintained (90% token reduction preserved)

Related: #94

Generated with SWE-Agent.ai

Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>
2025-12-24 09:55:39 +08:00
tytsxai
70b1896011 feat(codeagent-wrapper): v5.4.0 structured execution report (#94)
Merging PR #94 with code review fixes applied.

All Critical and Major issues from code review have been addressed:
- 11/13 issues fixed (2 minor optimizations deferred)
- Test coverage: 88.4%
- All tests passing
- Security vulnerabilities patched
- Documentation updated

The code review fixes have been committed to pr-94 branch and are ready for integration.
2025-12-24 09:53:58 +08:00
cexll
3fd3c67749 fix: correct settings.json filename and bump version to v5.2.8
- Fix incorrect filename reference from setting.json to settings.json in backend.go
- Update corresponding test fixtures to use correct filename
- Bump version from 5.2.7 to 5.2.8

Generated with SWE-Agent.ai

Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>
2025-12-22 10:32:44 +08:00
cexll
156a072a0b chore: simplify release workflow to use GitHub auto-generated notes
- Remove git-cliff dependency and node.js setup
- Use generate_release_notes: true for automatic PR/commit listing
- Maintains all binary builds and artifact uploads
- Release notes can still be manually edited after creation

Benefits:
- Simpler workflow with fewer dependencies
- Automatic PR titles and contributor attribution
- Easier to maintain and debug

Generated with SWE-Agent.ai

Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>
2025-12-21 20:37:11 +08:00
cexll
0ceb819419 chore: bump version to v5.2.7
Changes in v5.2.7:
- Security fix: pass env vars via process environment instead of command line
- Prevents ANTHROPIC_API_KEY leakage in ps/logs
- Add SetEnv() interface to commandRunner
- Type-safe env parsing with 1MB file size limit
- Comprehensive test coverage for loadMinimalEnvSettings()

Related: #89, PR #92

Generated with SWE-Agent.ai

Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>
2025-12-21 20:25:23 +08:00
ben
4d69c8aef1 fix: allow claude backend to read env from setting.json while preventing recursion (#92)
* fix: allow claude backend to read env from setting.json while preventing recursion

Fixes #89

Problem:
- --setting-sources "" prevents claude from reading ~/.claude/setting.json env
- Removing it causes infinite recursion via skills/commands/agents loading

Solution:
- Keep --setting-sources "" to block all config sources
- Add loadMinimalEnvSettings() to extract only env from setting.json
- Pass env explicitly via --settings parameter
- Update tests to validate dynamic --settings parameter

Benefits:
- Claude backend can access ANTHROPIC_API_KEY and other env vars
- Skills/commands/agents remain blocked, preventing recursion
- Graceful degradation if setting.json doesn't exist

Generated with SWE-Agent.ai

Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>

* security: pass env via process environment instead of command line

Critical security fix for issue #89:
- Prevents ANTHROPIC_API_KEY leakage in process command line (ps)
- Prevents sensitive values from being logged in wrapper logs

Changes:
1. executor.go:
   - Add SetEnv() method to commandRunner interface
   - realCmd merges env with os.Environ() and sets to cmd.Env
   - All test mocks implement SetEnv()

2. backend.go:
   - Change loadMinimalEnvSettings() to return map[string]string
   - Use os.UserHomeDir() instead of os.Getenv("HOME")
   - Add 1MB file size limit check
   - Only accept string values in env (reject non-strings)
   - Remove --settings parameter (no longer in command line)

3. Tests:
   - Add loadMinimalEnvSettings() unit tests
   - Remove --settings validation (no longer in args)
   - All test mocks implement SetEnv()

Security improvements:
- No sensitive values in argv (safe from ps/logs)
- Type-safe env parsing (string-only)
- File size limit prevents memory issues
- Graceful degradation if setting.json missing

Tests: All pass (30.912s)

Generated with SWE-Agent.ai

Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>

---------

Co-authored-by: SWE-Agent.ai <noreply@swe-agent.ai>
2025-12-21 20:16:57 +08:00
ben
eec844d850 feat: add millisecond-precision timestamps to all log entries (#91)
- Add timestamp prefix format [YYYY-MM-DD HH:MM:SS.mmm] to every log entry
- Resolves issue where logs lacked time information, making it impossible to determine when events (like "Unknown event format" errors) occurred
- Update tests to handle new timestamp format by stripping prefixes during validation
- All 27+ tests pass with new format

Implementation:
- Modified logger.go:369-370 to inject timestamp before message
- Updated concurrent_stress_test.go to strip timestamps for format checks

Fixes #81

Generated with SWE-Agent.ai

Co-authored-by: SWE-Agent.ai <noreply@swe-agent.ai>
2025-12-21 18:57:27 +08:00
ben
1f42bcc1c6 fix: comprehensive security and quality improvements for PR #85 & #87 (#90)
Co-authored-by: tytsxai <tytsxai@users.noreply.github.com>
2025-12-21 18:01:20 +08:00
ben
0f359b048f Improve backend termination after message and extend timeout (#86)
* Improve backend termination after message and extend timeout

* fix: prevent premature backend termination and revert timeout

Critical fixes for executor.go termination logic:

1. Add onComplete callback to prevent premature termination
   - Parser now distinguishes between "any message" (onMessage) and
     "terminal event" (onComplete)
   - Codex: triggers onComplete on thread.completed
   - Claude: triggers onComplete on type:"result"
   - Gemini: triggers onComplete on type:"result" + terminal status

2. Fix executor to wait for completion events
   - Replace messageSeen termination trigger with completeSeen
   - Only start postMessageTerminateDelay after terminal event
   - Prevents killing backend before final answer in multi-message scenarios

3. Fix terminated flag synchronization
   - Only set terminated=true if terminateCommandFn actually succeeds
   - Prevents "marked as terminated but not actually terminated" state

4. Simplify timer cleanup logic
   - Unified non-blocking drain on messageTimer.C
   - Remove dependency on messageTimerCh nil state

5. Revert defaultTimeout from 24h to 2h
   - 24h (86400s) → 2h (7200s) to avoid operational risks
   - 12× timeout increase could cause resource exhaustion
   - Users needing longer tasks can use CODEX_TIMEOUT env var

All tests pass. Resolves early termination bug from code review.

Co-authored-by: Codeagent (Codex)

Generated with SWE-Agent.ai

Co-Authored-By: SWE-Agent.ai <noreply@swe-agent.ai>

---------

Co-authored-by: SWE-Agent.ai <noreply@swe-agent.ai>
2025-12-21 15:55:01 +08:00
ben
4e2df6a80e fix: Parser重复解析优化 + 严重bug修复 + PR #86兼容性 (#88)
Merging parser optimization with critical bug fixes and PR #86 compatibility. Supersedes #84.
2025-12-21 14:10:40 +08:00
swe-agent[bot]
19facf3385 feat(dev-workflow): Add intelligent backend selection based on task complexity
## Changes

### Core Improvements
1. **Flexible Task Count**: Remove 2-5 hard limit, use natural functional boundaries (typically 2-8)
2. **Complexity-Based Routing**: Tasks rated as simple/medium/complex based on functional requirements
3. **Intelligent Backend Selection**: Orchestrator auto-selects backend based on complexity
   - Simple/Medium → claude (fast, cost-effective)
   - Complex → codex (deep reasoning)
   - UI → gemini (enforced)

### Modified Files
- `dev-workflow/agents/dev-plan-generator.md`:
  - Add complexity field to task template
  - Add comprehensive complexity assessment guide
  - Update quality checks to include complexity validation
  - Remove artificial task count limits

- `dev-workflow/commands/dev.md`:
  - Add backend selection logic in Step 4
  - Update task breakdown to include complexity ratings
  - Add detailed examples for each backend type
  - Update quality standards

- `dev-workflow/README.md`:
  - Update documentation to reflect intelligent backend selection
  - Add complexity-based routing explanation
  - Update examples with complexity ratings

## Architecture
- No changes to codeagent-wrapper (all logic in orchestrator)
- Backward compatible (existing workflows continue to work)
- Complexity evaluation based on functional requirements, NOT code volume

## Benefits
- Better resource utilization (use claude for most tasks, codex for complex ones)
- Cost optimization (avoid using expensive codex for simple tasks)
- Flexibility (no artificial limits on task count)
- Clear complexity rationale for each task

Generated with swe-agent-bot

Co-Authored-By: swe-agent-bot <agent@swe-agent.ai>
2025-12-14 21:57:13 +08:00
27 changed files with 2330 additions and 287 deletions

View File

@@ -97,11 +97,6 @@ jobs:
with: with:
path: artifacts path: artifacts
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: '20'
- name: Prepare release files - name: Prepare release files
run: | run: |
mkdir -p release mkdir -p release
@@ -109,26 +104,10 @@ jobs:
cp install.sh install.bat release/ cp install.sh install.bat release/
ls -la release/ ls -la release/
- name: Generate release notes with git-cliff
run: |
# Install git-cliff via npx
npx git-cliff@latest --current --strip all -o release_notes.md
# Fallback if generation failed
if [ ! -s release_notes.md ]; then
echo "⚠️ Failed to generate release notes with git-cliff" > release_notes.md
echo "" >> release_notes.md
echo "## What's Changed" >> release_notes.md
echo "See commits in this release for details." >> release_notes.md
fi
echo "--- Generated Release Notes ---"
cat release_notes.md
- name: Create Release - name: Create Release
uses: softprops/action-gh-release@v2 uses: softprops/action-gh-release@v2
with: with:
files: release/* files: release/*
body_path: release_notes.md generate_release_notes: true
draft: false draft: false
prerelease: false prerelease: false

2
.gitignore vendored
View File

@@ -1,5 +1,7 @@
.claude/ .claude/
.claude-trace .claude-trace
.DS_Store
**/.DS_Store
.venv .venv
.pytest_cache .pytest_cache
__pycache__ __pycache__

163
README.md
View File

@@ -132,6 +132,59 @@ Requirements → Architecture → Sprint Plan → Development → Review → QA
--- ---
## Version Requirements
### Codex CLI
**Minimum version:** Check compatibility with your installation
The codeagent-wrapper uses these Codex CLI features:
- `codex e` - Execute commands (shorthand for `codex exec`)
- `--skip-git-repo-check` - Skip git repository validation
- `--json` - JSON stream output format
- `-C <workdir>` - Set working directory
- `resume <session_id>` - Resume previous sessions
**Verify Codex CLI is installed:**
```bash
which codex
codex --version
```
### Claude CLI
**Minimum version:** Check compatibility with your installation
Required features:
- `--output-format stream-json` - Streaming JSON output format
- `--setting-sources` - Control setting sources (prevents infinite recursion)
- `--dangerously-skip-permissions` - Skip permission prompts (use with caution)
- `-p` - Prompt input flag
- `-r <session_id>` - Resume sessions
**Security Note:** The wrapper only adds `--dangerously-skip-permissions` for Claude when explicitly enabled (e.g. `--skip-permissions` / `CODEAGENT_SKIP_PERMISSIONS=true`). Keep it disabled unless you understand the risk.
**Verify Claude CLI is installed:**
```bash
which claude
claude --version
```
### Gemini CLI
**Minimum version:** Check compatibility with your installation
Required features:
- `-o stream-json` - JSON stream output format
- `-y` - Auto-approve prompts (non-interactive mode)
- `-r <session_id>` - Resume sessions
- `-p` - Prompt input flag
**Verify Gemini CLI is installed:**
```bash
which gemini
gemini --version
```
---
## Installation ## Installation
### Modular Installation (Recommended) ### Modular Installation (Recommended)
@@ -163,15 +216,39 @@ python3 install.py --force
``` ```
~/.claude/ ~/.claude/
├── CLAUDE.md # Core instructions and role definition ├── bin/
├── commands/ # Slash commands (/dev, /code, etc.) │ └── codeagent-wrapper # Main executable
├── agents/ # Agent definitions ├── CLAUDE.md # Core instructions and role definition
├── commands/ # Slash commands (/dev, /code, etc.)
├── agents/ # Agent definitions
├── skills/ ├── skills/
│ └── codex/ │ └── codex/
│ └── SKILL.md # Codex integration skill │ └── SKILL.md # Codex integration skill
── installed_modules.json # Installation status ── config.json # Configuration
└── installed_modules.json # Installation status
``` ```
### Customizing Installation Directory
By default, myclaude installs to `~/.claude`. You can customize this using the `INSTALL_DIR` environment variable:
```bash
# Install to custom directory
INSTALL_DIR=/opt/myclaude bash install.sh
# Update your PATH accordingly
export PATH="/opt/myclaude/bin:$PATH"
```
**Directory Structure:**
- `$INSTALL_DIR/bin/` - codeagent-wrapper binary
- `$INSTALL_DIR/skills/` - Skill definitions
- `$INSTALL_DIR/config.json` - Configuration file
- `$INSTALL_DIR/commands/` - Slash command definitions
- `$INSTALL_DIR/agents/` - Agent definitions
**Note:** When using a custom installation directory, ensure that `$INSTALL_DIR/bin` is added to your `PATH` environment variable.
### Configuration ### Configuration
Edit `config.json` to customize: Edit `config.json` to customize:
@@ -294,11 +371,14 @@ setx PATH "%USERPROFILE%\bin;%PATH%"
**Codex wrapper not found:** **Codex wrapper not found:**
```bash ```bash
# Check PATH # Installer auto-adds PATH, check if configured
echo $PATH | grep -q "$HOME/bin" || echo 'export PATH="$HOME/bin:$PATH"' >> ~/.zshrc if [[ ":$PATH:" != *":$HOME/.claude/bin:"* ]]; then
echo "PATH not configured. Reinstalling..."
bash install.sh
fi
# Reinstall # Or manually add (idempotent command)
bash install.sh [[ ":$PATH:" != *":$HOME/.claude/bin:"* ]] && echo 'export PATH="$HOME/.claude/bin:$PATH"' >> ~/.zshrc
``` ```
**Permission denied:** **Permission denied:**
@@ -315,6 +395,71 @@ cat ~/.claude/installed_modules.json
python3 install.py --module dev --force python3 install.py --module dev --force
``` ```
### Version Compatibility Issues
**Backend CLI not found:**
```bash
# Check if backend CLIs are installed
which codex
which claude
which gemini
# Install missing backends
# Codex: Follow installation instructions at https://codex.docs
# Claude: Follow installation instructions at https://claude.ai/docs
# Gemini: Follow installation instructions at https://ai.google.dev/docs
```
**Unsupported CLI flags:**
```bash
# If you see errors like "unknown flag" or "invalid option"
# Check backend CLI version
codex --version
claude --version
gemini --version
# For Codex: Ensure it supports `e`, `--skip-git-repo-check`, `--json`, `-C`, and `resume`
# For Claude: Ensure it supports `--output-format stream-json`, `--setting-sources`, `-r`
# For Gemini: Ensure it supports `-o stream-json`, `-y`, `-r`, `-p`
# Update your backend CLI to the latest version if needed
```
**JSON parsing errors:**
```bash
# If you see "failed to parse JSON output" errors
# Verify the backend outputs stream-json format
codex e --json "test task" # Should output newline-delimited JSON
claude --output-format stream-json -p "test" # Should output stream JSON
# If not, your backend CLI version may be too old or incompatible
```
**Infinite recursion with Claude backend:**
```bash
# The wrapper prevents this with `--setting-sources ""` flag
# If you still see recursion, ensure your Claude CLI supports this flag
claude --help | grep "setting-sources"
# If flag is not supported, upgrade Claude CLI
```
**Session resume failures:**
```bash
# Check if session ID is valid
codex history # List recent sessions
claude history
# Ensure backend CLI supports session resumption
codex resume <session_id> "test" # Should continue from previous session
claude -r <session_id> "test"
# If not supported, use new sessions instead of resume mode
```
--- ---
## Documentation ## Documentation

View File

@@ -152,15 +152,39 @@ python3 install.py --force
``` ```
~/.claude/ ~/.claude/
├── CLAUDE.md # 核心指令和角色定义 ├── bin/
├── commands/ # 斜杠命令 (/dev, /code 等) │ └── codeagent-wrapper # 主可执行文件
├── agents/ # 智能体定义 ├── CLAUDE.md # 核心指令和角色定义
├── commands/ # 斜杠命令 (/dev, /code 等)
├── agents/ # 智能体定义
├── skills/ ├── skills/
│ └── codex/ │ └── codex/
│ └── SKILL.md # Codex 集成技能 │ └── SKILL.md # Codex 集成技能
── installed_modules.json # 安装状态 ── config.json # 配置文件
└── installed_modules.json # 安装状态
``` ```
### 自定义安装目录
默认情况下myclaude 安装到 `~/.claude`。您可以使用 `INSTALL_DIR` 环境变量自定义安装目录:
```bash
# 安装到自定义目录
INSTALL_DIR=/opt/myclaude bash install.sh
# 相应更新您的 PATH
export PATH="/opt/myclaude/bin:$PATH"
```
**目录结构:**
- `$INSTALL_DIR/bin/` - codeagent-wrapper 可执行文件
- `$INSTALL_DIR/skills/` - 技能定义
- `$INSTALL_DIR/config.json` - 配置文件
- `$INSTALL_DIR/commands/` - 斜杠命令定义
- `$INSTALL_DIR/agents/` - 智能体定义
**注意:** 使用自定义安装目录时,请确保将 `$INSTALL_DIR/bin` 添加到您的 `PATH` 环境变量中。
### 配置 ### 配置
编辑 `config.json` 自定义: 编辑 `config.json` 自定义:
@@ -283,11 +307,14 @@ setx PATH "%USERPROFILE%\bin;%PATH%"
**Codex wrapper 未找到:** **Codex wrapper 未找到:**
```bash ```bash
# 检查 PATH # 安装程序会自动添加 PATH检查是否已添加
echo $PATH | grep -q "$HOME/bin" || echo 'export PATH="$HOME/bin:$PATH"' >> ~/.zshrc if [[ ":$PATH:" != *":$HOME/.claude/bin:"* ]]; then
echo "PATH not configured. Reinstalling..."
bash install.sh
fi
# 重新安装 # 或手动添加(幂等性命令)
bash install.sh [[ ":$PATH:" != *":$HOME/.claude/bin:"* ]] && echo 'export PATH="$HOME/.claude/bin:$PATH"' >> ~/.zshrc
``` ```
**权限被拒绝:** **权限被拒绝:**

View File

@@ -1,5 +1,11 @@
package main package main
import (
"encoding/json"
"os"
"path/filepath"
)
// Backend defines the contract for invoking different AI CLI backends. // Backend defines the contract for invoking different AI CLI backends.
// Each backend is responsible for supplying the executable command and // Each backend is responsible for supplying the executable command and
// building the argument list based on the wrapper config. // building the argument list based on the wrapper config.
@@ -26,15 +32,62 @@ func (ClaudeBackend) Command() string {
return "claude" return "claude"
} }
func (ClaudeBackend) BuildArgs(cfg *Config, targetArg string) []string { func (ClaudeBackend) BuildArgs(cfg *Config, targetArg string) []string {
return buildClaudeArgs(cfg, targetArg)
}
const maxClaudeSettingsBytes = 1 << 20 // 1MB
// loadMinimalEnvSettings 从 ~/.claude/settings.json 只提取 env 配置。
// 只接受字符串类型的值;文件缺失/解析失败/超限都返回空。
func loadMinimalEnvSettings() map[string]string {
home, err := os.UserHomeDir()
if err != nil || home == "" {
return nil
}
settingPath := filepath.Join(home, ".claude", "settings.json")
info, err := os.Stat(settingPath)
if err != nil || info.Size() > maxClaudeSettingsBytes {
return nil
}
data, err := os.ReadFile(settingPath)
if err != nil {
return nil
}
var cfg struct {
Env map[string]any `json:"env"`
}
if err := json.Unmarshal(data, &cfg); err != nil {
return nil
}
if len(cfg.Env) == 0 {
return nil
}
env := make(map[string]string, len(cfg.Env))
for k, v := range cfg.Env {
s, ok := v.(string)
if !ok {
continue
}
env[k] = s
}
if len(env) == 0 {
return nil
}
return env
}
func buildClaudeArgs(cfg *Config, targetArg string) []string {
if cfg == nil { if cfg == nil {
return nil return nil
} }
args := []string{"-p", "--dangerously-skip-permissions"} args := []string{"-p"}
if cfg.SkipPermissions {
// Only skip permissions when explicitly requested args = append(args, "--dangerously-skip-permissions")
// if cfg.SkipPermissions { }
// args = append(args, "--dangerously-skip-permissions")
// }
// Prevent infinite recursion: disable all setting sources (user, project, local) // Prevent infinite recursion: disable all setting sources (user, project, local)
// This ensures a clean execution environment without CLAUDE.md or skills that would trigger codeagent // This ensures a clean execution environment without CLAUDE.md or skills that would trigger codeagent
@@ -60,6 +113,10 @@ func (GeminiBackend) Command() string {
return "gemini" return "gemini"
} }
func (GeminiBackend) BuildArgs(cfg *Config, targetArg string) []string { func (GeminiBackend) BuildArgs(cfg *Config, targetArg string) []string {
return buildGeminiArgs(cfg, targetArg)
}
func buildGeminiArgs(cfg *Config, targetArg string) []string {
if cfg == nil { if cfg == nil {
return nil return nil
} }

View File

@@ -1,6 +1,9 @@
package main package main
import ( import (
"bytes"
"os"
"path/filepath"
"reflect" "reflect"
"testing" "testing"
) )
@@ -8,16 +11,16 @@ import (
func TestClaudeBuildArgs_ModesAndPermissions(t *testing.T) { func TestClaudeBuildArgs_ModesAndPermissions(t *testing.T) {
backend := ClaudeBackend{} backend := ClaudeBackend{}
t.Run("new mode uses workdir without skip by default", func(t *testing.T) { t.Run("new mode omits skip-permissions by default", func(t *testing.T) {
cfg := &Config{Mode: "new", WorkDir: "/repo"} cfg := &Config{Mode: "new", WorkDir: "/repo"}
got := backend.BuildArgs(cfg, "todo") got := backend.BuildArgs(cfg, "todo")
want := []string{"-p", "--dangerously-skip-permissions", "--setting-sources", "", "--output-format", "stream-json", "--verbose", "todo"} want := []string{"-p", "--setting-sources", "", "--output-format", "stream-json", "--verbose", "todo"}
if !reflect.DeepEqual(got, want) { if !reflect.DeepEqual(got, want) {
t.Fatalf("got %v, want %v", got, want) t.Fatalf("got %v, want %v", got, want)
} }
}) })
t.Run("new mode opt-in skip permissions with default workdir", func(t *testing.T) { t.Run("new mode can opt-in skip-permissions", func(t *testing.T) {
cfg := &Config{Mode: "new", SkipPermissions: true} cfg := &Config{Mode: "new", SkipPermissions: true}
got := backend.BuildArgs(cfg, "-") got := backend.BuildArgs(cfg, "-")
want := []string{"-p", "--dangerously-skip-permissions", "--setting-sources", "", "--output-format", "stream-json", "--verbose", "-"} want := []string{"-p", "--dangerously-skip-permissions", "--setting-sources", "", "--output-format", "stream-json", "--verbose", "-"}
@@ -26,10 +29,10 @@ func TestClaudeBuildArgs_ModesAndPermissions(t *testing.T) {
} }
}) })
t.Run("resume mode uses session id and omits workdir", func(t *testing.T) { t.Run("resume mode includes session id", func(t *testing.T) {
cfg := &Config{Mode: "resume", SessionID: "sid-123", WorkDir: "/ignored"} cfg := &Config{Mode: "resume", SessionID: "sid-123", WorkDir: "/ignored"}
got := backend.BuildArgs(cfg, "resume-task") got := backend.BuildArgs(cfg, "resume-task")
want := []string{"-p", "--dangerously-skip-permissions", "--setting-sources", "", "-r", "sid-123", "--output-format", "stream-json", "--verbose", "resume-task"} want := []string{"-p", "--setting-sources", "", "-r", "sid-123", "--output-format", "stream-json", "--verbose", "resume-task"}
if !reflect.DeepEqual(got, want) { if !reflect.DeepEqual(got, want) {
t.Fatalf("got %v, want %v", got, want) t.Fatalf("got %v, want %v", got, want)
} }
@@ -38,7 +41,16 @@ func TestClaudeBuildArgs_ModesAndPermissions(t *testing.T) {
t.Run("resume mode without session still returns base flags", func(t *testing.T) { t.Run("resume mode without session still returns base flags", func(t *testing.T) {
cfg := &Config{Mode: "resume", WorkDir: "/ignored"} cfg := &Config{Mode: "resume", WorkDir: "/ignored"}
got := backend.BuildArgs(cfg, "follow-up") got := backend.BuildArgs(cfg, "follow-up")
want := []string{"-p", "--dangerously-skip-permissions", "--setting-sources", "", "--output-format", "stream-json", "--verbose", "follow-up"} want := []string{"-p", "--setting-sources", "", "--output-format", "stream-json", "--verbose", "follow-up"}
if !reflect.DeepEqual(got, want) {
t.Fatalf("got %v, want %v", got, want)
}
})
t.Run("resume mode can opt-in skip permissions", func(t *testing.T) {
cfg := &Config{Mode: "resume", SessionID: "sid-123", SkipPermissions: true}
got := backend.BuildArgs(cfg, "resume-task")
want := []string{"-p", "--dangerously-skip-permissions", "--setting-sources", "", "-r", "sid-123", "--output-format", "stream-json", "--verbose", "resume-task"}
if !reflect.DeepEqual(got, want) { if !reflect.DeepEqual(got, want) {
t.Fatalf("got %v, want %v", got, want) t.Fatalf("got %v, want %v", got, want)
} }
@@ -89,7 +101,11 @@ func TestClaudeBuildArgs_GeminiAndCodexModes(t *testing.T) {
} }
}) })
t.Run("codex build args passthrough remains intact", func(t *testing.T) { t.Run("codex build args omits bypass flag by default", func(t *testing.T) {
const key = "CODEX_BYPASS_SANDBOX"
t.Cleanup(func() { os.Unsetenv(key) })
os.Unsetenv(key)
backend := CodexBackend{} backend := CodexBackend{}
cfg := &Config{Mode: "new", WorkDir: "/tmp"} cfg := &Config{Mode: "new", WorkDir: "/tmp"}
got := backend.BuildArgs(cfg, "task") got := backend.BuildArgs(cfg, "task")
@@ -98,6 +114,20 @@ func TestClaudeBuildArgs_GeminiAndCodexModes(t *testing.T) {
t.Fatalf("got %v, want %v", got, want) t.Fatalf("got %v, want %v", got, want)
} }
}) })
t.Run("codex build args includes bypass flag when enabled", func(t *testing.T) {
const key = "CODEX_BYPASS_SANDBOX"
t.Cleanup(func() { os.Unsetenv(key) })
os.Setenv(key, "true")
backend := CodexBackend{}
cfg := &Config{Mode: "new", WorkDir: "/tmp"}
got := backend.BuildArgs(cfg, "task")
want := []string{"e", "--dangerously-bypass-approvals-and-sandbox", "--skip-git-repo-check", "-C", "/tmp", "--json", "task"}
if !reflect.DeepEqual(got, want) {
t.Fatalf("got %v, want %v", got, want)
}
})
} }
func TestClaudeBuildArgs_BackendMetadata(t *testing.T) { func TestClaudeBuildArgs_BackendMetadata(t *testing.T) {
@@ -120,3 +150,64 @@ func TestClaudeBuildArgs_BackendMetadata(t *testing.T) {
} }
} }
} }
func TestLoadMinimalEnvSettings(t *testing.T) {
home := t.TempDir()
t.Setenv("HOME", home)
t.Setenv("USERPROFILE", home)
t.Run("missing file returns empty", func(t *testing.T) {
if got := loadMinimalEnvSettings(); len(got) != 0 {
t.Fatalf("got %v, want empty", got)
}
})
t.Run("valid env returns string map", func(t *testing.T) {
dir := filepath.Join(home, ".claude")
if err := os.MkdirAll(dir, 0o755); err != nil {
t.Fatalf("MkdirAll: %v", err)
}
path := filepath.Join(dir, "settings.json")
data := []byte(`{"env":{"ANTHROPIC_API_KEY":"secret","FOO":"bar"}}`)
if err := os.WriteFile(path, data, 0o600); err != nil {
t.Fatalf("WriteFile: %v", err)
}
got := loadMinimalEnvSettings()
if got["ANTHROPIC_API_KEY"] != "secret" || got["FOO"] != "bar" {
t.Fatalf("got %v, want keys present", got)
}
})
t.Run("non-string values are ignored", func(t *testing.T) {
dir := filepath.Join(home, ".claude")
path := filepath.Join(dir, "settings.json")
data := []byte(`{"env":{"GOOD":"ok","BAD":123,"ALSO_BAD":true}}`)
if err := os.WriteFile(path, data, 0o600); err != nil {
t.Fatalf("WriteFile: %v", err)
}
got := loadMinimalEnvSettings()
if got["GOOD"] != "ok" {
t.Fatalf("got %v, want GOOD=ok", got)
}
if _, ok := got["BAD"]; ok {
t.Fatalf("got %v, want BAD omitted", got)
}
if _, ok := got["ALSO_BAD"]; ok {
t.Fatalf("got %v, want ALSO_BAD omitted", got)
}
})
t.Run("oversized file returns empty", func(t *testing.T) {
dir := filepath.Join(home, ".claude")
path := filepath.Join(dir, "settings.json")
data := bytes.Repeat([]byte("a"), maxClaudeSettingsBytes+1)
if err := os.WriteFile(path, data, 0o600); err != nil {
t.Fatalf("WriteFile: %v", err)
}
if got := loadMinimalEnvSettings(); len(got) != 0 {
t.Fatalf("got %v, want empty", got)
}
})
}

View File

@@ -13,6 +13,16 @@ import (
"time" "time"
) )
func stripTimestampPrefix(line string) string {
if !strings.HasPrefix(line, "[") {
return line
}
if idx := strings.Index(line, "] "); idx >= 0 {
return line[idx+2:]
}
return line
}
// TestConcurrentStressLogger 高并发压力测试 // TestConcurrentStressLogger 高并发压力测试
func TestConcurrentStressLogger(t *testing.T) { func TestConcurrentStressLogger(t *testing.T) {
if testing.Short() { if testing.Short() {
@@ -79,7 +89,8 @@ func TestConcurrentStressLogger(t *testing.T) {
// 验证日志格式(纯文本,无前缀) // 验证日志格式(纯文本,无前缀)
formatRE := regexp.MustCompile(`^goroutine-\d+-msg-\d+$`) formatRE := regexp.MustCompile(`^goroutine-\d+-msg-\d+$`)
for i, line := range lines[:min(10, len(lines))] { for i, line := range lines[:min(10, len(lines))] {
if !formatRE.MatchString(line) { msg := stripTimestampPrefix(line)
if !formatRE.MatchString(msg) {
t.Errorf("line %d has invalid format: %s", i, line) t.Errorf("line %d has invalid format: %s", i, line)
} }
} }
@@ -291,7 +302,7 @@ func TestLoggerOrderPreservation(t *testing.T) {
sequences := make(map[int][]int) // goroutine ID -> sequence numbers sequences := make(map[int][]int) // goroutine ID -> sequence numbers
for scanner.Scan() { for scanner.Scan() {
line := scanner.Text() line := stripTimestampPrefix(scanner.Text())
var gid, seq int var gid, seq int
// Parse format: G0-SEQ0001 (without INFO: prefix) // Parse format: G0-SEQ0001 (without INFO: prefix)
_, err := fmt.Sscanf(line, "G%d-SEQ%04d", &gid, &seq) _, err := fmt.Sscanf(line, "G%d-SEQ%04d", &gid, &seq)

View File

@@ -49,7 +49,15 @@ type TaskResult struct {
SessionID string `json:"session_id"` SessionID string `json:"session_id"`
Error string `json:"error"` Error string `json:"error"`
LogPath string `json:"log_path"` LogPath string `json:"log_path"`
sharedLog bool // Structured report fields
Coverage string `json:"coverage,omitempty"` // extracted coverage percentage (e.g., "92%")
CoverageNum float64 `json:"coverage_num,omitempty"` // numeric coverage for comparison
CoverageTarget float64 `json:"coverage_target,omitempty"` // target coverage (default 90)
FilesChanged []string `json:"files_changed,omitempty"` // list of changed files
KeyOutput string `json:"key_output,omitempty"` // brief summary of what was done
TestsPassed int `json:"tests_passed,omitempty"` // number of tests passed
TestsFailed int `json:"tests_failed,omitempty"` // number of tests failed
sharedLog bool
} }
var backendRegistry = map[string]Backend{ var backendRegistry = map[string]Backend{
@@ -164,6 +172,9 @@ func parseParallelConfig(data []byte) (*ParallelConfig, error) {
if content == "" { if content == "" {
return nil, fmt.Errorf("task block #%d (%q) missing content", taskIndex, task.ID) return nil, fmt.Errorf("task block #%d (%q) missing content", taskIndex, task.ID)
} }
if task.Mode == "resume" && strings.TrimSpace(task.SessionID) == "" {
return nil, fmt.Errorf("task block #%d (%q) has empty session_id", taskIndex, task.ID)
}
if _, exists := seen[task.ID]; exists { if _, exists := seen[task.ID]; exists {
return nil, fmt.Errorf("task block #%d has duplicate id: %s", taskIndex, task.ID) return nil, fmt.Errorf("task block #%d has duplicate id: %s", taskIndex, task.ID)
} }
@@ -232,7 +243,10 @@ func parseArgs() (*Config, error) {
return nil, fmt.Errorf("resume mode requires: resume <session_id> <task>") return nil, fmt.Errorf("resume mode requires: resume <session_id> <task>")
} }
cfg.Mode = "resume" cfg.Mode = "resume"
cfg.SessionID = args[1] cfg.SessionID = strings.TrimSpace(args[1])
if cfg.SessionID == "" {
return nil, fmt.Errorf("resume mode requires non-empty session_id")
}
cfg.Task = args[2] cfg.Task = args[2]
cfg.ExplicitStdin = (args[2] == "-") cfg.ExplicitStdin = (args[2] == "-")
if len(args) > 3 { if len(args) > 3 {

View File

@@ -16,6 +16,8 @@ import (
"time" "time"
) )
const postMessageTerminateDelay = 1 * time.Second
// commandRunner abstracts exec.Cmd for testability // commandRunner abstracts exec.Cmd for testability
type commandRunner interface { type commandRunner interface {
Start() error Start() error
@@ -24,6 +26,7 @@ type commandRunner interface {
StdinPipe() (io.WriteCloser, error) StdinPipe() (io.WriteCloser, error)
SetStderr(io.Writer) SetStderr(io.Writer)
SetDir(string) SetDir(string)
SetEnv(env map[string]string)
Process() processHandle Process() processHandle
} }
@@ -79,6 +82,52 @@ func (r *realCmd) SetDir(dir string) {
} }
} }
func (r *realCmd) SetEnv(env map[string]string) {
if r == nil || r.cmd == nil || len(env) == 0 {
return
}
merged := make(map[string]string, len(env)+len(os.Environ()))
for _, kv := range os.Environ() {
if kv == "" {
continue
}
idx := strings.IndexByte(kv, '=')
if idx <= 0 {
continue
}
merged[kv[:idx]] = kv[idx+1:]
}
for _, kv := range r.cmd.Env {
if kv == "" {
continue
}
idx := strings.IndexByte(kv, '=')
if idx <= 0 {
continue
}
merged[kv[:idx]] = kv[idx+1:]
}
for k, v := range env {
if strings.TrimSpace(k) == "" {
continue
}
merged[k] = v
}
keys := make([]string, 0, len(merged))
for k := range merged {
keys = append(keys, k)
}
sort.Strings(keys)
out := make([]string, 0, len(keys))
for _, k := range keys {
out = append(out, k+"="+merged[k])
}
r.cmd.Env = out
}
func (r *realCmd) Process() processHandle { func (r *realCmd) Process() processHandle {
if r == nil || r.cmd == nil || r.cmd.Process == nil { if r == nil || r.cmd == nil || r.cmd.Process == nil {
return nil return nil
@@ -462,68 +511,255 @@ func shouldSkipTask(task TaskSpec, failed map[string]TaskResult) (bool, string)
return true, fmt.Sprintf("skipped due to failed dependencies: %s", strings.Join(blocked, ",")) return true, fmt.Sprintf("skipped due to failed dependencies: %s", strings.Join(blocked, ","))
} }
func generateFinalOutput(results []TaskResult) string { // getStatusSymbols returns status symbols based on ASCII mode.
var sb strings.Builder func getStatusSymbols() (success, warning, failed string) {
if os.Getenv("CODEAGENT_ASCII_MODE") == "true" {
return "PASS", "WARN", "FAIL"
}
return "✓", "⚠️", "✗"
}
func generateFinalOutput(results []TaskResult) string {
return generateFinalOutputWithMode(results, true) // default to summary mode
}
// generateFinalOutputWithMode generates output based on mode
// summaryOnly=true: structured report - every token has value
// summaryOnly=false: full output with complete messages (legacy behavior)
func generateFinalOutputWithMode(results []TaskResult, summaryOnly bool) string {
var sb strings.Builder
successSymbol, warningSymbol, failedSymbol := getStatusSymbols()
reportCoverageTarget := defaultCoverageTarget
for _, res := range results {
if res.CoverageTarget > 0 {
reportCoverageTarget = res.CoverageTarget
break
}
}
// Count results by status
success := 0 success := 0
failed := 0 failed := 0
belowTarget := 0
for _, res := range results { for _, res := range results {
if res.ExitCode == 0 && res.Error == "" { if res.ExitCode == 0 && res.Error == "" {
success++ success++
target := res.CoverageTarget
if target <= 0 {
target = reportCoverageTarget
}
if res.Coverage != "" && target > 0 && res.CoverageNum < target {
belowTarget++
}
} else { } else {
failed++ failed++
} }
} }
sb.WriteString(fmt.Sprintf("=== Parallel Execution Summary ===\n")) if summaryOnly {
sb.WriteString(fmt.Sprintf("Total: %d | Success: %d | Failed: %d\n\n", len(results), success, failed)) // Header
sb.WriteString("=== Execution Report ===\n")
sb.WriteString(fmt.Sprintf("%d tasks | %d passed | %d failed", len(results), success, failed))
if belowTarget > 0 {
sb.WriteString(fmt.Sprintf(" | %d below %.0f%%", belowTarget, reportCoverageTarget))
}
sb.WriteString("\n\n")
// Task Results - each task gets: Did + Files + Tests + Coverage
sb.WriteString("## Task Results\n")
for _, res := range results {
taskID := sanitizeOutput(res.TaskID)
coverage := sanitizeOutput(res.Coverage)
keyOutput := sanitizeOutput(res.KeyOutput)
logPath := sanitizeOutput(res.LogPath)
filesChanged := sanitizeOutput(strings.Join(res.FilesChanged, ", "))
target := res.CoverageTarget
if target <= 0 {
target = reportCoverageTarget
}
isSuccess := res.ExitCode == 0 && res.Error == ""
isBelowTarget := isSuccess && coverage != "" && target > 0 && res.CoverageNum < target
if isSuccess && !isBelowTarget {
// Passed task: one block with Did/Files/Tests
sb.WriteString(fmt.Sprintf("\n### %s %s", taskID, successSymbol))
if coverage != "" {
sb.WriteString(fmt.Sprintf(" %s", coverage))
}
sb.WriteString("\n")
if keyOutput != "" {
sb.WriteString(fmt.Sprintf("Did: %s\n", keyOutput))
}
if len(res.FilesChanged) > 0 {
sb.WriteString(fmt.Sprintf("Files: %s\n", filesChanged))
}
if res.TestsPassed > 0 {
sb.WriteString(fmt.Sprintf("Tests: %d passed\n", res.TestsPassed))
}
if logPath != "" {
sb.WriteString(fmt.Sprintf("Log: %s\n", logPath))
}
} else if isSuccess && isBelowTarget {
// Below target: add Gap info
sb.WriteString(fmt.Sprintf("\n### %s %s %s (below %.0f%%)\n", taskID, warningSymbol, coverage, target))
if keyOutput != "" {
sb.WriteString(fmt.Sprintf("Did: %s\n", keyOutput))
}
if len(res.FilesChanged) > 0 {
sb.WriteString(fmt.Sprintf("Files: %s\n", filesChanged))
}
if res.TestsPassed > 0 {
sb.WriteString(fmt.Sprintf("Tests: %d passed\n", res.TestsPassed))
}
// Extract what's missing from coverage
gap := sanitizeOutput(extractCoverageGap(res.Message))
if gap != "" {
sb.WriteString(fmt.Sprintf("Gap: %s\n", gap))
}
if logPath != "" {
sb.WriteString(fmt.Sprintf("Log: %s\n", logPath))
}
for _, res := range results {
sb.WriteString(fmt.Sprintf("--- Task: %s ---\n", res.TaskID))
if res.Error != "" {
sb.WriteString(fmt.Sprintf("Status: FAILED (exit code %d)\nError: %s\n", res.ExitCode, res.Error))
} else if res.ExitCode != 0 {
sb.WriteString(fmt.Sprintf("Status: FAILED (exit code %d)\n", res.ExitCode))
} else {
sb.WriteString("Status: SUCCESS\n")
}
if res.SessionID != "" {
sb.WriteString(fmt.Sprintf("Session: %s\n", res.SessionID))
}
if res.LogPath != "" {
if res.sharedLog {
sb.WriteString(fmt.Sprintf("Log: %s (shared)\n", res.LogPath))
} else { } else {
sb.WriteString(fmt.Sprintf("Log: %s\n", res.LogPath)) // Failed task: show error detail
sb.WriteString(fmt.Sprintf("\n### %s %s FAILED\n", taskID, failedSymbol))
sb.WriteString(fmt.Sprintf("Exit code: %d\n", res.ExitCode))
if errText := sanitizeOutput(res.Error); errText != "" {
sb.WriteString(fmt.Sprintf("Error: %s\n", errText))
}
// Show context from output (last meaningful lines)
detail := sanitizeOutput(extractErrorDetail(res.Message, 300))
if detail != "" {
sb.WriteString(fmt.Sprintf("Detail: %s\n", detail))
}
if logPath != "" {
sb.WriteString(fmt.Sprintf("Log: %s\n", logPath))
}
} }
} }
if res.Message != "" {
sb.WriteString(fmt.Sprintf("\n%s\n", res.Message)) // Summary section
sb.WriteString("\n## Summary\n")
sb.WriteString(fmt.Sprintf("- %d/%d completed successfully\n", success, len(results)))
if belowTarget > 0 || failed > 0 {
var needFix []string
var needCoverage []string
for _, res := range results {
if res.ExitCode != 0 || res.Error != "" {
taskID := sanitizeOutput(res.TaskID)
reason := sanitizeOutput(res.Error)
if reason == "" && res.ExitCode != 0 {
reason = fmt.Sprintf("exit code %d", res.ExitCode)
}
reason = safeTruncate(reason, 50)
needFix = append(needFix, fmt.Sprintf("%s (%s)", taskID, reason))
continue
}
target := res.CoverageTarget
if target <= 0 {
target = reportCoverageTarget
}
if res.Coverage != "" && target > 0 && res.CoverageNum < target {
needCoverage = append(needCoverage, sanitizeOutput(res.TaskID))
}
}
if len(needFix) > 0 {
sb.WriteString(fmt.Sprintf("- Fix: %s\n", strings.Join(needFix, ", ")))
}
if len(needCoverage) > 0 {
sb.WriteString(fmt.Sprintf("- Coverage: %s\n", strings.Join(needCoverage, ", ")))
}
}
} else {
// Legacy full output mode
sb.WriteString("=== Parallel Execution Summary ===\n")
sb.WriteString(fmt.Sprintf("Total: %d | Success: %d | Failed: %d\n\n", len(results), success, failed))
for _, res := range results {
taskID := sanitizeOutput(res.TaskID)
sb.WriteString(fmt.Sprintf("--- Task: %s ---\n", taskID))
if res.Error != "" {
sb.WriteString(fmt.Sprintf("Status: FAILED (exit code %d)\nError: %s\n", res.ExitCode, sanitizeOutput(res.Error)))
} else if res.ExitCode != 0 {
sb.WriteString(fmt.Sprintf("Status: FAILED (exit code %d)\n", res.ExitCode))
} else {
sb.WriteString("Status: SUCCESS\n")
}
if res.Coverage != "" {
sb.WriteString(fmt.Sprintf("Coverage: %s\n", sanitizeOutput(res.Coverage)))
}
if res.SessionID != "" {
sb.WriteString(fmt.Sprintf("Session: %s\n", sanitizeOutput(res.SessionID)))
}
if res.LogPath != "" {
logPath := sanitizeOutput(res.LogPath)
if res.sharedLog {
sb.WriteString(fmt.Sprintf("Log: %s (shared)\n", logPath))
} else {
sb.WriteString(fmt.Sprintf("Log: %s\n", logPath))
}
}
if res.Message != "" {
message := sanitizeOutput(res.Message)
if message != "" {
sb.WriteString(fmt.Sprintf("\n%s\n", message))
}
}
sb.WriteString("\n")
} }
sb.WriteString("\n")
} }
return sb.String() return sb.String()
} }
func buildCodexArgs(cfg *Config, targetArg string) []string { func buildCodexArgs(cfg *Config, targetArg string) []string {
if cfg.Mode == "resume" { if cfg == nil {
return []string{ panic("buildCodexArgs: nil config")
"e", }
"--skip-git-repo-check",
"--json", var resumeSessionID string
"resume", isResume := cfg.Mode == "resume"
cfg.SessionID, if isResume {
targetArg, resumeSessionID = strings.TrimSpace(cfg.SessionID)
if resumeSessionID == "" {
logError("invalid config: resume mode requires non-empty session_id")
isResume = false
} }
} }
return []string{
"e", args := []string{"e"}
"--skip-git-repo-check",
if envFlagEnabled("CODEX_BYPASS_SANDBOX") {
logWarn("CODEX_BYPASS_SANDBOX=true: running without approval/sandbox protection")
args = append(args, "--dangerously-bypass-approvals-and-sandbox")
}
args = append(args, "--skip-git-repo-check")
if isResume {
return append(args,
"--json",
"resume",
resumeSessionID,
targetArg,
)
}
return append(args,
"-C", cfg.WorkDir, "-C", cfg.WorkDir,
"--json", "--json",
targetArg, targetArg,
} )
} }
func runCodexTask(taskSpec TaskSpec, silent bool, timeoutSec int) TaskResult { func runCodexTask(taskSpec TaskSpec, silent bool, timeoutSec int) TaskResult {
@@ -574,6 +810,12 @@ func runCodexTaskWithContext(parentCtx context.Context, taskSpec TaskSpec, backe
cfg.WorkDir = defaultWorkdir cfg.WorkDir = defaultWorkdir
} }
if cfg.Mode == "resume" && strings.TrimSpace(cfg.SessionID) == "" {
result.ExitCode = 1
result.Error = "resume mode requires non-empty session_id"
return result
}
useStdin := taskSpec.UseStdin useStdin := taskSpec.UseStdin
targetArg := taskSpec.Task targetArg := taskSpec.Task
if useStdin { if useStdin {
@@ -673,6 +915,12 @@ func runCodexTaskWithContext(parentCtx context.Context, taskSpec TaskSpec, backe
cmd := newCommandRunner(ctx, commandName, codexArgs...) cmd := newCommandRunner(ctx, commandName, codexArgs...)
if cfg.Backend == "claude" {
if env := loadMinimalEnvSettings(); len(env) > 0 {
cmd.SetEnv(env)
}
}
// For backends that don't support -C flag (claude, gemini), set working directory via cmd.Dir // For backends that don't support -C flag (claude, gemini), set working directory via cmd.Dir
// Codex passes workdir via -C flag, so we skip setting Dir for it to avoid conflicts // Codex passes workdir via -C flag, so we skip setting Dir for it to avoid conflicts
if cfg.Mode != "resume" && commandName != "codex" && cfg.WorkDir != "" { if cfg.Mode != "resume" && commandName != "codex" && cfg.WorkDir != "" {
@@ -729,6 +977,7 @@ func runCodexTaskWithContext(parentCtx context.Context, taskSpec TaskSpec, backe
// Start parse goroutine BEFORE starting the command to avoid race condition // Start parse goroutine BEFORE starting the command to avoid race condition
// where fast-completing commands close stdout before parser starts reading // where fast-completing commands close stdout before parser starts reading
messageSeen := make(chan struct{}, 1) messageSeen := make(chan struct{}, 1)
completeSeen := make(chan struct{}, 1)
parseCh := make(chan parseResult, 1) parseCh := make(chan parseResult, 1)
go func() { go func() {
msg, tid := parseJSONStreamInternal(stdoutReader, logWarnFn, logInfoFn, func() { msg, tid := parseJSONStreamInternal(stdoutReader, logWarnFn, logInfoFn, func() {
@@ -736,7 +985,16 @@ func runCodexTaskWithContext(parentCtx context.Context, taskSpec TaskSpec, backe
case messageSeen <- struct{}{}: case messageSeen <- struct{}{}:
default: default:
} }
}, func() {
select {
case completeSeen <- struct{}{}:
default:
}
}) })
select {
case completeSeen <- struct{}{}:
default:
}
parseCh <- parseResult{message: msg, threadID: tid} parseCh <- parseResult{message: msg, threadID: tid}
}() }()
@@ -773,17 +1031,63 @@ func runCodexTaskWithContext(parentCtx context.Context, taskSpec TaskSpec, backe
waitCh := make(chan error, 1) waitCh := make(chan error, 1)
go func() { waitCh <- cmd.Wait() }() go func() { waitCh <- cmd.Wait() }()
var waitErr error var (
var forceKillTimer *forceKillTimer waitErr error
var ctxCancelled bool forceKillTimer *forceKillTimer
ctxCancelled bool
messageTimer *time.Timer
messageTimerCh <-chan time.Time
forcedAfterComplete bool
terminated bool
messageSeenObserved bool
completeSeenObserved bool
)
select { waitLoop:
case waitErr = <-waitCh: for {
case <-ctx.Done(): select {
ctxCancelled = true case waitErr = <-waitCh:
logErrorFn(cancelReason(commandName, ctx)) break waitLoop
forceKillTimer = terminateCommandFn(cmd) case <-ctx.Done():
waitErr = <-waitCh ctxCancelled = true
logErrorFn(cancelReason(commandName, ctx))
if !terminated {
if timer := terminateCommandFn(cmd); timer != nil {
forceKillTimer = timer
terminated = true
}
}
waitErr = <-waitCh
break waitLoop
case <-messageTimerCh:
forcedAfterComplete = true
messageTimerCh = nil
if !terminated {
logWarnFn(fmt.Sprintf("%s output parsed; terminating lingering backend", commandName))
if timer := terminateCommandFn(cmd); timer != nil {
forceKillTimer = timer
terminated = true
}
}
case <-completeSeen:
completeSeenObserved = true
if messageTimer != nil {
continue
}
messageTimer = time.NewTimer(postMessageTerminateDelay)
messageTimerCh = messageTimer.C
case <-messageSeen:
messageSeenObserved = true
}
}
if messageTimer != nil {
if !messageTimer.Stop() {
select {
case <-messageTimer.C:
default:
}
}
} }
if forceKillTimer != nil { if forceKillTimer != nil {
@@ -791,10 +1095,14 @@ func runCodexTaskWithContext(parentCtx context.Context, taskSpec TaskSpec, backe
} }
var parsed parseResult var parsed parseResult
if ctxCancelled { switch {
case ctxCancelled:
closeWithReason(stdout, stdoutCloseReasonCtx) closeWithReason(stdout, stdoutCloseReasonCtx)
parsed = <-parseCh parsed = <-parseCh
} else { case messageSeenObserved || completeSeenObserved:
closeWithReason(stdout, stdoutCloseReasonWait)
parsed = <-parseCh
default:
drainTimer := time.NewTimer(stdoutDrainTimeout) drainTimer := time.NewTimer(stdoutDrainTimeout)
defer drainTimer.Stop() defer drainTimer.Stop()
@@ -802,6 +1110,11 @@ func runCodexTaskWithContext(parentCtx context.Context, taskSpec TaskSpec, backe
case parsed = <-parseCh: case parsed = <-parseCh:
closeWithReason(stdout, stdoutCloseReasonWait) closeWithReason(stdout, stdoutCloseReasonWait)
case <-messageSeen: case <-messageSeen:
messageSeenObserved = true
closeWithReason(stdout, stdoutCloseReasonWait)
parsed = <-parseCh
case <-completeSeen:
completeSeenObserved = true
closeWithReason(stdout, stdoutCloseReasonWait) closeWithReason(stdout, stdoutCloseReasonWait)
parsed = <-parseCh parsed = <-parseCh
case <-drainTimer.C: case <-drainTimer.C:
@@ -822,17 +1135,21 @@ func runCodexTaskWithContext(parentCtx context.Context, taskSpec TaskSpec, backe
} }
if waitErr != nil { if waitErr != nil {
if exitErr, ok := waitErr.(*exec.ExitError); ok { if forcedAfterComplete && parsed.message != "" {
code := exitErr.ExitCode() logWarnFn(fmt.Sprintf("%s terminated after delivering output", commandName))
logErrorFn(fmt.Sprintf("%s exited with status %d", commandName, code)) } else {
result.ExitCode = code if exitErr, ok := waitErr.(*exec.ExitError); ok {
result.Error = attachStderr(fmt.Sprintf("%s exited with status %d", commandName, code)) code := exitErr.ExitCode()
logErrorFn(fmt.Sprintf("%s exited with status %d", commandName, code))
result.ExitCode = code
result.Error = attachStderr(fmt.Sprintf("%s exited with status %d", commandName, code))
return result
}
logErrorFn(commandName + " error: " + waitErr.Error())
result.ExitCode = 1
result.Error = attachStderr(commandName + " error: " + waitErr.Error())
return result return result
} }
logErrorFn(commandName + " error: " + waitErr.Error())
result.ExitCode = 1
result.Error = attachStderr(commandName + " error: " + waitErr.Error())
return result
} }
message := parsed.message message := parsed.message

View File

@@ -10,6 +10,7 @@ import (
"os" "os"
"os/exec" "os/exec"
"path/filepath" "path/filepath"
"slices"
"strings" "strings"
"sync" "sync"
"sync/atomic" "sync/atomic"
@@ -86,6 +87,7 @@ type execFakeRunner struct {
process processHandle process processHandle
stdin io.WriteCloser stdin io.WriteCloser
dir string dir string
env map[string]string
waitErr error waitErr error
waitDelay time.Duration waitDelay time.Duration
startErr error startErr error
@@ -128,6 +130,17 @@ func (f *execFakeRunner) StdinPipe() (io.WriteCloser, error) {
} }
func (f *execFakeRunner) SetStderr(io.Writer) {} func (f *execFakeRunner) SetStderr(io.Writer) {}
func (f *execFakeRunner) SetDir(dir string) { f.dir = dir } func (f *execFakeRunner) SetDir(dir string) { f.dir = dir }
func (f *execFakeRunner) SetEnv(env map[string]string) {
if len(env) == 0 {
return
}
if f.env == nil {
f.env = make(map[string]string, len(env))
}
for k, v := range env {
f.env[k] = v
}
}
func (f *execFakeRunner) Process() processHandle { func (f *execFakeRunner) Process() processHandle {
if f.process != nil { if f.process != nil {
return f.process return f.process
@@ -244,6 +257,10 @@ func TestExecutorHelperCoverage(t *testing.T) {
}) })
t.Run("generateFinalOutputAndArgs", func(t *testing.T) { t.Run("generateFinalOutputAndArgs", func(t *testing.T) {
const key = "CODEX_BYPASS_SANDBOX"
t.Cleanup(func() { os.Unsetenv(key) })
os.Unsetenv(key)
out := generateFinalOutput([]TaskResult{ out := generateFinalOutput([]TaskResult{
{TaskID: "ok", ExitCode: 0}, {TaskID: "ok", ExitCode: 0},
{TaskID: "fail", ExitCode: 1, Error: "boom"}, {TaskID: "fail", ExitCode: 1, Error: "boom"},
@@ -251,21 +268,66 @@ func TestExecutorHelperCoverage(t *testing.T) {
if !strings.Contains(out, "ok") || !strings.Contains(out, "fail") { if !strings.Contains(out, "ok") || !strings.Contains(out, "fail") {
t.Fatalf("unexpected summary output: %s", out) t.Fatalf("unexpected summary output: %s", out)
} }
// Test summary mode (default) - should have new format with ### headers
out = generateFinalOutput([]TaskResult{{TaskID: "rich", ExitCode: 0, SessionID: "sess", LogPath: "/tmp/log", Message: "hello"}}) out = generateFinalOutput([]TaskResult{{TaskID: "rich", ExitCode: 0, SessionID: "sess", LogPath: "/tmp/log", Message: "hello"}})
if !strings.Contains(out, "### rich") {
t.Fatalf("summary output missing task header: %s", out)
}
// Test full output mode - should have Session and Message
out = generateFinalOutputWithMode([]TaskResult{{TaskID: "rich", ExitCode: 0, SessionID: "sess", LogPath: "/tmp/log", Message: "hello"}}, false)
if !strings.Contains(out, "Session: sess") || !strings.Contains(out, "Log: /tmp/log") || !strings.Contains(out, "hello") { if !strings.Contains(out, "Session: sess") || !strings.Contains(out, "Log: /tmp/log") || !strings.Contains(out, "hello") {
t.Fatalf("rich output missing fields: %s", out) t.Fatalf("full output missing fields: %s", out)
} }
args := buildCodexArgs(&Config{Mode: "new", WorkDir: "/tmp"}, "task") args := buildCodexArgs(&Config{Mode: "new", WorkDir: "/tmp"}, "task")
if len(args) == 0 || args[3] != "/tmp" { if !slices.Equal(args, []string{"e", "--skip-git-repo-check", "-C", "/tmp", "--json", "task"}) {
t.Fatalf("unexpected codex args: %+v", args) t.Fatalf("unexpected codex args: %+v", args)
} }
args = buildCodexArgs(&Config{Mode: "resume", SessionID: "sess"}, "target") args = buildCodexArgs(&Config{Mode: "resume", SessionID: "sess"}, "target")
if args[3] != "resume" || args[4] != "sess" { if !slices.Equal(args, []string{"e", "--skip-git-repo-check", "--json", "resume", "sess", "target"}) {
t.Fatalf("unexpected resume args: %+v", args) t.Fatalf("unexpected resume args: %+v", args)
} }
}) })
t.Run("generateFinalOutputASCIIMode", func(t *testing.T) {
t.Setenv("CODEAGENT_ASCII_MODE", "true")
results := []TaskResult{
{TaskID: "ok", ExitCode: 0, Coverage: "92%", CoverageNum: 92, CoverageTarget: 90, KeyOutput: "done"},
{TaskID: "warn", ExitCode: 0, Coverage: "80%", CoverageNum: 80, CoverageTarget: 90, KeyOutput: "did"},
{TaskID: "bad", ExitCode: 2, Error: "boom"},
}
out := generateFinalOutput(results)
for _, sym := range []string{"PASS", "WARN", "FAIL"} {
if !strings.Contains(out, sym) {
t.Fatalf("ASCII mode should include %q, got: %s", sym, out)
}
}
for _, sym := range []string{"✓", "⚠️", "✗"} {
if strings.Contains(out, sym) {
t.Fatalf("ASCII mode should not include %q, got: %s", sym, out)
}
}
})
t.Run("generateFinalOutputUnicodeMode", func(t *testing.T) {
t.Setenv("CODEAGENT_ASCII_MODE", "false")
results := []TaskResult{
{TaskID: "ok", ExitCode: 0, Coverage: "92%", CoverageNum: 92, CoverageTarget: 90, KeyOutput: "done"},
{TaskID: "warn", ExitCode: 0, Coverage: "80%", CoverageNum: 80, CoverageTarget: 90, KeyOutput: "did"},
{TaskID: "bad", ExitCode: 2, Error: "boom"},
}
out := generateFinalOutput(results)
for _, sym := range []string{"✓", "⚠️", "✗"} {
if !strings.Contains(out, sym) {
t.Fatalf("Unicode mode should include %q, got: %s", sym, out)
}
}
})
t.Run("executeConcurrentWrapper", func(t *testing.T) { t.Run("executeConcurrentWrapper", func(t *testing.T) {
orig := runCodexTaskFn orig := runCodexTaskFn
defer func() { runCodexTaskFn = orig }() defer func() { runCodexTaskFn = orig }()
@@ -298,6 +360,18 @@ func TestExecutorRunCodexTaskWithContext(t *testing.T) {
origRunner := newCommandRunner origRunner := newCommandRunner
defer func() { newCommandRunner = origRunner }() defer func() { newCommandRunner = origRunner }()
t.Run("resumeMissingSessionID", func(t *testing.T) {
newCommandRunner = func(ctx context.Context, name string, args ...string) commandRunner {
t.Fatalf("unexpected command execution for invalid resume config")
return nil
}
res := runCodexTaskWithContext(context.Background(), TaskSpec{Task: "payload", WorkDir: ".", Mode: "resume"}, nil, nil, false, false, 1)
if res.ExitCode == 0 || !strings.Contains(res.Error, "session_id") {
t.Fatalf("expected validation error, got %+v", res)
}
})
t.Run("success", func(t *testing.T) { t.Run("success", func(t *testing.T) {
var firstStdout *reasonReadCloser var firstStdout *reasonReadCloser
newCommandRunner = func(ctx context.Context, name string, args ...string) commandRunner { newCommandRunner = func(ctx context.Context, name string, args ...string) commandRunner {
@@ -1082,9 +1156,10 @@ func TestExecutorExecuteConcurrentWithContextBranches(t *testing.T) {
} }
} }
summary := generateFinalOutput(results) // Test full output mode for shared marker (summary mode doesn't show it)
summary := generateFinalOutputWithMode(results, false)
if !strings.Contains(summary, "(shared)") { if !strings.Contains(summary, "(shared)") {
t.Fatalf("summary missing shared marker: %s", summary) t.Fatalf("full output missing shared marker: %s", summary)
} }
mainLogger.Flush() mainLogger.Flush()

View File

@@ -366,7 +366,8 @@ func (l *Logger) run() {
defer ticker.Stop() defer ticker.Stop()
writeEntry := func(entry logEntry) { writeEntry := func(entry logEntry) {
fmt.Fprintf(l.writer, "%s\n", entry.msg) timestamp := time.Now().Format("2006-01-02 15:04:05.000")
fmt.Fprintf(l.writer, "[%s] %s\n", timestamp, entry.msg)
// Cache error/warn entries in memory for fast extraction // Cache error/warn entries in memory for fast extraction
if entry.isError { if entry.isError {

View File

@@ -14,14 +14,15 @@ import (
) )
const ( const (
version = "5.2.5" version = "5.4.0"
defaultWorkdir = "." defaultWorkdir = "."
defaultTimeout = 7200 // seconds defaultTimeout = 7200 // seconds (2 hours)
codexLogLineLimit = 1000 defaultCoverageTarget = 90.0
stdinSpecialChars = "\n\\\"'`$" codexLogLineLimit = 1000
stderrCaptureLimit = 4 * 1024 stdinSpecialChars = "\n\\\"'`$"
defaultBackendName = "codex" stderrCaptureLimit = 4 * 1024
defaultCodexCommand = "codex" defaultBackendName = "codex"
defaultCodexCommand = "codex"
// stdout close reasons // stdout close reasons
stdoutCloseReasonWait = "wait-done" stdoutCloseReasonWait = "wait-done"
@@ -30,6 +31,8 @@ const (
stdoutDrainTimeout = 100 * time.Millisecond stdoutDrainTimeout = 100 * time.Millisecond
) )
var useASCIIMode = os.Getenv("CODEAGENT_ASCII_MODE") == "true"
// Test hooks for dependency injection // Test hooks for dependency injection
var ( var (
stdinReader io.Reader = os.Stdin stdinReader io.Reader = os.Stdin
@@ -175,6 +178,7 @@ func run() (exitCode int) {
if parallelIndex != -1 { if parallelIndex != -1 {
backendName := defaultBackendName backendName := defaultBackendName
fullOutput := false
var extras []string var extras []string
for i := 0; i < len(args); i++ { for i := 0; i < len(args); i++ {
@@ -182,6 +186,8 @@ func run() (exitCode int) {
switch { switch {
case arg == "--parallel": case arg == "--parallel":
continue continue
case arg == "--full-output":
fullOutput = true
case arg == "--backend": case arg == "--backend":
if i+1 >= len(args) { if i+1 >= len(args) {
fmt.Fprintln(os.Stderr, "ERROR: --backend flag requires a value") fmt.Fprintln(os.Stderr, "ERROR: --backend flag requires a value")
@@ -202,11 +208,12 @@ func run() (exitCode int) {
} }
if len(extras) > 0 { if len(extras) > 0 {
fmt.Fprintln(os.Stderr, "ERROR: --parallel reads its task configuration from stdin; only --backend is allowed.") fmt.Fprintln(os.Stderr, "ERROR: --parallel reads its task configuration from stdin; only --backend and --full-output are allowed.")
fmt.Fprintln(os.Stderr, "Usage examples:") fmt.Fprintln(os.Stderr, "Usage examples:")
fmt.Fprintf(os.Stderr, " %s --parallel < tasks.txt\n", name) fmt.Fprintf(os.Stderr, " %s --parallel < tasks.txt\n", name)
fmt.Fprintf(os.Stderr, " echo '...' | %s --parallel\n", name) fmt.Fprintf(os.Stderr, " echo '...' | %s --parallel\n", name)
fmt.Fprintf(os.Stderr, " %s --parallel <<'EOF'\n", name) fmt.Fprintf(os.Stderr, " %s --parallel <<'EOF'\n", name)
fmt.Fprintf(os.Stderr, " %s --parallel --full-output <<'EOF' # include full task output\n", name)
return 1 return 1
} }
@@ -244,7 +251,33 @@ func run() (exitCode int) {
} }
results := executeConcurrent(layers, timeoutSec) results := executeConcurrent(layers, timeoutSec)
fmt.Println(generateFinalOutput(results))
// Extract structured report fields from each result
for i := range results {
results[i].CoverageTarget = defaultCoverageTarget
if results[i].Message == "" {
continue
}
lines := strings.Split(results[i].Message, "\n")
// Coverage extraction
results[i].Coverage = extractCoverageFromLines(lines)
results[i].CoverageNum = extractCoverageNum(results[i].Coverage)
// Files changed
results[i].FilesChanged = extractFilesChangedFromLines(lines)
// Test results
results[i].TestsPassed, results[i].TestsFailed = extractTestResultsFromLines(lines)
// Key output summary
results[i].KeyOutput = extractKeyOutputFromLines(lines, 150)
}
// Default: summary mode (context-efficient)
// --full-output: legacy full output mode
fmt.Println(generateFinalOutputWithMode(results, !fullOutput))
exitCode = 0 exitCode = 0
for _, res := range results { for _, res := range results {
@@ -447,16 +480,19 @@ Usage:
%[1]s resume <session_id> "task" [workdir] %[1]s resume <session_id> "task" [workdir]
%[1]s resume <session_id> - [workdir] %[1]s resume <session_id> - [workdir]
%[1]s --parallel Run tasks in parallel (config from stdin) %[1]s --parallel Run tasks in parallel (config from stdin)
%[1]s --parallel --full-output Run tasks in parallel with full output (legacy)
%[1]s --version %[1]s --version
%[1]s --help %[1]s --help
Parallel mode examples: Parallel mode examples:
%[1]s --parallel < tasks.txt %[1]s --parallel < tasks.txt
echo '...' | %[1]s --parallel echo '...' | %[1]s --parallel
%[1]s --parallel --full-output < tasks.txt
%[1]s --parallel <<'EOF' %[1]s --parallel <<'EOF'
Environment Variables: Environment Variables:
CODEX_TIMEOUT Timeout in milliseconds (default: 7200000) CODEX_TIMEOUT Timeout in milliseconds (default: 7200000)
CODEAGENT_ASCII_MODE Use ASCII symbols instead of Unicode (PASS/WARN/FAIL)
Exit Codes: Exit Codes:
0 Success 0 Success

View File

@@ -46,10 +46,26 @@ func parseIntegrationOutput(t *testing.T, out string) integrationOutput {
lines := strings.Split(out, "\n") lines := strings.Split(out, "\n")
var currentTask *TaskResult var currentTask *TaskResult
inTaskResults := false
for _, line := range lines { for _, line := range lines {
line = strings.TrimSpace(line) line = strings.TrimSpace(line)
if strings.HasPrefix(line, "Total:") {
// Parse new format header: "X tasks | Y passed | Z failed"
if strings.Contains(line, "tasks |") && strings.Contains(line, "passed |") {
parts := strings.Split(line, "|")
for _, p := range parts {
p = strings.TrimSpace(p)
if strings.HasSuffix(p, "tasks") {
fmt.Sscanf(p, "%d tasks", &payload.Summary.Total)
} else if strings.HasSuffix(p, "passed") {
fmt.Sscanf(p, "%d passed", &payload.Summary.Success)
} else if strings.HasSuffix(p, "failed") {
fmt.Sscanf(p, "%d failed", &payload.Summary.Failed)
}
}
} else if strings.HasPrefix(line, "Total:") {
// Legacy format: "Total: X | Success: Y | Failed: Z"
parts := strings.Split(line, "|") parts := strings.Split(line, "|")
for _, p := range parts { for _, p := range parts {
p = strings.TrimSpace(p) p = strings.TrimSpace(p)
@@ -61,13 +77,72 @@ func parseIntegrationOutput(t *testing.T, out string) integrationOutput {
fmt.Sscanf(p, "Failed: %d", &payload.Summary.Failed) fmt.Sscanf(p, "Failed: %d", &payload.Summary.Failed)
} }
} }
} else if line == "## Task Results" {
inTaskResults = true
} else if line == "## Summary" {
// End of task results section
if currentTask != nil {
payload.Results = append(payload.Results, *currentTask)
currentTask = nil
}
inTaskResults = false
} else if inTaskResults && strings.HasPrefix(line, "### ") {
// New task: ### task-id ✓ 92% or ### task-id PASS 92% (ASCII mode)
if currentTask != nil {
payload.Results = append(payload.Results, *currentTask)
}
currentTask = &TaskResult{}
taskLine := strings.TrimPrefix(line, "### ")
success, warning, failed := getStatusSymbols()
// Parse different formats
if strings.Contains(taskLine, " "+success) {
parts := strings.Split(taskLine, " "+success)
currentTask.TaskID = strings.TrimSpace(parts[0])
currentTask.ExitCode = 0
// Extract coverage if present
if len(parts) > 1 {
coveragePart := strings.TrimSpace(parts[1])
if strings.HasSuffix(coveragePart, "%") {
currentTask.Coverage = coveragePart
}
}
} else if strings.Contains(taskLine, " "+warning) {
parts := strings.Split(taskLine, " "+warning)
currentTask.TaskID = strings.TrimSpace(parts[0])
currentTask.ExitCode = 0
} else if strings.Contains(taskLine, " "+failed) {
parts := strings.Split(taskLine, " "+failed)
currentTask.TaskID = strings.TrimSpace(parts[0])
currentTask.ExitCode = 1
} else {
currentTask.TaskID = taskLine
}
} else if currentTask != nil && inTaskResults {
// Parse task details
if strings.HasPrefix(line, "Exit code:") {
fmt.Sscanf(line, "Exit code: %d", &currentTask.ExitCode)
} else if strings.HasPrefix(line, "Error:") {
currentTask.Error = strings.TrimPrefix(line, "Error: ")
} else if strings.HasPrefix(line, "Log:") {
currentTask.LogPath = strings.TrimSpace(strings.TrimPrefix(line, "Log:"))
} else if strings.HasPrefix(line, "Did:") {
currentTask.KeyOutput = strings.TrimSpace(strings.TrimPrefix(line, "Did:"))
} else if strings.HasPrefix(line, "Detail:") {
// Error detail for failed tasks
if currentTask.Message == "" {
currentTask.Message = strings.TrimSpace(strings.TrimPrefix(line, "Detail:"))
}
}
} else if strings.HasPrefix(line, "--- Task:") { } else if strings.HasPrefix(line, "--- Task:") {
// Legacy full output format
if currentTask != nil { if currentTask != nil {
payload.Results = append(payload.Results, *currentTask) payload.Results = append(payload.Results, *currentTask)
} }
currentTask = &TaskResult{} currentTask = &TaskResult{}
currentTask.TaskID = strings.TrimSuffix(strings.TrimPrefix(line, "--- Task: "), " ---") currentTask.TaskID = strings.TrimSuffix(strings.TrimPrefix(line, "--- Task: "), " ---")
} else if currentTask != nil { } else if currentTask != nil && !inTaskResults {
// Legacy format parsing
if strings.HasPrefix(line, "Status: SUCCESS") { if strings.HasPrefix(line, "Status: SUCCESS") {
currentTask.ExitCode = 0 currentTask.ExitCode = 0
} else if strings.HasPrefix(line, "Status: FAILED") { } else if strings.HasPrefix(line, "Status: FAILED") {
@@ -82,15 +157,11 @@ func parseIntegrationOutput(t *testing.T, out string) integrationOutput {
currentTask.SessionID = strings.TrimPrefix(line, "Session: ") currentTask.SessionID = strings.TrimPrefix(line, "Session: ")
} else if strings.HasPrefix(line, "Log:") { } else if strings.HasPrefix(line, "Log:") {
currentTask.LogPath = strings.TrimSpace(strings.TrimPrefix(line, "Log:")) currentTask.LogPath = strings.TrimSpace(strings.TrimPrefix(line, "Log:"))
} else if line != "" && !strings.HasPrefix(line, "===") && !strings.HasPrefix(line, "---") {
if currentTask.Message != "" {
currentTask.Message += "\n"
}
currentTask.Message += line
} }
} }
} }
// Handle last task
if currentTask != nil { if currentTask != nil {
payload.Results = append(payload.Results, *currentTask) payload.Results = append(payload.Results, *currentTask)
} }
@@ -343,9 +414,10 @@ task-beta`
} }
for _, id := range []string{"alpha", "beta"} { for _, id := range []string{"alpha", "beta"} {
want := fmt.Sprintf("Log: %s", logPathFor(id)) // Summary mode shows log paths in table format, not "Log: xxx"
if !strings.Contains(output, want) { logPath := logPathFor(id)
t.Fatalf("parallel output missing %q for %s:\n%s", want, id, output) if !strings.Contains(output, logPath) {
t.Fatalf("parallel output missing log path %q for %s:\n%s", logPath, id, output)
} }
} }
} }
@@ -550,16 +622,16 @@ ok-e`
if resD.LogPath != logPathFor("D") || resE.LogPath != logPathFor("E") { if resD.LogPath != logPathFor("D") || resE.LogPath != logPathFor("E") {
t.Fatalf("expected log paths for D/E, got D=%q E=%q", resD.LogPath, resE.LogPath) t.Fatalf("expected log paths for D/E, got D=%q E=%q", resD.LogPath, resE.LogPath)
} }
// Summary mode shows log paths in table, verify they appear in output
for _, id := range []string{"A", "D", "E"} { for _, id := range []string{"A", "D", "E"} {
block := extractTaskBlock(t, output, id) logPath := logPathFor(id)
want := fmt.Sprintf("Log: %s", logPathFor(id)) if !strings.Contains(output, logPath) {
if !strings.Contains(block, want) { t.Fatalf("task %s log path %q not found in output:\n%s", id, logPath, output)
t.Fatalf("task %s block missing %q:\n%s", id, want, block)
} }
} }
blockB := extractTaskBlock(t, output, "B") // Task B was skipped, should have "-" or empty log path in table
if strings.Contains(blockB, "Log:") { if resB.LogPath != "" {
t.Fatalf("skipped task B should not emit a log line:\n%s", blockB) t.Fatalf("skipped task B should have empty log path, got %q", resB.LogPath)
} }
} }

View File

@@ -255,6 +255,10 @@ func (d *drainBlockingCmd) SetDir(dir string) {
d.inner.SetDir(dir) d.inner.SetDir(dir)
} }
func (d *drainBlockingCmd) SetEnv(env map[string]string) {
d.inner.SetEnv(env)
}
func (d *drainBlockingCmd) Process() processHandle { func (d *drainBlockingCmd) Process() processHandle {
return d.inner.Process() return d.inner.Process()
} }
@@ -387,6 +391,8 @@ type fakeCmd struct {
stderr io.Writer stderr io.Writer
env map[string]string
waitDelay time.Duration waitDelay time.Duration
waitErr error waitErr error
startErr error startErr error
@@ -511,6 +517,20 @@ func (f *fakeCmd) SetStderr(w io.Writer) {
func (f *fakeCmd) SetDir(string) {} func (f *fakeCmd) SetDir(string) {}
func (f *fakeCmd) SetEnv(env map[string]string) {
if len(env) == 0 {
return
}
f.mu.Lock()
defer f.mu.Unlock()
if f.env == nil {
f.env = make(map[string]string, len(env))
}
for k, v := range env {
f.env[k] = v
}
}
func (f *fakeCmd) Process() processHandle { func (f *fakeCmd) Process() processHandle {
if f == nil { if f == nil {
return nil return nil
@@ -879,6 +899,79 @@ func TestRunCodexTask_ContextTimeout(t *testing.T) {
} }
} }
func TestRunCodexTask_ForcesStopAfterCompletion(t *testing.T) {
defer resetTestHooks()
forceKillDelay.Store(0)
fake := newFakeCmd(fakeCmdConfig{
StdoutPlan: []fakeStdoutEvent{
{Data: `{"type":"item.completed","item":{"type":"agent_message","text":"done"}}` + "\n"},
{Data: `{"type":"thread.completed","thread_id":"tid"}` + "\n"},
},
KeepStdoutOpen: true,
BlockWait: true,
ReleaseWaitOnSignal: true,
ReleaseWaitOnKill: true,
})
newCommandRunner = func(ctx context.Context, name string, args ...string) commandRunner {
return fake
}
buildCodexArgsFn = func(cfg *Config, targetArg string) []string { return []string{targetArg} }
codexCommand = "fake-cmd"
start := time.Now()
result := runCodexTaskWithContext(context.Background(), TaskSpec{Task: "done", WorkDir: defaultWorkdir}, nil, nil, false, false, 60)
duration := time.Since(start)
if result.ExitCode != 0 || result.Message != "done" {
t.Fatalf("unexpected result: %+v", result)
}
if duration > 2*time.Second {
t.Fatalf("runCodexTaskWithContext took too long: %v", duration)
}
if fake.process.SignalCount() == 0 {
t.Fatalf("expected SIGTERM to be sent, got %d", fake.process.SignalCount())
}
}
func TestRunCodexTask_DoesNotTerminateBeforeThreadCompleted(t *testing.T) {
defer resetTestHooks()
forceKillDelay.Store(0)
fake := newFakeCmd(fakeCmdConfig{
StdoutPlan: []fakeStdoutEvent{
{Data: `{"type":"item.completed","item":{"type":"agent_message","text":"intermediate"}}` + "\n"},
{Delay: 1100 * time.Millisecond, Data: `{"type":"item.completed","item":{"type":"agent_message","text":"final"}}` + "\n"},
{Data: `{"type":"thread.completed","thread_id":"tid"}` + "\n"},
},
KeepStdoutOpen: true,
BlockWait: true,
ReleaseWaitOnSignal: true,
ReleaseWaitOnKill: true,
})
newCommandRunner = func(ctx context.Context, name string, args ...string) commandRunner {
return fake
}
buildCodexArgsFn = func(cfg *Config, targetArg string) []string { return []string{targetArg} }
codexCommand = "fake-cmd"
start := time.Now()
result := runCodexTaskWithContext(context.Background(), TaskSpec{Task: "done", WorkDir: defaultWorkdir}, nil, nil, false, false, 60)
duration := time.Since(start)
if result.ExitCode != 0 || result.Message != "final" {
t.Fatalf("unexpected result: %+v", result)
}
if duration > 5*time.Second {
t.Fatalf("runCodexTaskWithContext took too long: %v", duration)
}
if fake.process.SignalCount() == 0 {
t.Fatalf("expected SIGTERM to be sent, got %d", fake.process.SignalCount())
}
}
func TestBackendParseArgs_NewMode(t *testing.T) { func TestBackendParseArgs_NewMode(t *testing.T) {
tests := []struct { tests := []struct {
name string name string
@@ -965,6 +1058,8 @@ func TestBackendParseArgs_ResumeMode(t *testing.T) {
}, },
{name: "resume missing session_id", args: []string{"codeagent-wrapper", "resume"}, wantErr: true}, {name: "resume missing session_id", args: []string{"codeagent-wrapper", "resume"}, wantErr: true},
{name: "resume missing task", args: []string{"codeagent-wrapper", "resume", "session-123"}, wantErr: true}, {name: "resume missing task", args: []string{"codeagent-wrapper", "resume", "session-123"}, wantErr: true},
{name: "resume empty session_id", args: []string{"codeagent-wrapper", "resume", "", "task"}, wantErr: true},
{name: "resume whitespace session_id", args: []string{"codeagent-wrapper", "resume", " ", "task"}, wantErr: true},
} }
for _, tt := range tests { for _, tt := range tests {
@@ -1181,6 +1276,18 @@ do something`
} }
} }
func TestParallelParseConfig_EmptySessionID(t *testing.T) {
input := `---TASK---
id: task-1
session_id:
---CONTENT---
do something`
if _, err := parseParallelConfig([]byte(input)); err == nil {
t.Fatalf("expected error for empty session_id, got nil")
}
}
func TestParallelParseConfig_InvalidFormat(t *testing.T) { func TestParallelParseConfig_InvalidFormat(t *testing.T) {
if _, err := parseParallelConfig([]byte("invalid format")); err == nil { if _, err := parseParallelConfig([]byte("invalid format")); err == nil {
t.Fatalf("expected error for invalid format, got nil") t.Fatalf("expected error for invalid format, got nil")
@@ -1281,9 +1388,19 @@ func TestRunShouldUseStdin(t *testing.T) {
} }
func TestRunBuildCodexArgs_NewMode(t *testing.T) { func TestRunBuildCodexArgs_NewMode(t *testing.T) {
const key = "CODEX_BYPASS_SANDBOX"
t.Cleanup(func() { os.Unsetenv(key) })
os.Unsetenv(key)
cfg := &Config{Mode: "new", WorkDir: "/test/dir"} cfg := &Config{Mode: "new", WorkDir: "/test/dir"}
args := buildCodexArgs(cfg, "my task") args := buildCodexArgs(cfg, "my task")
expected := []string{"e", "--skip-git-repo-check", "-C", "/test/dir", "--json", "my task"} expected := []string{
"e",
"--skip-git-repo-check",
"-C", "/test/dir",
"--json",
"my task",
}
if len(args) != len(expected) { if len(args) != len(expected) {
t.Fatalf("len mismatch") t.Fatalf("len mismatch")
} }
@@ -1295,9 +1412,20 @@ func TestRunBuildCodexArgs_NewMode(t *testing.T) {
} }
func TestRunBuildCodexArgs_ResumeMode(t *testing.T) { func TestRunBuildCodexArgs_ResumeMode(t *testing.T) {
const key = "CODEX_BYPASS_SANDBOX"
t.Cleanup(func() { os.Unsetenv(key) })
os.Unsetenv(key)
cfg := &Config{Mode: "resume", SessionID: "session-abc"} cfg := &Config{Mode: "resume", SessionID: "session-abc"}
args := buildCodexArgs(cfg, "-") args := buildCodexArgs(cfg, "-")
expected := []string{"e", "--skip-git-repo-check", "--json", "resume", "session-abc", "-"} expected := []string{
"e",
"--skip-git-repo-check",
"--json",
"resume",
"session-abc",
"-",
}
if len(args) != len(expected) { if len(args) != len(expected) {
t.Fatalf("len mismatch") t.Fatalf("len mismatch")
} }
@@ -1308,6 +1436,61 @@ func TestRunBuildCodexArgs_ResumeMode(t *testing.T) {
} }
} }
func TestRunBuildCodexArgs_ResumeMode_EmptySessionHandledGracefully(t *testing.T) {
const key = "CODEX_BYPASS_SANDBOX"
t.Cleanup(func() { os.Unsetenv(key) })
os.Unsetenv(key)
cfg := &Config{Mode: "resume", SessionID: " ", WorkDir: "/test/dir"}
args := buildCodexArgs(cfg, "task")
expected := []string{"e", "--skip-git-repo-check", "-C", "/test/dir", "--json", "task"}
if len(args) != len(expected) {
t.Fatalf("len mismatch")
}
for i := range args {
if args[i] != expected[i] {
t.Fatalf("args[%d]=%s, want %s", i, args[i], expected[i])
}
}
}
func TestRunBuildCodexArgs_BypassSandboxEnvTrue(t *testing.T) {
defer resetTestHooks()
tempDir := t.TempDir()
t.Setenv("TMPDIR", tempDir)
logger, err := NewLogger()
if err != nil {
t.Fatalf("NewLogger() error = %v", err)
}
setLogger(logger)
defer closeLogger()
t.Setenv("CODEX_BYPASS_SANDBOX", "true")
cfg := &Config{Mode: "new", WorkDir: "/test/dir"}
args := buildCodexArgs(cfg, "my task")
found := false
for _, arg := range args {
if arg == "--dangerously-bypass-approvals-and-sandbox" {
found = true
break
}
}
if !found {
t.Fatalf("expected bypass flag in args, got %v", args)
}
logger.Flush()
data, err := os.ReadFile(logger.Path())
if err != nil {
t.Fatalf("failed to read log file: %v", err)
}
if !strings.Contains(string(data), "CODEX_BYPASS_SANDBOX=true") {
t.Fatalf("expected bypass warning log, got: %s", string(data))
}
}
func TestBackendSelectBackend(t *testing.T) { func TestBackendSelectBackend(t *testing.T) {
tests := []struct { tests := []struct {
name string name string
@@ -1363,7 +1546,13 @@ func TestBackendBuildArgs_CodexBackend(t *testing.T) {
backend := CodexBackend{} backend := CodexBackend{}
cfg := &Config{Mode: "new", WorkDir: "/test/dir"} cfg := &Config{Mode: "new", WorkDir: "/test/dir"}
got := backend.BuildArgs(cfg, "task") got := backend.BuildArgs(cfg, "task")
want := []string{"e", "--skip-git-repo-check", "-C", "/test/dir", "--json", "task"} want := []string{
"e",
"--skip-git-repo-check",
"-C", "/test/dir",
"--json",
"task",
}
if len(got) != len(want) { if len(got) != len(want) {
t.Fatalf("length mismatch") t.Fatalf("length mismatch")
} }
@@ -1378,13 +1567,13 @@ func TestBackendBuildArgs_ClaudeBackend(t *testing.T) {
backend := ClaudeBackend{} backend := ClaudeBackend{}
cfg := &Config{Mode: "new", WorkDir: defaultWorkdir} cfg := &Config{Mode: "new", WorkDir: defaultWorkdir}
got := backend.BuildArgs(cfg, "todo") got := backend.BuildArgs(cfg, "todo")
want := []string{"-p", "--dangerously-skip-permissions", "--setting-sources", "", "--output-format", "stream-json", "--verbose", "todo"} want := []string{"-p", "--setting-sources", "", "--output-format", "stream-json", "--verbose", "todo"}
if len(got) != len(want) { if len(got) != len(want) {
t.Fatalf("length mismatch") t.Fatalf("args length=%d, want %d: %v", len(got), len(want), got)
} }
for i := range want { for i := range want {
if got[i] != want[i] { if got[i] != want[i] {
t.Fatalf("index %d got %s want %s", i, got[i], want[i]) t.Fatalf("index %d got %q want %q (args=%v)", i, got[i], want[i], got)
} }
} }
@@ -1399,19 +1588,15 @@ func TestClaudeBackendBuildArgs_OutputValidation(t *testing.T) {
target := "ensure-flags" target := "ensure-flags"
args := backend.BuildArgs(cfg, target) args := backend.BuildArgs(cfg, target)
expectedPrefix := []string{"-p", "--dangerously-skip-permissions", "--setting-sources", "", "--output-format", "stream-json", "--verbose"} want := []string{"-p", "--setting-sources", "", "--output-format", "stream-json", "--verbose", target}
if len(args) != len(want) {
if len(args) != len(expectedPrefix)+1 { t.Fatalf("args length=%d, want %d: %v", len(args), len(want), args)
t.Fatalf("args length=%d, want %d", len(args), len(expectedPrefix)+1)
} }
for i, val := range expectedPrefix { for i := range want {
if args[i] != val { if args[i] != want[i] {
t.Fatalf("args[%d]=%q, want %q", i, args[i], val) t.Fatalf("index %d got %q want %q (args=%v)", i, args[i], want[i], args)
} }
} }
if args[len(args)-1] != target {
t.Fatalf("last arg=%q, want target %q", args[len(args)-1], target)
}
} }
func TestBackendBuildArgs_GeminiBackend(t *testing.T) { func TestBackendBuildArgs_GeminiBackend(t *testing.T) {
@@ -1650,7 +1835,7 @@ func TestBackendParseJSONStream_GeminiEvents_OnMessageTriggeredOnStatus(t *testi
var called int var called int
message, threadID := parseJSONStreamInternal(strings.NewReader(input), nil, nil, func() { message, threadID := parseJSONStreamInternal(strings.NewReader(input), nil, nil, func() {
called++ called++
}) }, nil)
if message != "Hi there" { if message != "Hi there" {
t.Fatalf("message=%q, want %q", message, "Hi there") t.Fatalf("message=%q, want %q", message, "Hi there")
@@ -1679,7 +1864,7 @@ func TestBackendParseJSONStream_OnMessage(t *testing.T) {
var called int var called int
message, threadID := parseJSONStreamInternal(strings.NewReader(`{"type":"item.completed","item":{"type":"agent_message","text":"hook"}}`), nil, nil, func() { message, threadID := parseJSONStreamInternal(strings.NewReader(`{"type":"item.completed","item":{"type":"agent_message","text":"hook"}}`), nil, nil, func() {
called++ called++
}) }, nil)
if message != "hook" { if message != "hook" {
t.Fatalf("message = %q, want hook", message) t.Fatalf("message = %q, want hook", message)
} }
@@ -1691,10 +1876,86 @@ func TestBackendParseJSONStream_OnMessage(t *testing.T) {
} }
} }
func TestBackendParseJSONStream_OnComplete_CodexThreadCompleted(t *testing.T) {
input := `{"type":"item.completed","item":{"type":"agent_message","text":"first"}}` + "\n" +
`{"type":"item.completed","item":{"type":"agent_message","text":"second"}}` + "\n" +
`{"type":"thread.completed","thread_id":"t-1"}`
var onMessageCalls int
var onCompleteCalls int
message, threadID := parseJSONStreamInternal(strings.NewReader(input), nil, nil, func() {
onMessageCalls++
}, func() {
onCompleteCalls++
})
if message != "second" {
t.Fatalf("message = %q, want second", message)
}
if threadID != "t-1" {
t.Fatalf("threadID = %q, want t-1", threadID)
}
if onMessageCalls != 2 {
t.Fatalf("onMessage calls = %d, want 2", onMessageCalls)
}
if onCompleteCalls != 1 {
t.Fatalf("onComplete calls = %d, want 1", onCompleteCalls)
}
}
func TestBackendParseJSONStream_OnComplete_ClaudeResult(t *testing.T) {
input := `{"type":"message","subtype":"stream","session_id":"s-1"}` + "\n" +
`{"type":"result","result":"OK","session_id":"s-1"}`
var onMessageCalls int
var onCompleteCalls int
message, threadID := parseJSONStreamInternal(strings.NewReader(input), nil, nil, func() {
onMessageCalls++
}, func() {
onCompleteCalls++
})
if message != "OK" {
t.Fatalf("message = %q, want OK", message)
}
if threadID != "s-1" {
t.Fatalf("threadID = %q, want s-1", threadID)
}
if onMessageCalls != 1 {
t.Fatalf("onMessage calls = %d, want 1", onMessageCalls)
}
if onCompleteCalls != 1 {
t.Fatalf("onComplete calls = %d, want 1", onCompleteCalls)
}
}
func TestBackendParseJSONStream_OnComplete_GeminiTerminalResultStatus(t *testing.T) {
input := `{"type":"message","role":"assistant","content":"Hi","delta":true,"session_id":"g-1"}` + "\n" +
`{"type":"result","status":"success","session_id":"g-1"}`
var onMessageCalls int
var onCompleteCalls int
message, threadID := parseJSONStreamInternal(strings.NewReader(input), nil, nil, func() {
onMessageCalls++
}, func() {
onCompleteCalls++
})
if message != "Hi" {
t.Fatalf("message = %q, want Hi", message)
}
if threadID != "g-1" {
t.Fatalf("threadID = %q, want g-1", threadID)
}
if onMessageCalls != 1 {
t.Fatalf("onMessage calls = %d, want 1", onMessageCalls)
}
if onCompleteCalls != 1 {
t.Fatalf("onComplete calls = %d, want 1", onCompleteCalls)
}
}
func TestBackendParseJSONStream_ScannerError(t *testing.T) { func TestBackendParseJSONStream_ScannerError(t *testing.T) {
var warnings []string var warnings []string
warnFn := func(msg string) { warnings = append(warnings, msg) } warnFn := func(msg string) { warnings = append(warnings, msg) }
message, threadID := parseJSONStreamInternal(errReader{err: errors.New("scan-fail")}, warnFn, nil, nil) message, threadID := parseJSONStreamInternal(errReader{err: errors.New("scan-fail")}, warnFn, nil, nil, nil)
if message != "" || threadID != "" { if message != "" || threadID != "" {
t.Fatalf("expected empty output on scanner error, got message=%q threadID=%q", message, threadID) t.Fatalf("expected empty output on scanner error, got message=%q threadID=%q", message, threadID)
} }
@@ -2372,14 +2633,17 @@ func TestRunGenerateFinalOutput(t *testing.T) {
if out == "" { if out == "" {
t.Fatalf("generateFinalOutput() returned empty string") t.Fatalf("generateFinalOutput() returned empty string")
} }
if !strings.Contains(out, "Total: 3") || !strings.Contains(out, "Success: 2") || !strings.Contains(out, "Failed: 1") { // New format: "X tasks | Y passed | Z failed"
if !strings.Contains(out, "3 tasks") || !strings.Contains(out, "2 passed") || !strings.Contains(out, "1 failed") {
t.Fatalf("summary missing, got %q", out) t.Fatalf("summary missing, got %q", out)
} }
if !strings.Contains(out, "Task: a") || !strings.Contains(out, "Task: b") { // New format uses ### task-id for each task
t.Fatalf("task entries missing") if !strings.Contains(out, "### a") || !strings.Contains(out, "### b") {
t.Fatalf("task entries missing in structured format")
} }
if strings.Contains(out, "Log:") { // Should have Summary section
t.Fatalf("unexpected log line when LogPath empty, got %q", out) if !strings.Contains(out, "## Summary") {
t.Fatalf("Summary section missing, got %q", out)
} }
} }
@@ -2399,12 +2663,18 @@ func TestRunGenerateFinalOutput_LogPath(t *testing.T) {
LogPath: "/tmp/log-b", LogPath: "/tmp/log-b",
}, },
} }
// Test summary mode (default) - should contain log paths
out := generateFinalOutput(results) out := generateFinalOutput(results)
if !strings.Contains(out, "Session: sid\nLog: /tmp/log-a") { if !strings.Contains(out, "/tmp/log-b") {
t.Fatalf("output missing log line after session: %q", out) t.Fatalf("summary output missing log path for failed task: %q", out)
}
// Test full output mode - shows Session: and Log: lines
out = generateFinalOutputWithMode(results, false)
if !strings.Contains(out, "Session: sid") || !strings.Contains(out, "Log: /tmp/log-a") {
t.Fatalf("full output missing log line after session: %q", out)
} }
if !strings.Contains(out, "Log: /tmp/log-b") { if !strings.Contains(out, "Log: /tmp/log-b") {
t.Fatalf("output missing log line for failed task: %q", out) t.Fatalf("full output missing log line for failed task: %q", out)
} }
} }
@@ -2702,6 +2972,46 @@ test`
} }
} }
func TestRunParallelWithFullOutput(t *testing.T) {
defer resetTestHooks()
cleanupLogsFn = func() (CleanupStats, error) { return CleanupStats{}, nil }
oldArgs := os.Args
t.Cleanup(func() { os.Args = oldArgs })
os.Args = []string{"codeagent-wrapper", "--parallel", "--full-output"}
stdinReader = strings.NewReader(`---TASK---
id: T1
---CONTENT---
noop`)
t.Cleanup(func() { stdinReader = os.Stdin })
orig := runCodexTaskFn
runCodexTaskFn = func(task TaskSpec, timeout int) TaskResult {
return TaskResult{TaskID: task.ID, ExitCode: 0, Message: "full output marker"}
}
t.Cleanup(func() { runCodexTaskFn = orig })
out := captureOutput(t, func() {
if code := run(); code != 0 {
t.Fatalf("run exit = %d, want 0", code)
}
})
if !strings.Contains(out, "=== Parallel Execution Summary ===") {
t.Fatalf("output missing full-output header, got %q", out)
}
if !strings.Contains(out, "--- Task: T1 ---") {
t.Fatalf("output missing task block, got %q", out)
}
if !strings.Contains(out, "full output marker") {
t.Fatalf("output missing task message, got %q", out)
}
if strings.Contains(out, "=== Execution Report ===") {
t.Fatalf("output should not include summary-only header, got %q", out)
}
}
func TestParallelInvalidBackend(t *testing.T) { func TestParallelInvalidBackend(t *testing.T) {
defer resetTestHooks() defer resetTestHooks()
cleanupLogsFn = func() (CleanupStats, error) { return CleanupStats{}, nil } cleanupLogsFn = func() (CleanupStats, error) { return CleanupStats{}, nil }
@@ -2756,7 +3066,9 @@ func TestVersionFlag(t *testing.T) {
t.Errorf("exit = %d, want 0", code) t.Errorf("exit = %d, want 0", code)
} }
}) })
want := "codeagent-wrapper version 5.2.5\n"
want := "codeagent-wrapper version 5.4.0\n"
if output != want { if output != want {
t.Fatalf("output = %q, want %q", output, want) t.Fatalf("output = %q, want %q", output, want)
} }
@@ -2770,7 +3082,9 @@ func TestVersionShortFlag(t *testing.T) {
t.Errorf("exit = %d, want 0", code) t.Errorf("exit = %d, want 0", code)
} }
}) })
want := "codeagent-wrapper version 5.2.5\n"
want := "codeagent-wrapper version 5.4.0\n"
if output != want { if output != want {
t.Fatalf("output = %q, want %q", output, want) t.Fatalf("output = %q, want %q", output, want)
} }
@@ -2784,7 +3098,9 @@ func TestVersionLegacyAlias(t *testing.T) {
t.Errorf("exit = %d, want 0", code) t.Errorf("exit = %d, want 0", code)
} }
}) })
want := "codex-wrapper version 5.2.5\n"
want := "codex-wrapper version 5.4.0\n"
if output != want { if output != want {
t.Fatalf("output = %q, want %q", output, want) t.Fatalf("output = %q, want %q", output, want)
} }

View File

@@ -50,7 +50,7 @@ func parseJSONStreamWithWarn(r io.Reader, warnFn func(string)) (message, threadI
} }
func parseJSONStreamWithLog(r io.Reader, warnFn func(string), infoFn func(string)) (message, threadID string) { func parseJSONStreamWithLog(r io.Reader, warnFn func(string), infoFn func(string)) (message, threadID string) {
return parseJSONStreamInternal(r, warnFn, infoFn, nil) return parseJSONStreamInternal(r, warnFn, infoFn, nil, nil)
} }
const ( const (
@@ -95,7 +95,7 @@ type ItemContent struct {
Text interface{} `json:"text"` Text interface{} `json:"text"`
} }
func parseJSONStreamInternal(r io.Reader, warnFn func(string), infoFn func(string), onMessage func()) (message, threadID string) { func parseJSONStreamInternal(r io.Reader, warnFn func(string), infoFn func(string), onMessage func(), onComplete func()) (message, threadID string) {
reader := bufio.NewReaderSize(r, jsonLineReaderSize) reader := bufio.NewReaderSize(r, jsonLineReaderSize)
if warnFn == nil { if warnFn == nil {
@@ -111,6 +111,12 @@ func parseJSONStreamInternal(r io.Reader, warnFn func(string), infoFn func(strin
} }
} }
notifyComplete := func() {
if onComplete != nil {
onComplete()
}
}
totalEvents := 0 totalEvents := 0
var ( var (
@@ -158,6 +164,9 @@ func parseJSONStreamInternal(r io.Reader, warnFn func(string), infoFn func(strin
} }
} }
isClaude := event.Subtype != "" || event.Result != "" isClaude := event.Subtype != "" || event.Result != ""
if !isClaude && event.Type == "result" && event.SessionID != "" && event.Status == "" {
isClaude = true
}
isGemini := event.Role != "" || event.Delta != nil || event.Status != "" isGemini := event.Role != "" || event.Delta != nil || event.Status != ""
// Handle Codex events // Handle Codex events
@@ -178,6 +187,13 @@ func parseJSONStreamInternal(r io.Reader, warnFn func(string), infoFn func(strin
threadID = event.ThreadID threadID = event.ThreadID
infoFn(fmt.Sprintf("thread.started event thread_id=%s", threadID)) infoFn(fmt.Sprintf("thread.started event thread_id=%s", threadID))
case "thread.completed":
if event.ThreadID != "" && threadID == "" {
threadID = event.ThreadID
}
infoFn(fmt.Sprintf("thread.completed event thread_id=%s", event.ThreadID))
notifyComplete()
case "item.completed": case "item.completed":
var itemType string var itemType string
if len(event.Item) > 0 { if len(event.Item) > 0 {
@@ -221,6 +237,10 @@ func parseJSONStreamInternal(r io.Reader, warnFn func(string), infoFn func(strin
claudeMessage = event.Result claudeMessage = event.Result
notifyMessage() notifyMessage()
} }
if event.Type == "result" {
notifyComplete()
}
continue continue
} }
@@ -236,6 +256,10 @@ func parseJSONStreamInternal(r io.Reader, warnFn func(string), infoFn func(strin
if event.Status != "" { if event.Status != "" {
notifyMessage() notifyMessage()
if event.Type == "result" && (event.Status == "success" || event.Status == "error" || event.Status == "complete" || event.Status == "failed") {
notifyComplete()
}
} }
delta := false delta := false

View File

@@ -18,7 +18,7 @@ func TestParseJSONStream_SkipsOverlongLineAndContinues(t *testing.T) {
var warns []string var warns []string
warnFn := func(msg string) { warns = append(warns, msg) } warnFn := func(msg string) { warns = append(warns, msg) }
gotMessage, gotThreadID := parseJSONStreamInternal(strings.NewReader(input), warnFn, nil, nil) gotMessage, gotThreadID := parseJSONStreamInternal(strings.NewReader(input), warnFn, nil, nil, nil)
if gotMessage != "ok" { if gotMessage != "ok" {
t.Fatalf("message=%q, want %q (warns=%v)", gotMessage, "ok", warns) t.Fatalf("message=%q, want %q (warns=%v)", gotMessage, "ok", warns)
} }

View File

@@ -75,9 +75,9 @@ func getEnv(key, defaultValue string) string {
} }
type logWriter struct { type logWriter struct {
prefix string prefix string
maxLen int maxLen int
buf bytes.Buffer buf bytes.Buffer
dropped bool dropped bool
} }
@@ -205,6 +205,55 @@ func truncate(s string, maxLen int) string {
return s[:maxLen] + "..." return s[:maxLen] + "..."
} }
// safeTruncate safely truncates string to maxLen, avoiding panic and UTF-8 corruption.
func safeTruncate(s string, maxLen int) string {
if maxLen <= 0 || s == "" {
return ""
}
runes := []rune(s)
if len(runes) <= maxLen {
return s
}
if maxLen < 4 {
return string(runes[:1])
}
cutoff := maxLen - 3
if cutoff <= 0 {
return string(runes[:1])
}
if len(runes) <= cutoff {
return s
}
return string(runes[:cutoff]) + "..."
}
// sanitizeOutput removes ANSI escape sequences and control characters.
func sanitizeOutput(s string) string {
var result strings.Builder
inEscape := false
for i := 0; i < len(s); i++ {
if s[i] == '\x1b' && i+1 < len(s) && s[i+1] == '[' {
inEscape = true
i++ // skip '['
continue
}
if inEscape {
if (s[i] >= 'A' && s[i] <= 'Z') || (s[i] >= 'a' && s[i] <= 'z') {
inEscape = false
}
continue
}
// Keep printable chars and common whitespace.
if s[i] >= 32 || s[i] == '\n' || s[i] == '\t' {
result.WriteByte(s[i])
}
}
return result.String()
}
func min(a, b int) int { func min(a, b int) int {
if a < b { if a < b {
return a return a
@@ -223,3 +272,444 @@ func greet(name string) string {
func farewell(name string) string { func farewell(name string) string {
return "goodbye " + name return "goodbye " + name
} }
// extractMessageSummary extracts a brief summary from task output
// Returns first meaningful line or truncated content up to maxLen chars
func extractMessageSummary(message string, maxLen int) string {
if message == "" || maxLen <= 0 {
return ""
}
// Try to find a meaningful summary line
lines := strings.Split(message, "\n")
for _, line := range lines {
line = strings.TrimSpace(line)
// Skip empty lines and common noise
if line == "" || strings.HasPrefix(line, "```") || strings.HasPrefix(line, "---") {
continue
}
// Found a meaningful line
return safeTruncate(line, maxLen)
}
// Fallback: truncate entire message
clean := strings.TrimSpace(message)
return safeTruncate(clean, maxLen)
}
// extractCoverageFromLines extracts coverage from pre-split lines.
func extractCoverageFromLines(lines []string) string {
if len(lines) == 0 {
return ""
}
end := len(lines)
for end > 0 && strings.TrimSpace(lines[end-1]) == "" {
end--
}
if end == 1 {
trimmed := strings.TrimSpace(lines[0])
if strings.HasSuffix(trimmed, "%") {
if num, err := strconv.ParseFloat(strings.TrimSuffix(trimmed, "%"), 64); err == nil && num >= 0 && num <= 100 {
return trimmed
}
}
}
coverageKeywords := []string{"file", "stmt", "branch", "line", "coverage", "total"}
for _, line := range lines[:end] {
lower := strings.ToLower(line)
hasKeyword := false
tokens := strings.FieldsFunc(lower, func(r rune) bool { return r < 'a' || r > 'z' })
for _, token := range tokens {
for _, kw := range coverageKeywords {
if strings.HasPrefix(token, kw) {
hasKeyword = true
break
}
}
if hasKeyword {
break
}
}
if !hasKeyword {
continue
}
if !strings.Contains(line, "%") {
continue
}
// Extract percentage pattern: number followed by %
for i := 0; i < len(line); i++ {
if line[i] == '%' && i > 0 {
// Walk back to find the number
j := i - 1
for j >= 0 && (line[j] == '.' || (line[j] >= '0' && line[j] <= '9')) {
j--
}
if j < i-1 {
numStr := line[j+1 : i]
// Validate it's a reasonable percentage
if num, err := strconv.ParseFloat(numStr, 64); err == nil && num >= 0 && num <= 100 {
return numStr + "%"
}
}
}
}
}
return ""
}
// extractCoverage extracts coverage percentage from task output
// Supports common formats: "Coverage: 92%", "92% coverage", "coverage 92%", "TOTAL 92%"
func extractCoverage(message string) string {
if message == "" {
return ""
}
return extractCoverageFromLines(strings.Split(message, "\n"))
}
// extractCoverageNum extracts coverage as a numeric value for comparison
func extractCoverageNum(coverage string) float64 {
if coverage == "" {
return 0
}
// Remove % sign and parse
numStr := strings.TrimSuffix(coverage, "%")
if num, err := strconv.ParseFloat(numStr, 64); err == nil {
return num
}
return 0
}
// extractFilesChangedFromLines extracts files from pre-split lines.
func extractFilesChangedFromLines(lines []string) []string {
if len(lines) == 0 {
return nil
}
var files []string
seen := make(map[string]bool)
exts := []string{".ts", ".tsx", ".js", ".jsx", ".go", ".py", ".rs", ".java", ".vue", ".css", ".scss", ".md", ".json", ".yaml", ".yml", ".toml"}
for _, line := range lines {
line = strings.TrimSpace(line)
// Pattern 1: "Modified: path/to/file.ts" or "Created: path/to/file.ts"
matchedPrefix := false
for _, prefix := range []string{"Modified:", "Created:", "Updated:", "Edited:", "Wrote:", "Changed:"} {
if strings.HasPrefix(line, prefix) {
file := strings.TrimSpace(strings.TrimPrefix(line, prefix))
file = strings.Trim(file, "`,\"'()[],:")
file = strings.TrimPrefix(file, "@")
if file != "" && !seen[file] {
files = append(files, file)
seen[file] = true
}
matchedPrefix = true
break
}
}
if matchedPrefix {
continue
}
// Pattern 2: Tokens that look like file paths (allow root files, strip @ prefix).
parts := strings.Fields(line)
for _, part := range parts {
part = strings.Trim(part, "`,\"'()[],:")
part = strings.TrimPrefix(part, "@")
for _, ext := range exts {
if strings.HasSuffix(part, ext) && !seen[part] {
files = append(files, part)
seen[part] = true
break
}
}
}
}
// Limit to first 10 files to avoid bloat
if len(files) > 10 {
files = files[:10]
}
return files
}
// extractFilesChanged extracts list of changed files from task output
// Looks for common patterns like "Modified: file.ts", "Created: file.ts", file paths in output
func extractFilesChanged(message string) []string {
if message == "" {
return nil
}
return extractFilesChangedFromLines(strings.Split(message, "\n"))
}
// extractTestResultsFromLines extracts test results from pre-split lines.
func extractTestResultsFromLines(lines []string) (passed, failed int) {
if len(lines) == 0 {
return 0, 0
}
// Common patterns:
// pytest: "12 passed, 2 failed"
// jest: "Tests: 2 failed, 12 passed"
// go: "ok ... 12 tests"
for _, line := range lines {
line = strings.ToLower(line)
// Look for test result lines
if !strings.Contains(line, "pass") && !strings.Contains(line, "fail") && !strings.Contains(line, "test") {
continue
}
// Extract numbers near "passed" or "pass"
if idx := strings.Index(line, "pass"); idx != -1 {
// Look for number before "pass"
num := extractNumberBefore(line, idx)
if num > 0 {
passed = num
}
}
// Extract numbers near "failed" or "fail"
if idx := strings.Index(line, "fail"); idx != -1 {
num := extractNumberBefore(line, idx)
if num > 0 {
failed = num
}
}
// go test style: "ok ... 12 tests"
if passed == 0 {
if idx := strings.Index(line, "test"); idx != -1 {
num := extractNumberBefore(line, idx)
if num > 0 {
passed = num
}
}
}
// If we found both, stop
if passed > 0 && failed > 0 {
break
}
}
return passed, failed
}
// extractTestResults extracts test pass/fail counts from task output
func extractTestResults(message string) (passed, failed int) {
if message == "" {
return 0, 0
}
return extractTestResultsFromLines(strings.Split(message, "\n"))
}
// extractNumberBefore extracts a number that appears before the given index
func extractNumberBefore(s string, idx int) int {
if idx <= 0 {
return 0
}
// Walk backwards to find digits
end := idx - 1
for end >= 0 && (s[end] == ' ' || s[end] == ':' || s[end] == ',') {
end--
}
if end < 0 {
return 0
}
start := end
for start >= 0 && s[start] >= '0' && s[start] <= '9' {
start--
}
start++
if start > end {
return 0
}
numStr := s[start : end+1]
if num, err := strconv.Atoi(numStr); err == nil {
return num
}
return 0
}
// extractKeyOutputFromLines extracts key output from pre-split lines.
func extractKeyOutputFromLines(lines []string, maxLen int) string {
if len(lines) == 0 || maxLen <= 0 {
return ""
}
// Priority 1: Look for explicit summary lines
for _, line := range lines {
line = strings.TrimSpace(line)
lower := strings.ToLower(line)
if strings.HasPrefix(lower, "summary:") || strings.HasPrefix(lower, "completed:") ||
strings.HasPrefix(lower, "implemented:") || strings.HasPrefix(lower, "added:") ||
strings.HasPrefix(lower, "created:") || strings.HasPrefix(lower, "fixed:") {
content := line
for _, prefix := range []string{"Summary:", "Completed:", "Implemented:", "Added:", "Created:", "Fixed:",
"summary:", "completed:", "implemented:", "added:", "created:", "fixed:"} {
content = strings.TrimPrefix(content, prefix)
}
content = strings.TrimSpace(content)
if len(content) > 0 {
return safeTruncate(content, maxLen)
}
}
}
// Priority 2: First meaningful line (skip noise)
for _, line := range lines {
line = strings.TrimSpace(line)
if line == "" || strings.HasPrefix(line, "```") || strings.HasPrefix(line, "---") ||
strings.HasPrefix(line, "#") || strings.HasPrefix(line, "//") {
continue
}
// Skip very short lines (likely headers or markers)
if len(line) < 20 {
continue
}
return safeTruncate(line, maxLen)
}
// Fallback: truncate entire message
clean := strings.TrimSpace(strings.Join(lines, "\n"))
return safeTruncate(clean, maxLen)
}
// extractKeyOutput extracts a brief summary of what the task accomplished
// Looks for summary lines, first meaningful sentence, or truncates message
func extractKeyOutput(message string, maxLen int) string {
if message == "" || maxLen <= 0 {
return ""
}
return extractKeyOutputFromLines(strings.Split(message, "\n"), maxLen)
}
// extractCoverageGap extracts what's missing from coverage reports
// Looks for uncovered lines, branches, or functions
func extractCoverageGap(message string) string {
if message == "" {
return ""
}
lower := strings.ToLower(message)
lines := strings.Split(message, "\n")
// Look for uncovered/missing patterns
for _, line := range lines {
lineLower := strings.ToLower(line)
line = strings.TrimSpace(line)
// Common patterns for uncovered code
if strings.Contains(lineLower, "uncovered") ||
strings.Contains(lineLower, "not covered") ||
strings.Contains(lineLower, "missing coverage") ||
strings.Contains(lineLower, "lines not covered") {
if len(line) > 100 {
return line[:97] + "..."
}
return line
}
// Look for specific file:line patterns in coverage reports
if strings.Contains(lineLower, "branch") && strings.Contains(lineLower, "not taken") {
if len(line) > 100 {
return line[:97] + "..."
}
return line
}
}
// Look for function names that aren't covered
if strings.Contains(lower, "function") && strings.Contains(lower, "0%") {
for _, line := range lines {
if strings.Contains(strings.ToLower(line), "0%") && strings.Contains(line, "function") {
line = strings.TrimSpace(line)
if len(line) > 100 {
return line[:97] + "..."
}
return line
}
}
}
return ""
}
// extractErrorDetail extracts meaningful error context from task output
// Returns the most relevant error information up to maxLen characters
func extractErrorDetail(message string, maxLen int) string {
if message == "" || maxLen <= 0 {
return ""
}
lines := strings.Split(message, "\n")
var errorLines []string
// Look for error-related lines
for _, line := range lines {
line = strings.TrimSpace(line)
if line == "" {
continue
}
lower := strings.ToLower(line)
// Skip noise lines
if strings.HasPrefix(line, "at ") && strings.Contains(line, "(") {
// Stack trace line - only keep first one
if len(errorLines) > 0 && strings.HasPrefix(strings.ToLower(errorLines[len(errorLines)-1]), "at ") {
continue
}
}
// Prioritize error/fail lines
if strings.Contains(lower, "error") ||
strings.Contains(lower, "fail") ||
strings.Contains(lower, "exception") ||
strings.Contains(lower, "assert") ||
strings.Contains(lower, "expected") ||
strings.Contains(lower, "timeout") ||
strings.Contains(lower, "not found") ||
strings.Contains(lower, "cannot") ||
strings.Contains(lower, "undefined") ||
strings.HasPrefix(line, "FAIL") ||
strings.HasPrefix(line, "●") {
errorLines = append(errorLines, line)
}
}
if len(errorLines) == 0 {
// No specific error lines found, take last few lines
start := len(lines) - 5
if start < 0 {
start = 0
}
for _, line := range lines[start:] {
line = strings.TrimSpace(line)
if line != "" {
errorLines = append(errorLines, line)
}
}
}
// Join and truncate
result := strings.Join(errorLines, " | ")
return safeTruncate(result, maxLen)
}

View File

@@ -0,0 +1,143 @@
package main
import (
"fmt"
"reflect"
"strings"
"testing"
)
func TestExtractCoverage(t *testing.T) {
tests := []struct {
name string
in string
want string
}{
{"bare int", "92%", "92%"},
{"bare float", "92.5%", "92.5%"},
{"coverage prefix", "coverage: 92%", "92%"},
{"total prefix", "TOTAL 92%", "92%"},
{"all files", "All files 92%", "92%"},
{"empty", "", ""},
{"no number", "coverage: N/A", ""},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if got := extractCoverage(tt.in); got != tt.want {
t.Fatalf("extractCoverage(%q) = %q, want %q", tt.in, got, tt.want)
}
})
}
}
func TestExtractTestResults(t *testing.T) {
tests := []struct {
name string
in string
wantPassed int
wantFailed int
}{
{"pytest one line", "12 passed, 2 failed", 12, 2},
{"pytest split lines", "12 passed\n2 failed", 12, 2},
{"jest format", "Tests: 2 failed, 12 passed, 14 total", 12, 2},
{"go test style count", "ok\texample.com/foo\t0.12s\t12 tests", 12, 0},
{"zero counts", "0 passed, 0 failed", 0, 0},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
passed, failed := extractTestResults(tt.in)
if passed != tt.wantPassed || failed != tt.wantFailed {
t.Fatalf("extractTestResults(%q) = (%d, %d), want (%d, %d)", tt.in, passed, failed, tt.wantPassed, tt.wantFailed)
}
})
}
}
func TestExtractFilesChanged(t *testing.T) {
tests := []struct {
name string
in string
want []string
}{
{"root file", "Modified: main.go\n", []string{"main.go"}},
{"path file", "Created: codeagent-wrapper/utils.go\n", []string{"codeagent-wrapper/utils.go"}},
{"at prefix", "Updated: @codeagent-wrapper/main.go\n", []string{"codeagent-wrapper/main.go"}},
{"token scan", "Files: @main.go, @codeagent-wrapper/utils.go\n", []string{"main.go", "codeagent-wrapper/utils.go"}},
{"space path", "Modified: dir/with space/file.go\n", []string{"dir/with space/file.go"}},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if got := extractFilesChanged(tt.in); !reflect.DeepEqual(got, tt.want) {
t.Fatalf("extractFilesChanged(%q) = %#v, want %#v", tt.in, got, tt.want)
}
})
}
t.Run("limits to first 10", func(t *testing.T) {
var b strings.Builder
for i := 0; i < 12; i++ {
fmt.Fprintf(&b, "Modified: file%d.go\n", i)
}
got := extractFilesChanged(b.String())
if len(got) != 10 {
t.Fatalf("len(files)=%d, want 10: %#v", len(got), got)
}
for i := 0; i < 10; i++ {
want := fmt.Sprintf("file%d.go", i)
if got[i] != want {
t.Fatalf("files[%d]=%q, want %q", i, got[i], want)
}
}
})
}
func TestSafeTruncate(t *testing.T) {
tests := []struct {
name string
in string
maxLen int
want string
}{
{"empty", "", 4, ""},
{"zero maxLen", "hello", 0, ""},
{"one rune", "你好", 1, "你"},
{"two runes no truncate", "你好", 2, "你好"},
{"three runes no truncate", "你好", 3, "你好"},
{"two runes truncates long", "你好世界", 2, "你"},
{"three runes truncates long", "你好世界", 3, "你"},
{"four with ellipsis", "你好世界啊", 4, "你..."},
{"emoji", "🙂🙂🙂🙂🙂", 4, "🙂..."},
{"no truncate", "你好世界", 4, "你好世界"},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if got := safeTruncate(tt.in, tt.maxLen); got != tt.want {
t.Fatalf("safeTruncate(%q, %d) = %q, want %q", tt.in, tt.maxLen, got, tt.want)
}
})
}
}
func TestSanitizeOutput(t *testing.T) {
tests := []struct {
name string
in string
want string
}{
{"ansi", "\x1b[31mred\x1b[0m", "red"},
{"control chars", "a\x07b\r\nc\t", "ab\nc\t"},
{"normal", "hello\nworld\t!", "hello\nworld\t!"},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if got := sanitizeOutput(tt.in); got != tt.want {
t.Fatalf("sanitizeOutput(%q) = %q, want %q", tt.in, got, tt.want)
}
})
}
}

View File

@@ -9,42 +9,56 @@ A freshly designed lightweight development workflow with no legacy baggage, focu
``` ```
/dev trigger /dev trigger
AskUserQuestion (backend selection)
AskUserQuestion (requirements clarification) AskUserQuestion (requirements clarification)
codeagent analysis (plan mode + UI auto-detection) codeagent analysis (plan mode + task typing + UI auto-detection)
dev-plan-generator (create dev doc) dev-plan-generator (create dev doc)
codeagent concurrent development (25 tasks, backend split) codeagent concurrent development (25 tasks, backend routing)
codeagent testing & verification (≥90% coverage) codeagent testing & verification (≥90% coverage)
Done (generate summary) Done (generate summary)
``` ```
## The 6 Steps ## Step 0 + The 6 Steps
### 0. Select Allowed Backends (FIRST ACTION)
- Use **AskUserQuestion** with multiSelect to ask which backends are allowed for this run
- Options (user can select multiple):
- `codex` - Stable, high quality, best cost-performance (default for most tasks)
- `claude` - Fast, lightweight (for quick fixes and config changes)
- `gemini` - UI/UX specialist (for frontend styling and components)
- If user selects ONLY `codex`, ALL subsequent tasks must use `codex` (including UI/quick-fix)
### 1. Clarify Requirements ### 1. Clarify Requirements
- Use **AskUserQuestion** to ask the user directly - Use **AskUserQuestion** to ask the user directly
- No scoring system, no complex logic - No scoring system, no complex logic
- 23 rounds of Q&A until the requirement is clear - 23 rounds of Q&A until the requirement is clear
### 2. codeagent Analysis & UI Detection ### 2. codeagent Analysis + Task Typing + UI Detection
- Call codeagent to analyze the request in plan mode style - Call codeagent to analyze the request in plan mode style
- Extract: core functions, technical points, task list (25 items) - Extract: core functions, technical points, task list (25 items)
- For each task, assign exactly one type: `default` / `ui` / `quick-fix`
- UI auto-detection: needs UI work when task involves style assets (.css, .scss, styled-components, CSS modules, tailwindcss) OR frontend component files (.tsx, .jsx, .vue); output yes/no plus evidence - UI auto-detection: needs UI work when task involves style assets (.css, .scss, styled-components, CSS modules, tailwindcss) OR frontend component files (.tsx, .jsx, .vue); output yes/no plus evidence
### 3. Generate Dev Doc ### 3. Generate Dev Doc
- Call the **dev-plan-generator** agent - Call the **dev-plan-generator** agent
- Produce a single `dev-plan.md` - Produce a single `dev-plan.md`
- Append a dedicated UI task when Step 2 marks `needs_ui: true` - Append a dedicated UI task when Step 2 marks `needs_ui: true`
- Include: task breakdown, file scope, dependencies, test commands - Include: task breakdown, `type`, file scope, dependencies, test commands
### 4. Concurrent Development ### 4. Concurrent Development
- Work from the task list in dev-plan.md - Work from the task list in dev-plan.md
- Use codeagent per task with explicit backend selection: - Route backend per task type (with user constraints + fallback):
- Backend/API/DB tasks → `--backend codex` (default) - `default``codex`
- UI/style/component tasks → `--backend gemini` (enforced) - `ui``gemini` (enforced when allowed)
- `quick-fix``claude`
- Missing `type` → treat as `default`
- If the preferred backend is not allowed, fallback to an allowed backend by priority: `codex``claude``gemini`
- Independent tasks → run in parallel - Independent tasks → run in parallel
- Conflicting tasks → run serially - Conflicting tasks → run serially
@@ -65,7 +79,7 @@ Done (generate summary)
/dev "Implement user login with email + password" /dev "Implement user login with email + password"
``` ```
**No options**, fixed workflow, works out of the box. No CLI flags required; workflow starts with an interactive backend selection.
## Output Structure ## Output Structure
@@ -80,14 +94,14 @@ Only one file—minimal and clear.
### Tools ### Tools
- **AskUserQuestion**: interactive requirement clarification - **AskUserQuestion**: interactive requirement clarification
- **codeagent skill**: analysis, development, testing; supports `--backend` for codex (default) or gemini (UI) - **codeagent skill**: analysis, development, testing; supports `--backend` for `codex` / `claude` / `gemini`
- **dev-plan-generator agent**: generate dev doc (subagent via Task tool, saves context) - **dev-plan-generator agent**: generate dev doc (subagent via Task tool, saves context)
## UI Auto-Detection & Backend Routing ## Backend Selection & Routing
- **Step 0**: user selects allowed backends; if `仅 codex`, all tasks use codex
- **UI detection standard**: style files (.css, .scss, styled-components, CSS modules, tailwindcss) OR frontend component code (.tsx, .jsx, .vue) trigger `needs_ui: true` - **UI detection standard**: style files (.css, .scss, styled-components, CSS modules, tailwindcss) OR frontend component code (.tsx, .jsx, .vue) trigger `needs_ui: true`
- **Flow impact**: Step 2 auto-detects UI work; Step 3 appends a separate UI task in `dev-plan.md` when detected - **Task type field**: each task in `dev-plan.md` must have `type: default|ui|quick-fix`
- **Backend split**: backend/API tasks use codex backend (default); UI tasks force gemini backend - **Routing**: `default`→codex, `ui`→gemini, `quick-fix`→claude; if disallowed, fallback to an allowed backend by priority: codex→claude→gemini
- **Implementation**: Orchestrator invokes codeagent skill with appropriate backend parameter per task type
## Key Features ## Key Features
@@ -102,9 +116,9 @@ Only one file—minimal and clear.
- Steps are straightforward - Steps are straightforward
### ✅ Concurrency ### ✅ Concurrency
- 25 tasks in parallel - Tasks split based on natural functional boundaries
- Auto-detect dependencies and conflicts - Auto-detect dependencies and conflicts
- codeagent executes independently - codeagent executes independently with optimal backend
### ✅ Quality Assurance ### ✅ Quality Assurance
- Enforces 90% coverage - Enforces 90% coverage
@@ -117,6 +131,10 @@ Only one file—minimal and clear.
# Trigger # Trigger
/dev "Add user login feature" /dev "Add user login feature"
# Step 0: Select backends
Q: Which backends are allowed? (multiSelect)
A: Selected: codex, claude
# Step 1: Clarify requirements # Step 1: Clarify requirements
Q: What login methods are supported? Q: What login methods are supported?
A: Email + password A: Email + password
@@ -126,18 +144,18 @@ A: Yes, use JWT token
# Step 2: codeagent analysis # Step 2: codeagent analysis
Output: Output:
- Core: email/password login + JWT auth - Core: email/password login + JWT auth
- Task 1: Backend API - Task 1: Backend API (type=default)
- Task 2: Password hashing - Task 2: Password hashing (type=default)
- Task 3: Frontend form - Task 3: Frontend form (type=ui)
UI detection: needs_ui = true (tailwindcss classes in frontend form) UI detection: needs_ui = true (tailwindcss classes in frontend form)
# Step 3: Generate doc # Step 3: Generate doc
dev-plan.md generated with backend + UI tasks ✓ dev-plan.md generated with typed tasks ✓
# Step 4-5: Concurrent development (backend codex, UI gemini) # Step 4-5: Concurrent development (routing + fallback)
[task-1] Backend API (codex) → tests → 92% ✓ [task-1] Backend API (codex) → tests → 92% ✓
[task-2] Password hashing (codex) → tests → 95% ✓ [task-2] Password hashing (codex) → tests → 95% ✓
[task-3] Frontend form (gemini) → tests → 91% ✓ [task-3] Frontend form (fallback to codex; gemini not allowed) → tests → 91% ✓
``` ```
## Directory Structure ## Directory Structure

View File

@@ -12,7 +12,7 @@ You are a specialized Development Plan Document Generator. Your sole responsibil
You receive context from an orchestrator including: You receive context from an orchestrator including:
- Feature requirements description - Feature requirements description
- codeagent analysis results (feature highlights, task decomposition, UI detection flag) - codeagent analysis results (feature highlights, task decomposition, UI detection flag, and task typing hints)
- Feature name (in kebab-case format) - Feature name (in kebab-case format)
Your output is a single file: `./.claude/specs/{feature_name}/dev-plan.md` Your output is a single file: `./.claude/specs/{feature_name}/dev-plan.md`
@@ -29,6 +29,7 @@ Your output is a single file: `./.claude/specs/{feature_name}/dev-plan.md`
### Task 1: [Task Name] ### Task 1: [Task Name]
- **ID**: task-1 - **ID**: task-1
- **type**: default|ui|quick-fix
- **Description**: [What needs to be done] - **Description**: [What needs to be done]
- **File Scope**: [Directories or files involved, e.g., src/auth/**, tests/auth/] - **File Scope**: [Directories or files involved, e.g., src/auth/**, tests/auth/]
- **Dependencies**: [None or depends on task-x] - **Dependencies**: [None or depends on task-x]
@@ -38,7 +39,7 @@ Your output is a single file: `./.claude/specs/{feature_name}/dev-plan.md`
### Task 2: [Task Name] ### Task 2: [Task Name]
... ...
(2-5 tasks) (Tasks based on natural functional boundaries, typically 2-5)
## Acceptance Criteria ## Acceptance Criteria
- [ ] Feature point 1 - [ ] Feature point 1
@@ -53,9 +54,13 @@ Your output is a single file: `./.claude/specs/{feature_name}/dev-plan.md`
## Generation Rules You Must Enforce ## Generation Rules You Must Enforce
1. **Task Count**: Generate 2-5 tasks (no more, no less unless the feature is extremely simple or complex) 1. **Task Count**: Generate tasks based on natural functional boundaries (no artificial limits)
- Typical range: 2-5 tasks
- Quality over quantity: prefer fewer well-scoped tasks over excessive fragmentation
- Each task should be independently completable by one agent
2. **Task Requirements**: Each task MUST include: 2. **Task Requirements**: Each task MUST include:
- Clear ID (task-1, task-2, etc.) - Clear ID (task-1, task-2, etc.)
- A single task type field: `type: default|ui|quick-fix`
- Specific description of what needs to be done - Specific description of what needs to be done
- Explicit file scope (directories or files affected) - Explicit file scope (directories or files affected)
- Dependency declaration ("None" or "depends on task-x") - Dependency declaration ("None" or "depends on task-x")
@@ -67,18 +72,23 @@ Your output is a single file: `./.claude/specs/{feature_name}/dev-plan.md`
## Your Workflow ## Your Workflow
1. **Analyze Input**: Review the requirements description and codeagent analysis results (including `needs_ui` flag if present) 1. **Analyze Input**: Review the requirements description and codeagent analysis results (including `needs_ui` and any task typing hints)
2. **Identify Tasks**: Break down the feature into 2-5 logical, independent tasks 2. **Identify Tasks**: Break down the feature into 2-5 logical, independent tasks
3. **Determine Dependencies**: Map out which tasks depend on others (minimize dependencies) 3. **Determine Dependencies**: Map out which tasks depend on others (minimize dependencies)
4. **Specify Testing**: For each task, define the exact test command and coverage requirements 4. **Assign Task Type**: For each task, set exactly one `type`:
5. **Define Acceptance**: List concrete, measurable acceptance criteria including the 90% coverage requirement - `ui`: touches UI/style/component work (e.g., .css/.scss/.tsx/.jsx/.vue, tailwind, design tweaks)
6. **Document Technical Points**: Note key technical decisions and constraints - `quick-fix`: small, fast changes (config tweaks, small bug fix, minimal scope); do NOT use for UI work
7. **Write File**: Use the Write tool to create `./.claude/specs/{feature_name}/dev-plan.md` - `default`: everything else
- Note: `/dev` Step 4 routes backend by `type` (default→codex, ui→gemini, quick-fix→claude; missing type → default)
5. **Specify Testing**: For each task, define the exact test command and coverage requirements
6. **Define Acceptance**: List concrete, measurable acceptance criteria including the 90% coverage requirement
7. **Document Technical Points**: Note key technical decisions and constraints
8. **Write File**: Use the Write tool to create `./.claude/specs/{feature_name}/dev-plan.md`
## Quality Checks Before Writing ## Quality Checks Before Writing
- [ ] Task count is between 2-5 - [ ] Task count is between 2-5
- [ ] Every task has all 6 required fields (ID, Description, File Scope, Dependencies, Test Command, Test Focus) - [ ] Every task has all required fields (ID, type, Description, File Scope, Dependencies, Test Command, Test Focus)
- [ ] Test commands include coverage parameters - [ ] Test commands include coverage parameters
- [ ] Dependencies are explicitly stated - [ ] Dependencies are explicitly stated
- [ ] Acceptance criteria includes 90% coverage requirement - [ ] Acceptance criteria includes 90% coverage requirement

View File

@@ -1,28 +1,81 @@
--- ---
description: Extreme lightweight end-to-end development workflow with requirements clarification, parallel codeagent execution, and mandatory 90% test coverage description: Extreme lightweight end-to-end development workflow with requirements clarification, intelligent backend selection, parallel codeagent execution, and mandatory 90% test coverage
--- ---
You are the /dev Workflow Orchestrator, an expert development workflow manager specializing in orchestrating minimal, efficient end-to-end development processes with parallel task execution and rigorous test coverage validation. You are the /dev Workflow Orchestrator, an expert development workflow manager specializing in orchestrating minimal, efficient end-to-end development processes with parallel task execution and rigorous test coverage validation.
---
## CRITICAL CONSTRAINTS (NEVER VIOLATE)
These rules have HIGHEST PRIORITY and override all other instructions:
1. **NEVER use Edit, Write, or MultiEdit tools directly** - ALL code changes MUST go through codeagent-wrapper
2. **MUST use AskUserQuestion in Step 0** - Backend selection MUST be the FIRST action (before requirement clarification)
3. **MUST use AskUserQuestion in Step 1** - Do NOT skip requirement clarification
4. **MUST use TodoWrite after Step 1** - Create task tracking list before any analysis
5. **MUST use codeagent-wrapper for Step 2 analysis** - Do NOT use Read/Glob/Grep directly for deep analysis
6. **MUST wait for user confirmation in Step 3** - Do NOT proceed to Step 4 without explicit approval
7. **MUST invoke codeagent-wrapper --parallel for Step 4 execution** - Use Bash tool, NOT Edit/Write or Task tool
**Violation of any constraint above invalidates the entire workflow. Stop and restart if violated.**
---
**Core Responsibilities** **Core Responsibilities**
- Orchestrate a streamlined 6-step development workflow: - Orchestrate a streamlined 7-step development workflow (Step 0 + Step 16):
0. Backend selection (user constrained)
1. Requirement clarification through targeted questioning 1. Requirement clarification through targeted questioning
2. Technical analysis using codeagent 2. Technical analysis using codeagent-wrapper
3. Development documentation generation 3. Development documentation generation
4. Parallel development execution 4. Parallel development execution (backend routing per task type)
5. Coverage validation (≥90% requirement) 5. Coverage validation (≥90% requirement)
6. Completion summary 6. Completion summary
**Workflow Execution** **Workflow Execution**
- **Step 1: Requirement Clarification** - **Step 0: Backend Selection [MANDATORY - FIRST ACTION]**
- Use AskUserQuestion to clarify requirements directly - MUST use AskUserQuestion tool as the FIRST action with multiSelect enabled
- Ask which backends are allowed for this /dev run
- Options (user can select multiple):
- `codex` - Stable, high quality, best cost-performance (default for most tasks)
- `claude` - Fast, lightweight (for quick fixes and config changes)
- `gemini` - UI/UX specialist (for frontend styling and components)
- Store the selected backends as `allowed_backends` set for routing in Step 4
- Special rule: if user selects ONLY `codex`, then ALL subsequent tasks (including UI/quick-fix) MUST use `codex` (no exceptions)
- **Step 1: Requirement Clarification [MANDATORY - DO NOT SKIP]**
- MUST use AskUserQuestion tool
- Focus questions on functional boundaries, inputs/outputs, constraints, testing, and required unit-test coverage levels - Focus questions on functional boundaries, inputs/outputs, constraints, testing, and required unit-test coverage levels
- Iterate 2-3 rounds until clear; rely on judgment; keep questions concise - Iterate 2-3 rounds until clear; rely on judgment; keep questions concise
- After clarification complete: MUST use TodoWrite to create task tracking list with workflow steps
- **Step 2: codeagent Deep Analysis (Plan Mode Style)** - **Step 2: codeagent-wrapper Deep Analysis (Plan Mode Style) [USE CODEAGENT-WRAPPER ONLY]**
Use codeagent Skill to perform deep analysis. codeagent should operate in "plan mode" style and must include UI detection: MUST use Bash tool to invoke `codeagent-wrapper` for deep analysis. Do NOT use Read/Glob/Grep tools directly - delegate all exploration to codeagent-wrapper.
**How to invoke for analysis**:
```bash
# analysis_backend selection:
# - prefer codex if it is in allowed_backends
# - otherwise pick the first backend in allowed_backends
codeagent-wrapper --backend {analysis_backend} - <<'EOF'
Analyze the codebase for implementing [feature name].
Requirements:
- [requirement 1]
- [requirement 2]
Deliverables:
1. Explore codebase structure and existing patterns
2. Evaluate implementation options with trade-offs
3. Make architectural decisions
4. Break down into 2-5 parallelizable tasks with dependencies and file scope
5. Classify each task with a single `type`: `default` / `ui` / `quick-fix`
6. Determine if UI work is needed (check for .css/.tsx/.vue files)
Output the analysis following the structure below.
EOF
```
**When Deep Analysis is Needed** (any condition triggers): **When Deep Analysis is Needed** (any condition triggers):
- Multiple valid approaches exist (e.g., Redis vs in-memory vs file-based caching) - Multiple valid approaches exist (e.g., Redis vs in-memory vs file-based caching)
@@ -34,12 +87,12 @@ You are the /dev Workflow Orchestrator, an expert development workflow manager s
- During analysis, output whether the task needs UI work (yes/no) and the evidence - During analysis, output whether the task needs UI work (yes/no) and the evidence
- UI criteria: presence of style assets (.css, .scss, styled-components, CSS modules, tailwindcss) OR frontend component files (.tsx, .jsx, .vue) - UI criteria: presence of style assets (.css, .scss, styled-components, CSS modules, tailwindcss) OR frontend component files (.tsx, .jsx, .vue)
**What codeagent Does in Analysis Mode**: **What the AI backend does in Analysis Mode** (when invoked via codeagent-wrapper):
1. **Explore Codebase**: Use Glob, Grep, Read to understand structure, patterns, architecture 1. **Explore Codebase**: Use Glob, Grep, Read to understand structure, patterns, architecture
2. **Identify Existing Patterns**: Find how similar features are implemented, reuse conventions 2. **Identify Existing Patterns**: Find how similar features are implemented, reuse conventions
3. **Evaluate Options**: When multiple approaches exist, list trade-offs (complexity, performance, security, maintainability) 3. **Evaluate Options**: When multiple approaches exist, list trade-offs (complexity, performance, security, maintainability)
4. **Make Architectural Decisions**: Choose patterns, APIs, data models with justification 4. **Make Architectural Decisions**: Choose patterns, APIs, data models with justification
5. **Design Task Breakdown**: Produce 2-5 parallelizable tasks with file scope and dependencies 5. **Design Task Breakdown**: Produce parallelizable tasks based on natural functional boundaries with file scope and dependencies
**Analysis Output Structure**: **Analysis Output Structure**:
``` ```
@@ -56,7 +109,7 @@ You are the /dev Workflow Orchestrator, an expert development workflow manager s
[API design, data models, architecture choices made] [API design, data models, architecture choices made]
## Task Breakdown ## Task Breakdown
[2-5 tasks with: ID, description, file scope, dependencies, test command] [2-5 tasks with: ID, description, file scope, dependencies, test command, type(default|ui|quick-fix)]
## UI Determination ## UI Determination
needs_ui: [true/false] needs_ui: [true/false]
@@ -70,39 +123,62 @@ You are the /dev Workflow Orchestrator, an expert development workflow manager s
- **Step 3: Generate Development Documentation** - **Step 3: Generate Development Documentation**
- invoke agent dev-plan-generator - invoke agent dev-plan-generator
- When creating `dev-plan.md`, append a dedicated UI task if Step 2 marked `needs_ui: true` - When creating `dev-plan.md`, ensure every task has `type: default|ui|quick-fix`
- Append a dedicated UI task if Step 2 marked `needs_ui: true` but no UI task exists
- Output a brief summary of dev-plan.md: - Output a brief summary of dev-plan.md:
- Number of tasks and their IDs - Number of tasks and their IDs
- Task type for each task
- File scope for each task - File scope for each task
- Dependencies between tasks - Dependencies between tasks
- Test commands - Test commands
- Use AskUserQuestion to confirm with user: - Use AskUserQuestion to confirm with user:
- Question: "Proceed with this development plan?" (if UI work is detected, state that UI tasks will use the gemini backend) - Question: "Proceed with this development plan?" (state backend routing rules and any forced fallback due to allowed_backends)
- Options: "Confirm and execute" / "Need adjustments" - Options: "Confirm and execute" / "Need adjustments"
- If user chooses "Need adjustments", return to Step 1 or Step 2 based on feedback - If user chooses "Need adjustments", return to Step 1 or Step 2 based on feedback
- **Step 4: Parallel Development Execution** - **Step 4: Parallel Development Execution [CODEAGENT-WRAPPER ONLY - NO DIRECT EDITS]**
- For each task in `dev-plan.md`, invoke codeagent skill with task brief in HEREDOC format: - MUST use Bash tool to invoke `codeagent-wrapper --parallel` for ALL code changes
- NEVER use Edit, Write, MultiEdit, or Task tools to modify code directly
- Backend routing (must be deterministic and enforceable):
- Task field: `type: default|ui|quick-fix` (missing → treat as `default`)
- Preferred backend by type:
- `default` → `codex`
- `ui` → `gemini` (enforced when allowed)
- `quick-fix` → `claude`
- If user selected `仅 codex`: all tasks MUST use `codex`
- Otherwise, if preferred backend is not in `allowed_backends`, fallback to the first available backend by priority: `codex` → `claude` → `gemini`
- Build ONE `--parallel` config that includes all tasks in `dev-plan.md` and submit it once via Bash tool:
```bash ```bash
# Backend task (use codex backend - default) # One shot submission - wrapper handles topology + concurrency
codeagent-wrapper --backend codex - <<'EOF' codeagent-wrapper --parallel <<'EOF'
Task: [task-id] ---TASK---
id: [task-id-1]
backend: [routed-backend-from-type-and-allowed_backends]
workdir: .
dependencies: [optional, comma-separated ids]
---CONTENT---
Task: [task-id-1]
Reference: @.claude/specs/{feature_name}/dev-plan.md Reference: @.claude/specs/{feature_name}/dev-plan.md
Scope: [task file scope] Scope: [task file scope]
Test: [test command] Test: [test command]
Deliverables: code + unit tests + coverage ≥90% + coverage summary Deliverables: code + unit tests + coverage ≥90% + coverage summary
EOF
# UI task (use gemini backend - enforced) ---TASK---
codeagent-wrapper --backend gemini - <<'EOF' id: [task-id-2]
Task: [task-id] backend: [routed-backend-from-type-and-allowed_backends]
workdir: .
dependencies: [optional, comma-separated ids]
---CONTENT---
Task: [task-id-2]
Reference: @.claude/specs/{feature_name}/dev-plan.md Reference: @.claude/specs/{feature_name}/dev-plan.md
Scope: [task file scope] Scope: [task file scope]
Test: [test command] Test: [test command]
Deliverables: code + unit tests + coverage ≥90% + coverage summary Deliverables: code + unit tests + coverage ≥90% + coverage summary
EOF EOF
``` ```
- **Note**: Use `workdir: .` (current directory) for all tasks unless specific subdirectory is required
- Execute independent tasks concurrently; serialize conflicting ones; track coverage reports - Execute independent tasks concurrently; serialize conflicting ones; track coverage reports
- Backend is routed deterministically based on task `type`, no manual intervention needed
- **Step 5: Coverage Validation** - **Step 5: Coverage Validation**
- Validate each tasks coverage: - Validate each tasks coverage:
@@ -113,13 +189,19 @@ You are the /dev Workflow Orchestrator, an expert development workflow manager s
- Provide completed task list, coverage per task, key file changes - Provide completed task list, coverage per task, key file changes
**Error Handling** **Error Handling**
- codeagent failure: retry once, then log and continue - **codeagent-wrapper failure**: Retry once with same input; if still fails, log error and ask user for guidance
- Insufficient coverage: request more tests (max 2 rounds) - **Insufficient coverage (<90%)**: Request more tests from the failed task (max 2 rounds); if still fails, report to user
- Dependency conflicts: serialize automatically - **Dependency conflicts**:
- Circular dependencies: codeagent-wrapper will detect and fail with error; revise task breakdown to remove cycles
- Missing dependencies: Ensure all task IDs referenced in `dependencies` field exist
- **Parallel execution timeout**: Individual tasks timeout after 2 hours (configurable via CODEX_TIMEOUT); failed tasks can be retried individually
- **Backend unavailable**: If a routed backend is unavailable, fallback to another backend in `allowed_backends` (priority: codex → claude → gemini); if none works, fail with a clear error message
**Quality Standards** **Quality Standards**
- Code coverage ≥90% - Code coverage ≥90%
- 2-5 genuinely parallelizable tasks - Tasks based on natural functional boundaries (typically 2-5)
- Each task has exactly one `type: default|ui|quick-fix`
- Backend routed by `type`: `default`→codex, `ui`→gemini, `quick-fix`→claude (with allowed_backends fallback)
- Documentation must be minimal yet actionable - Documentation must be minimal yet actionable
- No verbose implementations; only essential code - No verbose implementations; only essential code

View File

@@ -105,6 +105,7 @@ EOF
Execute multiple tasks concurrently with dependency management: Execute multiple tasks concurrently with dependency management:
```bash ```bash
# Default: summary output (context-efficient, recommended)
codeagent-wrapper --parallel <<'EOF' codeagent-wrapper --parallel <<'EOF'
---TASK--- ---TASK---
id: backend_1701234567 id: backend_1701234567
@@ -125,6 +126,47 @@ dependencies: backend_1701234567, frontend_1701234568
---CONTENT--- ---CONTENT---
add integration tests for user management flow add integration tests for user management flow
EOF EOF
# Full output mode (for debugging, includes complete task messages)
codeagent-wrapper --parallel --full-output <<'EOF'
...
EOF
```
**Output Modes:**
- **Summary (default)**: Structured report with extracted `Did/Files/Tests/Coverage`, plus a short action summary.
- **Full (`--full-output`)**: Complete task messages included. Use only for debugging.
**Summary Output Example:**
```
=== Execution Report ===
3 tasks | 2 passed | 1 failed | 1 below 90%
## Task Results
### backend_api ✓ 92%
Did: Implemented /api/users CRUD endpoints
Files: backend/users.go, backend/router.go
Tests: 12 passed
Log: /tmp/codeagent-xxx.log
### frontend_form ⚠️ 88% (below 90%)
Did: Created login form with validation
Files: frontend/LoginForm.tsx
Tests: 8 passed
Gap: lines not covered: frontend/LoginForm.tsx:42-47
Log: /tmp/codeagent-yyy.log
### integration_tests ✗ FAILED
Exit code: 1
Error: Assertion failed at line 45
Detail: Expected status 200 but got 401
Log: /tmp/codeagent-zzz.log
## Summary
- 2/3 completed successfully
- Fix: integration_tests (Assertion failed at line 45)
- Coverage: frontend_form
``` ```
**Parallel Task Format:** **Parallel Task Format:**

5
go.work Normal file
View File

@@ -0,0 +1,5 @@
go 1.21
use (
./codeagent-wrapper
)

View File

@@ -46,17 +46,23 @@ echo.
echo codeagent-wrapper installed successfully at: echo codeagent-wrapper installed successfully at:
echo %DEST% echo %DEST%
rem Automatically ensure %USERPROFILE%\bin is in the USER (HKCU) PATH rem Ensure %USERPROFILE%\bin is in PATH without duplicating entries
rem 1) Read current user PATH from registry (REG_SZ or REG_EXPAND_SZ) rem 1) Read current user PATH from registry (REG_SZ or REG_EXPAND_SZ)
set "USER_PATH_RAW=" set "USER_PATH_RAW="
set "USER_PATH_TYPE="
for /f "tokens=1,2,*" %%A in ('reg query "HKCU\Environment" /v Path 2^>nul ^| findstr /I /R "^ *Path *REG_"') do ( for /f "tokens=1,2,*" %%A in ('reg query "HKCU\Environment" /v Path 2^>nul ^| findstr /I /R "^ *Path *REG_"') do (
set "USER_PATH_TYPE=%%B"
set "USER_PATH_RAW=%%C" set "USER_PATH_RAW=%%C"
) )
rem Trim leading spaces from USER_PATH_RAW rem Trim leading spaces from USER_PATH_RAW
for /f "tokens=* delims= " %%D in ("!USER_PATH_RAW!") do set "USER_PATH_RAW=%%D" for /f "tokens=* delims= " %%D in ("!USER_PATH_RAW!") do set "USER_PATH_RAW=%%D"
rem 2) Read current system PATH from registry (REG_SZ or REG_EXPAND_SZ)
set "SYS_PATH_RAW="
for /f "tokens=1,2,*" %%A in ('reg query "HKLM\System\CurrentControlSet\Control\Session Manager\Environment" /v Path 2^>nul ^| findstr /I /R "^ *Path *REG_"') do (
set "SYS_PATH_RAW=%%C"
)
rem Trim leading spaces from SYS_PATH_RAW
for /f "tokens=* delims= " %%D in ("!SYS_PATH_RAW!") do set "SYS_PATH_RAW=%%D"
rem Normalize DEST_DIR by removing a trailing backslash if present rem Normalize DEST_DIR by removing a trailing backslash if present
if "!DEST_DIR:~-1!"=="\" set "DEST_DIR=!DEST_DIR:~0,-1!" if "!DEST_DIR:~-1!"=="\" set "DEST_DIR=!DEST_DIR:~0,-1!"
@@ -67,42 +73,63 @@ set "SEARCH_EXP2=;!DEST_DIR!\;"
set "SEARCH_LIT=;!PCT!USERPROFILE!PCT!\bin;" set "SEARCH_LIT=;!PCT!USERPROFILE!PCT!\bin;"
set "SEARCH_LIT2=;!PCT!USERPROFILE!PCT!\bin\;" set "SEARCH_LIT2=;!PCT!USERPROFILE!PCT!\bin\;"
rem Prepare user PATH variants for containment tests rem Prepare PATH variants for containment tests (strip quotes to avoid false negatives)
set "CHECK_RAW=;!USER_PATH_RAW!;" set "USER_PATH_RAW_CLEAN=!USER_PATH_RAW:"=!"
set "USER_PATH_EXP=!USER_PATH_RAW!" set "SYS_PATH_RAW_CLEAN=!SYS_PATH_RAW:"=!"
if defined USER_PATH_EXP call set "USER_PATH_EXP=%%USER_PATH_EXP%%"
set "CHECK_EXP=;!USER_PATH_EXP!;"
rem Check if already present in user PATH (literal or expanded, with/without trailing backslash) set "CHECK_USER_RAW=;!USER_PATH_RAW_CLEAN!;"
set "USER_PATH_EXP=!USER_PATH_RAW_CLEAN!"
if defined USER_PATH_EXP call set "USER_PATH_EXP=%%USER_PATH_EXP%%"
set "USER_PATH_EXP_CLEAN=!USER_PATH_EXP:"=!"
set "CHECK_USER_EXP=;!USER_PATH_EXP_CLEAN!;"
set "CHECK_SYS_RAW=;!SYS_PATH_RAW_CLEAN!;"
set "SYS_PATH_EXP=!SYS_PATH_RAW_CLEAN!"
if defined SYS_PATH_EXP call set "SYS_PATH_EXP=%%SYS_PATH_EXP%%"
set "SYS_PATH_EXP_CLEAN=!SYS_PATH_EXP:"=!"
set "CHECK_SYS_EXP=;!SYS_PATH_EXP_CLEAN!;"
rem Check if already present (literal or expanded, with/without trailing backslash)
set "ALREADY_IN_USERPATH=0" set "ALREADY_IN_USERPATH=0"
echo !CHECK_RAW! | findstr /I /C:"!SEARCH_LIT!" /C:"!SEARCH_LIT2!" >nul && set "ALREADY_IN_USERPATH=1" echo(!CHECK_USER_RAW! | findstr /I /C:"!SEARCH_LIT!" /C:"!SEARCH_LIT2!" >nul && set "ALREADY_IN_USERPATH=1"
if "!ALREADY_IN_USERPATH!"=="0" ( if "!ALREADY_IN_USERPATH!"=="0" (
echo !CHECK_EXP! | findstr /I /C:"!SEARCH_EXP!" /C:"!SEARCH_EXP2!" >nul && set "ALREADY_IN_USERPATH=1" echo(!CHECK_USER_EXP! | findstr /I /C:"!SEARCH_EXP!" /C:"!SEARCH_EXP2!" >nul && set "ALREADY_IN_USERPATH=1"
)
set "ALREADY_IN_SYSPATH=0"
echo(!CHECK_SYS_RAW! | findstr /I /C:"!SEARCH_LIT!" /C:"!SEARCH_LIT2!" >nul && set "ALREADY_IN_SYSPATH=1"
if "!ALREADY_IN_SYSPATH!"=="0" (
echo(!CHECK_SYS_EXP! | findstr /I /C:"!SEARCH_EXP!" /C:"!SEARCH_EXP2!" >nul && set "ALREADY_IN_SYSPATH=1"
) )
if "!ALREADY_IN_USERPATH!"=="1" ( if "!ALREADY_IN_USERPATH!"=="1" (
echo User PATH already includes %%USERPROFILE%%\bin. echo User PATH already includes %%USERPROFILE%%\bin.
) else ( ) else (
rem Not present: append to user PATH using setx without duplicating system PATH if "!ALREADY_IN_SYSPATH!"=="1" (
if defined USER_PATH_RAW ( echo System PATH already includes %%USERPROFILE%%\bin; skipping user PATH update.
set "USER_PATH_NEW=!USER_PATH_RAW!"
if not "!USER_PATH_NEW:~-1!"==";" set "USER_PATH_NEW=!USER_PATH_NEW!;"
set "USER_PATH_NEW=!USER_PATH_NEW!!PCT!USERPROFILE!PCT!\bin"
) else ( ) else (
set "USER_PATH_NEW=!PCT!USERPROFILE!PCT!\bin" rem Not present: append to user PATH
) if defined USER_PATH_RAW (
rem Persist update to HKCU\Environment\Path (user scope) set "USER_PATH_NEW=!USER_PATH_RAW!"
setx PATH "!USER_PATH_NEW!" >nul if not "!USER_PATH_NEW:~-1!"==";" set "USER_PATH_NEW=!USER_PATH_NEW!;"
if errorlevel 1 ( set "USER_PATH_NEW=!USER_PATH_NEW!!PCT!USERPROFILE!PCT!\bin"
echo WARNING: Failed to append %%USERPROFILE%%\bin to your user PATH. ) else (
) else ( set "USER_PATH_NEW=!PCT!USERPROFILE!PCT!\bin"
echo Added %%USERPROFILE%%\bin to your user PATH. )
rem Persist update to HKCU\Environment\Path (user scope)
setx Path "!USER_PATH_NEW!" >nul
if errorlevel 1 (
echo WARNING: Failed to append %%USERPROFILE%%\bin to your user PATH.
) else (
echo Added %%USERPROFILE%%\bin to your user PATH.
)
) )
) )
rem Update current session PATH so codex-wrapper is immediately available rem Update current session PATH so codeagent-wrapper is immediately available
set "CURPATH=;%PATH%;" set "CURPATH=;%PATH%;"
echo !CURPATH! | findstr /I /C:"!SEARCH_EXP!" /C:"!SEARCH_EXP2!" /C:"!SEARCH_LIT!" /C:"!SEARCH_LIT2!" >nul set "CURPATH_CLEAN=!CURPATH:"=!"
echo(!CURPATH_CLEAN! | findstr /I /C:"!SEARCH_EXP!" /C:"!SEARCH_EXP2!" /C:"!SEARCH_LIT!" /C:"!SEARCH_LIT2!" >nul
if errorlevel 1 set "PATH=!DEST_DIR!;!PATH!" if errorlevel 1 set "PATH=!DEST_DIR!;!PATH!"
goto :cleanup goto :cleanup

View File

@@ -17,7 +17,10 @@ from datetime import datetime
from pathlib import Path from pathlib import Path
from typing import Any, Dict, Iterable, List, Optional from typing import Any, Dict, Iterable, List, Optional
import jsonschema try:
import jsonschema
except ImportError: # pragma: no cover
jsonschema = None
DEFAULT_INSTALL_DIR = "~/.claude" DEFAULT_INSTALL_DIR = "~/.claude"
@@ -87,6 +90,32 @@ def load_config(path: str) -> Dict[str, Any]:
config_path = Path(path).expanduser().resolve() config_path = Path(path).expanduser().resolve()
config = _load_json(config_path) config = _load_json(config_path)
if jsonschema is None:
print(
"WARNING: python package 'jsonschema' is not installed; "
"skipping config validation. To enable validation run:\n"
" python3 -m pip install jsonschema\n",
file=sys.stderr,
)
if not isinstance(config, dict):
raise ValueError(
f"Config must be a dict, got {type(config).__name__}. "
"Check your config.json syntax."
)
required_keys = ["version", "install_dir", "log_file", "modules"]
missing = [key for key in required_keys if key not in config]
if missing:
missing_str = ", ".join(missing)
raise ValueError(
f"Config missing required keys: {missing_str}. "
"Install jsonschema for better validation: "
"python3 -m pip install jsonschema"
)
return config
schema_candidates = [ schema_candidates = [
config_path.parent / "config.schema.json", config_path.parent / "config.schema.json",
Path(__file__).resolve().with_name("config.schema.json"), Path(__file__).resolve().with_name("config.schema.json"),

View File

@@ -34,23 +34,42 @@ if ! curl -fsSL "$URL" -o /tmp/codeagent-wrapper; then
exit 1 exit 1
fi fi
mkdir -p "$HOME/bin" INSTALL_DIR="${INSTALL_DIR:-$HOME/.claude}"
BIN_DIR="${INSTALL_DIR}/bin"
mkdir -p "$BIN_DIR"
mv /tmp/codeagent-wrapper "$HOME/bin/codeagent-wrapper" mv /tmp/codeagent-wrapper "${BIN_DIR}/codeagent-wrapper"
chmod +x "$HOME/bin/codeagent-wrapper" chmod +x "${BIN_DIR}/codeagent-wrapper"
if "$HOME/bin/codeagent-wrapper" --version >/dev/null 2>&1; then if "${BIN_DIR}/codeagent-wrapper" --version >/dev/null 2>&1; then
echo "codeagent-wrapper installed successfully to ~/bin/codeagent-wrapper" echo "codeagent-wrapper installed successfully to ${BIN_DIR}/codeagent-wrapper"
else else
echo "ERROR: installation verification failed" >&2 echo "ERROR: installation verification failed" >&2
exit 1 exit 1
fi fi
if [[ ":$PATH:" != *":$HOME/bin:"* ]]; then # Auto-add to shell config files with idempotency
if [[ ":${PATH}:" != *":${BIN_DIR}:"* ]]; then
echo "" echo ""
echo "WARNING: ~/bin is not in your PATH" echo "WARNING: ${BIN_DIR} is not in your PATH"
echo "Add this line to your ~/.bashrc or ~/.zshrc:"
echo "" # Detect shell config file
echo " export PATH=\"\$HOME/bin:\$PATH\"" if [ -n "$ZSH_VERSION" ]; then
RC_FILE="$HOME/.zshrc"
else
RC_FILE="$HOME/.bashrc"
fi
# Idempotent add: check if complete export statement already exists
EXPORT_LINE="export PATH=\"${BIN_DIR}:\$PATH\""
if [ -f "$RC_FILE" ] && grep -qF "${EXPORT_LINE}" "$RC_FILE" 2>/dev/null; then
echo " ${BIN_DIR} already in ${RC_FILE}, skipping."
else
echo " Adding to ${RC_FILE}..."
echo "" >> "$RC_FILE"
echo "# Added by myclaude installer" >> "$RC_FILE"
echo "export PATH=\"${BIN_DIR}:\$PATH\"" >> "$RC_FILE"
echo " Done. Run 'source ${RC_FILE}' or restart shell."
fi
echo "" echo ""
fi fi

View File

@@ -74,7 +74,7 @@ codeagent-wrapper --backend gemini "simple task"
- `task` (required): Task description, supports `@file` references - `task` (required): Task description, supports `@file` references
- `working_dir` (optional): Working directory (default: current) - `working_dir` (optional): Working directory (default: current)
- `--backend` (optional): Select AI backend (codex/claude/gemini, default: codex) - `--backend` (optional): Select AI backend (codex/claude/gemini, default: codex)
- **Note**: Claude backend defaults to `--dangerously-skip-permissions` for automation compatibility - **Note**: Claude backend only adds `--dangerously-skip-permissions` when explicitly enabled
## Return Format ## Return Format
@@ -101,11 +101,12 @@ EOF
## Parallel Execution ## Parallel Execution
**With global backend**: **Default (summary mode - context-efficient):**
```bash ```bash
codeagent-wrapper --parallel --backend claude <<'EOF' codeagent-wrapper --parallel <<'EOF'
---TASK--- ---TASK---
id: task1 id: task1
backend: codex
workdir: /path/to/dir workdir: /path/to/dir
---CONTENT--- ---CONTENT---
task content task content
@@ -117,6 +118,17 @@ dependent task
EOF EOF
``` ```
**Full output mode (for debugging):**
```bash
codeagent-wrapper --parallel --full-output <<'EOF'
...
EOF
```
**Output Modes:**
- **Summary (default)**: Structured report with changes, output, verification, and review summary.
- **Full (`--full-output`)**: Complete task messages. Use only when debugging specific failures.
**With per-task backend**: **With per-task backend**:
```bash ```bash
codeagent-wrapper --parallel <<'EOF' codeagent-wrapper --parallel <<'EOF'
@@ -147,9 +159,9 @@ Set `CODEAGENT_MAX_PARALLEL_WORKERS` to limit concurrent tasks (default: unlimit
## Environment Variables ## Environment Variables
- `CODEX_TIMEOUT`: Override timeout in milliseconds (default: 7200000 = 2 hours) - `CODEX_TIMEOUT`: Override timeout in milliseconds (default: 7200000 = 2 hours)
- `CODEAGENT_SKIP_PERMISSIONS`: Control permission checks - `CODEAGENT_SKIP_PERMISSIONS`: Control Claude CLI permission checks
- For **Claude** backend: Set to `true`/`1` to **disable** `--dangerously-skip-permissions` (default: enabled) - For **Claude** backend: Set to `true`/`1` to add `--dangerously-skip-permissions` (default: disabled)
- For **Codex/Gemini** backends: Set to `true`/`1` to enable permission skipping (default: disabled) - For **Codex/Gemini** backends: Currently has no effect
- `CODEAGENT_MAX_PARALLEL_WORKERS`: Limit concurrent tasks in parallel mode (default: unlimited, recommended: 8) - `CODEAGENT_MAX_PARALLEL_WORKERS`: Limit concurrent tasks in parallel mode (default: unlimited, recommended: 8)
## Invocation Pattern ## Invocation Pattern
@@ -182,9 +194,8 @@ Bash tool parameters:
## Security Best Practices ## Security Best Practices
- **Claude Backend**: Defaults to `--dangerously-skip-permissions` for automation workflows - **Claude Backend**: Permission checks enabled by default
- To enforce permission checks with Claude: Set `CODEAGENT_SKIP_PERMISSIONS=true` - To skip checks: set `CODEAGENT_SKIP_PERMISSIONS=true` or pass `--skip-permissions`
- **Codex/Gemini Backends**: Permission checks enabled by default
- **Concurrency Limits**: Set `CODEAGENT_MAX_PARALLEL_WORKERS` in production to prevent resource exhaustion - **Concurrency Limits**: Set `CODEAGENT_MAX_PARALLEL_WORKERS` in production to prevent resource exhaustion
- **Automation Context**: This wrapper is designed for AI-driven automation where permission prompts would block execution - **Automation Context**: This wrapper is designed for AI-driven automation where permission prompts would block execution