feat(workflow): add lightweight interactive planning workflow with in-memory execution and code exploration

- Introduced `lite-plan` command for intelligent task analysis and planning.
- Implemented dynamic exploration and clarification phases based on task complexity.
- Added support for auto mode and forced exploration flags.
- Defined output artifacts and session structure for planning results.
- Enhanced execution process with context handoff to `lite-execute`.

chore(temp): create temporary memory content and import script

- Added `.temp-memory-content.txt` to store session details and execution plan.
- Implemented `temp-import-memory.cjs` to handle memory import using core-memory command.
- Ensured cleanup of temporary files after execution.
This commit is contained in:
catlog22
2026-02-27 11:43:44 +08:00
parent 07452e57b7
commit 4d755ff9b4
48 changed files with 5659 additions and 82 deletions

@@ -0,0 +1,163 @@
# Command: code-review
## Purpose
4-dimension code review analyzing quality, security, architecture, and requirements compliance. Produces a verdict (BLOCK/CONDITIONAL/APPROVE) with categorized findings.
## Phase 2: Context Loading
| Input | Source | Required |
|-------|--------|----------|
| Plan file | `<session_folder>/plan/plan.json` | Yes |
| Git diff | `git diff HEAD~1` or `git diff --cached` | Yes |
| Modified files | From `git diff --name-only` | Yes |
| Test results | Tester output (if available) | No |
| Wisdom | `<session_folder>/wisdom/` | No |
## Phase 3: 4-Dimension Review
### Dimension Overview
| Dimension | Focus | Weight |
|-----------|-------|--------|
| Quality | Code correctness, type safety, clean code | Equal |
| Security | Vulnerability patterns, secret exposure | Equal |
| Architecture | Module structure, coupling, file size | Equal |
| Requirements | Acceptance criteria coverage, completeness | Equal |
---
### Dimension 1: Quality
Scan each modified file for quality anti-patterns.
| Severity | Pattern | What to Detect |
|----------|---------|----------------|
| Critical | Empty catch blocks | `catch(e) {}` with no handling |
| High | @ts-ignore without justification | Suppression comment whose explanation is shorter than 10 characters |
| High | `any` type in public APIs | `any` outside comments and generic definitions |
| High | console.log in production | `console.(log\|debug\|info)` outside test files |
| Medium | Magic numbers | Numeric literals with more than one digit, outside `const` declarations and comments |
| Medium | Duplicate code | Identical lines (>30 chars) appearing 3+ times |
**Detection example** (Grep for console statements):
```bash
Grep(pattern="console\\.(log|debug|info)", path="<file_path>", output_mode="content", "-n"=true)
```
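**Detection sketch** (Python, illustrative only): the snippet below shows how two rows of the table above (empty catch blocks and console statements) could be checked line by line. The function name `scan_quality`, the `is_test_file` flag, and the exact regexes are assumptions made for this example, not part of the command; the check only sees single-line constructs.

```python
import re

# Hypothetical patterns mirroring two rows of the quality table.
EMPTY_CATCH = re.compile(r"catch\s*(\([^)]*\))?\s*\{\s*\}")
CONSOLE_CALL = re.compile(r"console\.(log|debug|info)\b")


def scan_quality(source: str, is_test_file: bool = False) -> list[tuple[str, int, str]]:
    """Return (severity, line_number, pattern_name) for each line-level hit."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        if EMPTY_CATCH.search(line):
            findings.append(("critical", number, "empty-catch"))
        # Console statements are only flagged outside test files.
        if not is_test_file and CONSOLE_CALL.search(line):
            findings.append(("high", number, "console-log"))
    return findings
```

A multi-line empty `catch` would need a scan over the whole file rather than per line.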
---
### Dimension 2: Security
Scan for vulnerability patterns across all modified files.
| Severity | Pattern | What to Detect |
|----------|---------|----------------|
| Critical | Hardcoded secrets | `api_key=`, `password=`, `secret=`, `token=` with string values (20+ chars) |
| Critical | SQL injection | String concatenation in `query()`/`execute()` calls |
| High | eval/exec usage | `eval()`, `new Function()`, `setTimeout(string)` |
| High | XSS vectors | `innerHTML`, `dangerouslySetInnerHTML` |
| Medium | Insecure random | `Math.random()` in security context (token/key/password/session) |
| Low | Missing input validation | Functions with parameters but no validation in first 5 lines |
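**Detection sketch** (Python, illustrative only): one possible reading of the hardcoded-secrets row, matching the four key names followed by `=` and a quoted string of 20 or more characters. The regex and the helper name `find_hardcoded_secrets` are assumptions for this example; a real scan would likely need more key names and assignment forms.

```python
import re

# Assumed pattern: key name, "=", then a quoted value of at least 20 characters.
SECRET = re.compile(
    r"(api_key|password|secret|token)\s*=\s*['\"]([^'\"]{20,})['\"]",
    re.IGNORECASE,
)


def find_hardcoded_secrets(source: str) -> list[tuple[int, str]]:
    """Return (line_number, key_name) for each suspicious assignment."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = SECRET.search(line)
        if match:
            hits.append((number, match.group(1).lower()))
    return hits
```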
---
### Dimension 3: Architecture
Assess structural health of modified files.
| Severity | Pattern | What to Detect |
|----------|---------|----------------|
| Critical | Circular dependencies | File A imports B, B imports A |
| High | Excessive parent imports | Import traverses >2 parent directories (`../../../`) |
| Medium | Large files | Files exceeding 500 lines |
| Medium | Tight coupling | >5 imports from same base module |
| Medium | Long functions | Functions exceeding 50 lines |
| Medium | Module boundary changes | Modifications to index.ts/index.js files |
**Detection example** (check for deep parent imports):
```bash
Grep(pattern="from\\s+['\"](\\.\\./){3,}", path="<file_path>", output_mode="content", "-n"=true)
```
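**Detection sketch** (Python, illustrative only): the circular-dependency row (File A imports B, B imports A) can be checked once an import graph is available. The graph shape (`dict` of module to imported modules) and the name `find_direct_cycles` are assumptions for this example; building the graph from source files is not shown, and only direct two-file cycles are reported.

```python
def find_direct_cycles(imports: dict[str, set[str]]) -> list[tuple[str, ...]]:
    """Return sorted pairs (a, b) where a imports b and b imports a."""
    cycles = set()
    for module, targets in imports.items():
        for target in targets:
            # A self-import is not counted as a two-file cycle here.
            if target != module and module in imports.get(target, set()):
                cycles.add(tuple(sorted((module, target))))
    return sorted(cycles)
```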
---
### Dimension 4: Requirements
Verify implementation against plan acceptance criteria.
| Severity | Check | Method |
|----------|-------|--------|
| High | Unmet acceptance criteria | Extract criteria from plan, check keyword overlap (unmet: below 40%; met: 70% or higher) |
| High | Missing error handling | Plan mentions "error handling" but no try/catch in code |
| Medium | Partially met criteria | Keyword overlap 40-69% |
| Medium | Missing tests | Plan mentions "test" but no test files in modified set |
**Verification flow**:
1. Read plan file → extract acceptance criteria section
2. For each criterion → extract keywords (4+ char meaningful words)
3. Search modified files for keyword matches
4. Score: >= 70% match = met, 40-69% = partial, < 40% = unmet
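**Scoring sketch** (Python, illustrative only): the four steps above could be approximated as follows. The helper names are hypothetical, "meaningful" is simplified to "any alphabetic word of 4+ characters" (no stopword list), keywords are matched as case-insensitive substrings of the modified code, and a criterion with no keywords is treated as unmet; all of these are assumptions of this example.

```python
import re


def extract_keywords(text: str) -> set[str]:
    """Lowercased alphabetic words with at least 4 characters."""
    return {word for word in re.findall(r"[a-z]+", text.lower()) if len(word) >= 4}


def classify_criterion(criterion: str, code_text: str) -> str:
    """Return 'met', 'partial', or 'unmet' using the 70% / 40% thresholds."""
    keywords = extract_keywords(criterion)
    if not keywords:
        return "unmet"  # assumption: nothing to match means manual follow-up
    haystack = code_text.lower()
    matched = sum(1 for keyword in keywords if keyword in haystack)
    ratio = matched / len(keywords)
    if ratio >= 0.70:
        return "met"
    if ratio >= 0.40:
        return "partial"
    return "unmet"
```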
---
### Verdict Routing
| Verdict | Criteria | Action |
|---------|----------|--------|
| BLOCK | Any critical-severity issues found | Must fix before merge |
| CONDITIONAL | High or medium issues, no critical | Should address, can merge with tracking |
| APPROVE | Only low issues or none | Ready to merge |
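**Routing sketch** (Python, illustrative only): the table above maps onto a small function over the severities of all findings. The name `route_verdict` and the plain-string severities are assumptions for this example.

```python
def route_verdict(severities: list[str]) -> str:
    """Map the severities of all findings to BLOCK, CONDITIONAL, or APPROVE."""
    levels = {severity.lower() for severity in severities}
    if "critical" in levels:
        return "BLOCK"
    if levels & {"high", "medium"}:
        return "CONDITIONAL"
    return "APPROVE"  # only low-severity findings, or none at all
```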
## Phase 4: Validation
### Report Format
The review report follows this structure:
```
# Code Review Report
**Verdict**: <BLOCK|CONDITIONAL|APPROVE>
## Blocking Issues (if BLOCK)
- **<type>** (<file>:<line>): <message>
## Review Dimensions
### Quality Issues
**CRITICAL** (<count>)
- <message> (<file>:<line>)
### Security Issues
(same format per severity)
### Architecture Issues
(same format per severity)
### Requirements Issues
(same format per severity)
## Recommendations
1. <actionable recommendation>
```
### Summary Counts
| Field | Description |
|-------|-------------|
| Total issues | Sum across all dimensions and severities |
| Critical count | Must be 0 for APPROVE |
| Blocking issues | Listed explicitly in report header |
| Dimensions covered | Must be 4/4 |
## Error Handling
| Scenario | Resolution |
|----------|------------|
| Plan file not found | Skip requirements dimension, note in report |
| Git diff empty | Report no changes to review |
| File read fails | Skip file, note in report |
| No modified files | Report empty review |

@@ -0,0 +1,202 @@
# Command: spec-quality
## Purpose
5-dimension spec quality check with weighted scoring, quality gate determination, and readiness report generation.
## Phase 2: Context Loading
| Input | Source | Required |
|-------|--------|----------|
| Spec documents | `<session_folder>/spec/` (all .md files) | Yes |
| Original requirements | Product brief objectives section | Yes |
| Quality gate config | specs/quality-gates.md | No |
| Session folder | Task description `Session:` field | Yes |
**Spec document phases** (matched by filename/directory):
| Phase | Expected Path | Required |
|-------|--------------|---------|
| product-brief | spec/product-brief.md | Yes |
| prd | spec/requirements/*.md | Yes |
| architecture | spec/architecture/_index.md + ADR-*.md | Yes |
| user-stories | spec/epics/*.md | Yes |
| implementation-plan | plan/plan.json | No (impl-only/full-lifecycle) |
| test-strategy | spec/test-strategy.md | No (optional, not generated by pipeline) |
## Phase 3: 5-Dimension Scoring
### Dimension Weights
| Dimension | Weight | Focus |
|-----------|--------|-------|
| Completeness | 25% | All required sections present with substance |
| Consistency | 20% | Terminology, format, references, naming |
| Traceability | 25% | Goals -> Reqs -> Components -> Stories chain |
| Depth | 20% | AC testable, ADRs justified, stories estimable |
| Coverage | 10% | Original requirements mapped to spec |
---
### Dimension 1: Completeness (25%)
Check each spec document for required sections.
**Required sections by phase**:
| Phase | Required Sections |
|-------|------------------|
| product-brief | Vision Statement, Problem Statement, Target Audience, Success Metrics, Constraints |
| prd | Goals, Requirements, User Stories, Acceptance Criteria, Non-Functional Requirements |
| architecture | System Overview, Component Design, Data Models, API Specifications, Technology Stack |
| user-stories | Story List, Acceptance Criteria, Priority, Estimation |
| implementation-plan | Task Breakdown, Dependencies, Timeline, Resource Allocation |
> **Note**: `test-strategy` is optional — skip scoring if `spec/test-strategy.md` is absent. Do not penalize completeness score for missing optional phases.
**Scoring formula**:
- Section present: 50% credit
- Section has substantial content (>100 chars beyond header): additional 50% credit
- Per-document score = (present_ratio * 50) + (substantial_ratio * 50)
- Overall = average across all documents
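**Scoring sketch** (Python, illustrative only): a minimal version of the formula above. It assumes both ratios are taken over the number of required sections and that section bodies are passed without their headers; the function names and the `sections` mapping (header to body text) are hypothetical.

```python
def document_score(sections: dict[str, str], required: list[str]) -> float:
    """Score one document: 50 points for presence, 50 for substance."""
    if not required:
        raise ValueError("required section list must not be empty")
    present = [name for name in required if name in sections]
    # "Substantial" means more than 100 characters of body text.
    substantial = [name for name in present if len(sections[name].strip()) > 100]
    present_ratio = len(present) / len(required)
    substantial_ratio = len(substantial) / len(required)
    return present_ratio * 50 + substantial_ratio * 50


def overall_completeness(document_scores: list[float]) -> float:
    """Average the per-document scores."""
    return sum(document_scores) / len(document_scores)
```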
---
### Dimension 2: Consistency (20%)
Check cross-document consistency across four areas.
| Area | What to Check | Severity |
|------|--------------|----------|
| Terminology | Same concept with different casing/spelling across docs | Medium |
| Format | Mixed header styles at same level across docs | Low |
| References | Broken links (`./` or `../` paths that don't resolve) | High |
| Naming | Mixed naming conventions (camelCase vs snake_case vs kebab-case) | Low |
**Scoring**:
- Penalty weights: High = 10, Medium = 5, Low = 2
- Score = max(0, 100 - total_penalty)
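**Scoring sketch** (Python, illustrative only): the score is 100 minus the summed penalties, floored at 0. The name `consistency_score` and the list-of-severities input are assumptions for this example.

```python
PENALTY = {"high": 10, "medium": 5, "low": 2}


def consistency_score(issue_severities: list[str]) -> int:
    """Subtract the penalty for each issue from 100, never going below 0."""
    total_penalty = sum(PENALTY[severity.lower()] for severity in issue_severities)
    return max(0, 100 - total_penalty)
```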
---
### Dimension 3: Traceability (25%)
Build and validate traceability chains: Goals -> Requirements -> Components -> Stories.
**Chain building flow**:
1. Extract goals from product-brief (pattern: `- Goal: <text>`)
2. Extract requirements from PRD (pattern: `- REQ-NNN: <text>`)
3. Extract components from architecture (pattern: `- Component: <text>`)
4. Extract stories from user-stories (pattern: `- US-NNN: <text>`)
5. Link by keyword overlap (threshold: 30% keyword match)
**Chain completeness**: A chain is complete when a goal links to at least one requirement, one component, and one story.
**Scoring**: (complete chains / total chains) * 100
**Weak link identification**: For each incomplete chain, report which link is missing (no requirements, no components, or no stories).
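**Chain sketch** (Python, illustrative only): once links have been resolved, chain completeness and weak links can be derived as below. The `Chain` dataclass and its field names are hypothetical, the keyword-overlap linking step is not shown, and returning 0 when there are no chains is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Chain:
    goal: str
    requirements: list[str] = field(default_factory=list)
    components: list[str] = field(default_factory=list)
    stories: list[str] = field(default_factory=list)

    def missing_links(self) -> list[str]:
        """Names of the link types this goal has no entries for."""
        return [
            name
            for name in ("requirements", "components", "stories")
            if not getattr(self, name)
        ]


def traceability_score(chains: list[Chain]) -> float:
    """(complete chains / total chains) * 100."""
    if not chains:
        return 0.0  # assumption: no goals means nothing is traceable
    complete = sum(1 for chain in chains if not chain.missing_links())
    return complete / len(chains) * 100
```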
---
### Dimension 4: Depth (20%)
Assess the analytical depth of spec content across four sub-dimensions.
| Sub-dimension | Source | Testable Criteria |
|---------------|--------|-------------------|
| AC Testability | PRD / User Stories | Contains measurable verbs (display, return, validate) or Given/When/Then or numbers |
| ADR Justification | Architecture | Contains rationale, alternatives, consequences, or trade-offs |
| Story Estimability | User Stories | Has "As a/I want/So that" + AC, or explicit estimate |
| Technical Detail | Architecture + Plan | Contains code blocks, API terms, HTTP methods, DB terms |
**Scoring**: Average of sub-dimension scores (each 0-100%)
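**Sub-dimension sketch** (Python, illustrative only): one way to score AC Testability is the share of acceptance criteria that contain a measurable verb, a Given/When/Then phrase, or a number. The regex covers only the three verbs listed in the table, and the name `ac_testability` is hypothetical; the other sub-dimensions would need their own heuristics.

```python
import re

# Assumed heuristic: listed verbs (optionally with "s"), Given/When/Then, or any digit.
TESTABLE = re.compile(
    r"\b(?:display|return|validate)s?\b|\bgiven\b.*\bwhen\b.*\bthen\b|\d",
    re.IGNORECASE,
)


def ac_testability(criteria: list[str]) -> float:
    """Percentage of acceptance criteria that look testable."""
    if not criteria:
        return 0.0
    testable = sum(1 for criterion in criteria if TESTABLE.search(criterion))
    return testable / len(criteria) * 100
```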
---
### Dimension 5: Coverage (10%)
Map original requirements to spec requirements.
**Flow**:
1. Extract original requirements from product-brief objectives section
2. Extract spec requirements from all documents (pattern: `- REQ-NNN:` or `- Requirement:` or `- Feature:`)
3. For each original requirement, check keyword overlap with any spec requirement (threshold: 40%)
4. Score = (covered_count / total_original) * 100
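**Coverage sketch** (Python, illustrative only): the flow above with word-level keyword overlap. The 4+ character keyword filter and the function names are assumptions for this example; the empty-input case follows the error-handling rule that missing original requirements score 100%.

```python
import re


def keywords(text: str) -> set[str]:
    """Lowercased alphanumeric words with at least 4 characters (assumed filter)."""
    return {word for word in re.findall(r"[a-z0-9]+", text.lower()) if len(word) >= 4}


def coverage_score(original: list[str], spec: list[str], threshold: float = 0.40) -> float:
    """(covered_count / total_original) * 100 with a 40% overlap threshold."""
    if not original:
        return 100.0  # nothing to cover
    spec_keywords = [keywords(requirement) for requirement in spec]
    covered = 0
    for requirement in original:
        wanted = keywords(requirement)
        if wanted and any(
            len(wanted & have) / len(wanted) >= threshold for have in spec_keywords
        ):
            covered += 1
    return covered / len(original) * 100
```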
---
### Quality Gate Decision Table
| Gate | Criteria | Message |
|------|----------|---------|
| PASS | Overall score >= 80% AND coverage >= 70% | Ready for implementation |
| FAIL | Overall score < 60% OR coverage < 50% | Major revisions required |
| REVIEW | All other cases | Improvements needed, may proceed with caution |
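**Gate sketch** (Python, illustrative only): the weighted overall score and the decision table can be combined as below. Weights are kept as integer percentages to avoid floating-point noise; the function names and the lowercase dimension keys are assumptions for this example.

```python
# Dimension weights in percent, as listed under "Dimension Weights".
WEIGHTS = {
    "completeness": 25,
    "consistency": 20,
    "traceability": 25,
    "depth": 20,
    "coverage": 10,
}


def overall_score(scores: dict[str, float]) -> float:
    """Weighted average of the five dimension scores (each 0-100)."""
    return sum(scores[name] * weight for name, weight in WEIGHTS.items()) / 100


def quality_gate(scores: dict[str, float]) -> str:
    """Apply the gate table: FAIL and PASS first, REVIEW for everything else."""
    overall = overall_score(scores)
    coverage = scores["coverage"]
    if overall < 60 or coverage < 50:
        return "FAIL"
    if overall >= 80 and coverage >= 70:
        return "PASS"
    return "REVIEW"
```

For example, a spec scoring 90% everywhere except 60% coverage has an overall score of 87% but lands in REVIEW, because PASS also requires coverage of at least 70%.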
## Phase 4: Validation
### Readiness Report Format
Write to `<session_folder>/spec/readiness-report.md`:
```
# Specification Readiness Report
**Generated**: <timestamp>
**Overall Score**: <score>%
**Quality Gate**: <PASS|REVIEW|FAIL> - <message>
**Recommended Action**: <action>
## Dimension Scores
| Dimension | Score | Weight | Weighted Score |
|-----------|-------|--------|----------------|
| Completeness | <n>% | 25% | <n>% |
| Consistency | <n>% | 20% | <n>% |
| Traceability | <n>% | 25% | <n>% |
| Depth | <n>% | 20% | <n>% |
| Coverage | <n>% | 10% | <n>% |
## Completeness Analysis
(per-phase breakdown: sections present/expected, missing sections)
## Consistency Analysis
(issues by area: terminology, format, references, naming)
## Traceability Analysis
(complete chains / total, weak links)
## Depth Analysis
(per sub-dimension scores)
## Requirement Coverage
(covered / total, uncovered requirements list)
```
### Spec Summary Format
Write to `<session_folder>/spec/spec-summary.md`:
```
# Specification Summary
**Overall Quality Score**: <score>%
**Quality Gate**: <gate>
## Documents Reviewed
(per document: phase, path, size, section list)
## Key Findings
### Strengths (dimensions scoring >= 80%)
### Areas for Improvement (dimensions scoring < 70%)
### Recommendations
```
## Error Handling
| Scenario | Resolution |
|----------|------------|
| Spec folder empty | FAIL gate, report no documents found |
| Missing phase document | Score 0 for that phase in completeness, note in report |
| No original requirements found | Score coverage at 100% (nothing to cover) |
| Broken references | Flag in consistency, do not fail entire review |

@@ -0,0 +1,133 @@
# Role: reviewer
Dual-mode review: code review (REVIEW-*) and spec quality validation (QUALITY-*). QUALITY tasks include inline discuss (DISCUSS-006) for final sign-off.
## Identity
- **Name**: `reviewer` | **Prefix**: `REVIEW-*` + `QUALITY-*` | **Tag**: `[reviewer]`
- **Responsibility**: Branch by Prefix -> Review/Score -> **Inline Discuss (QUALITY only)** -> Report
## Boundaries
### MUST
- Process REVIEW-* and QUALITY-* tasks
- Generate readiness-report.md for QUALITY tasks
- Cover all required dimensions per mode
- Call discuss subagent for DISCUSS-006 after QUALITY-001
### MUST NOT
- Create tasks
- Modify source code
- Skip quality dimensions
- Approve without verification
## Message Types
| Type | Direction | Trigger |
|------|-----------|---------|
| review_result | -> coordinator | Code review complete |
| quality_result | -> coordinator | Spec quality + discuss complete |
| fix_required | -> coordinator | Critical issues found |
## Toolbox
| Tool | Purpose |
|------|---------|
| commands/code-review.md | 4-dimension code review |
| commands/spec-quality.md | 5-dimension spec quality |
| discuss subagent | Inline DISCUSS-006 (QUALITY tasks only) |
---
## Mode Detection
| Task Prefix | Mode | Dimensions | Inline Discuss |
|-------------|------|-----------|---------------|
| REVIEW-* | Code Review | quality, security, architecture, requirements | None |
| QUALITY-* | Spec Quality | completeness, consistency, traceability, depth, coverage | DISCUSS-006 |
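**Mode detection sketch** (Python, illustrative only): the prefix table above amounts to a simple dispatch. The function name and the returned mode strings are assumptions for this example; raising on an unknown prefix follows the "Invalid mode: abort with error" rule under Error Handling.

```python
def detect_mode(task_id: str) -> str:
    """Pick the review mode from the task prefix."""
    if task_id.startswith("REVIEW-"):
        return "code-review"
    if task_id.startswith("QUALITY-"):
        return "spec-quality"
    raise ValueError(f"Invalid mode for task: {task_id}")
```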
---
## Code Review (REVIEW-*)
**Inputs**: Plan file, git diff, modified files, test results (if available)
**4 dimensions** (delegate to commands/code-review.md):
| Dimension | Key Issues |
|-----------|----------------|
| Quality | Empty catch, any in public APIs, @ts-ignore, console.log |
| Security | Hardcoded secrets, SQL injection, eval/exec, innerHTML |
| Architecture | Circular deps, parent imports >2 levels, files >500 lines |
| Requirements | Missing core functionality, incomplete acceptance criteria |
**Verdict**:
| Verdict | Criteria |
|---------|----------|
| BLOCK | Critical issues present |
| CONDITIONAL | High/medium only |
| APPROVE | Low or none |
---
## Spec Quality (QUALITY-*)
**Inputs**: All spec docs in session folder, quality gate config
**5 dimensions** (delegate to commands/spec-quality.md):
| Dimension | Weight | Focus |
|-----------|--------|-------|
| Completeness | 25% | All sections present with substance |
| Consistency | 20% | Terminology, format, references |
| Traceability | 25% | Goals -> Reqs -> Arch -> Stories chain |
| Depth | 20% | AC testable, ADRs justified, stories estimable |
| Coverage | 10% | Original requirements mapped |
**Quality gate**:
| Gate | Criteria |
|------|----------|
| PASS | Score >= 80% AND coverage >= 70% |
| REVIEW | All other cases (neither PASS nor FAIL) |
| FAIL | Score < 60% OR coverage < 50% |
**Artifacts**: readiness-report.md + spec-summary.md
### Inline Discuss (DISCUSS-006) -- QUALITY tasks only
After generating readiness-report.md, call discuss subagent for final sign-off:
```
Task({
subagent_type: "cli-discuss-agent",
run_in_background: false,
description: "Discuss DISCUSS-006",
prompt: `## Multi-Perspective Critique: DISCUSS-006
### Input
- Artifact: <session-folder>/spec/readiness-report.md
- Round: DISCUSS-006
- Perspectives: product, technical, quality, risk, coverage
- Session: <session-folder>
- Discovery Context: <session-folder>/spec/discovery-context.json
<rest of discuss subagent prompt from subagents/discuss-subagent.md>`
})
```
**Discuss result handling**:
- `consensus_reached` -> include in quality report as final endorsement
- `consensus_blocked` -> flag in SendMessage, report specific divergences
---
## Error Handling
| Scenario | Resolution |
|----------|------------|
| Missing context | Request from coordinator |
| Invalid mode | Abort with error |
| Analysis failure | Retry, then use the fallback template |
| Discuss subagent fails | Proceed without final discuss, log warning |