# CSV Schema: Project Documentation Workflow (Optimized)

Dynamic task decomposition with topological wave computation.

## tasks.csv (Master State)

### Column Definitions

| Column | Type | Required | Description | Example |
|--------|------|----------|-------------|---------|
| `id` | string | Yes | Task ID (doc-NNN, auto-generated) | `"doc-001"` |
| `title` | string | Yes | Document title | `"System Architecture Diagram"` |
| `description` | string | Yes | Detailed task description (self-contained) | `"Draw the system architecture diagram..."` |
| `doc_type` | enum | Yes | Document type | `"architecture"` |
| `target_scope` | string | Yes | File scope (glob pattern; comma-separated for multiple) | `"src/**"` |
| `doc_sections` | string | Yes | Required sections (comma-separated) | `"components,dependencies"` |
| `formula_support` | boolean | No | LaTeX formula support needed | `"true"` |
| `priority` | enum | No | Task priority | `"high"` |
| `deps` | string | No | Dependency task IDs (semicolon-separated) | `"doc-001;doc-002"` |
| `context_from` | string | No | Context source task IDs (semicolon-separated) | `"doc-001;doc-003"` |
| `wave` | integer | Computed | Wave number (computed by topological sort) | `1` |
| `status` | enum | Output | `pending` → `completed`/`failed`/`skipped` | `"completed"` |
| `findings` | string | Output | Key findings summary (max 500 chars) | `"Found 3 main components..."` |
| `doc_path` | string | Output | Generated document path | `"docs/02-architecture/system-architecture.md"` |
| `key_discoveries` | string | Output | Key discoveries (JSON array) | `"[{\"name\":\"...\",\"type\":\"...\"}]"` |
| `error` | string | Output | Error message if failed | `""` |

### doc_type Values

| Value | Typical Wave | Description |
|-------|--------------|-------------|
| `overview` | 1 | Project overview, tech stack, structure |
| `architecture` | 2 | System architecture, patterns, interactions |
| `implementation` | 3 | Algorithms, data structures, utilities |
| `theory` | 3 | Mathematical foundations, formulas (LaTeX) |
| `feature` | 4 | Feature documentation |
| `usage` | 4 | Usage guide, installation, configuration |
| `api` | 4 | API reference |
| `synthesis` | 5+ | Design philosophy, best practices, summary |

### priority Values

| Value | Description | Typical Use |
|-------|-------------|-------------|
| `high` | Essential document | overview, architecture |
| `medium` | Useful but optional | implementation details |
| `low` | Nice to have | extended examples |

---

## Dynamic Task Generation

### Task Count Guidelines

| Project Scale | File Count | Recommended Tasks | Waves |
|--------------|------------|-------------------|-------|
| Small | < 20 files | 5-8 tasks | 2-3 |
| Medium | 20-100 files | 10-15 tasks | 3-4 |
| Large | > 100 files | 15-25 tasks | 4-6 |

### Project Type → Task Templates

| Project Type | Essential Tasks | Optional Tasks |
|-------------|-----------------|----------------|
| **Library** | overview, api-reference, usage-guide | design-patterns, best-practices |
| **Application** | overview, architecture, feature-list, usage-guide | api-reference, deployment |
| **Service/API** | overview, architecture, api-reference | module-interactions, deployment |
| **CLI Tool** | overview, usage-guide, api-reference | architecture |
| **Numerical/Scientific** | overview, architecture, theoretical-foundations | algorithms, data-structures |

---

## Wave Computation (Topological Sort)

### Algorithm: Kahn's BFS

```
Input:  tasks with deps field
Output: tasks with wave field

1. Build adjacency list (dependency → dependents) from deps
2. Initialize in-degree for each task (number of deps)
3. Queue tasks with in-degree 0 (Wave 1)
4. While queue not empty:
   a. Current wave = all queued tasks
   b. For each task in the current wave, decrement its dependents' in-degree
   c. Queue tasks whose in-degree reaches 0 for the next wave
5. Assign wave numbers (the iteration in which each task was dequeued)
6. Detect cycles: if unassigned tasks remain → circular dependency
```

### Dependency Rules

| doc_type | Typical deps | Rationale |
|----------|--------------|-----------|
| `overview` | (none) | Foundation tasks |
| `architecture` | `overview` tasks | Needs project understanding |
| `implementation` | `architecture` tasks | Needs design context |
| `theory` | `overview` + `architecture` | Needs model understanding |
| `feature` | `implementation` tasks | Needs code knowledge |
| `api` | `implementation` tasks | Needs function signatures |
| `usage` | `feature` tasks | Needs feature knowledge |
| `synthesis` | Most other tasks | Integrates all findings |

---

## Example CSV (Small Project - 7 tasks, 5 waves)

```csv
id,title,description,doc_type,target_scope,doc_sections,formula_support,priority,deps,context_from,wave,status,findings,doc_path,key_discoveries,error
"doc-001","Project Overview","Write the project overview","overview","README.md,package.json","purpose,background,audience","false","high","","","1","pending","","","",""
"doc-002","Tech Stack","Analyze the tech stack","overview","package.json,tsconfig.json","languages,frameworks,dependencies","false","medium","","doc-001","1","pending","","","",""
"doc-003","System Architecture","Draw the architecture diagram","architecture","src/**","components,dependencies,dataflow","false","high","doc-001","doc-001;doc-002","2","pending","","","",""
"doc-004","Core Algorithms","Document the core algorithms","implementation","src/core/**","algorithms,complexity,examples","false","high","doc-003","doc-003","3","pending","","","",""
"doc-005","API Reference","API documentation","api","src/**/*.ts","endpoints,parameters,examples","false","high","doc-003","doc-003;doc-004","3","pending","","","",""
"doc-006","Usage Guide","Usage instructions","usage","README.md,examples/**","installation,configuration,running","false","high","doc-004;doc-005","doc-004;doc-005","4","pending","","","",""
"doc-007","Best Practices","Recommended usage","synthesis","src/**,examples/**","recommendations,pitfalls,examples","false","medium","doc-006","doc-004;doc-005;doc-006","5","pending","","","",""
```

### Computed Wave Distribution

| Wave | Tasks | Parallelism |
|------|-------|-------------|
| 1 | doc-001, doc-002 | 2 concurrent |
| 2 | doc-003 | 1 (sequential) |
| 3 | doc-004, doc-005 | 2 concurrent |
| 4 | doc-006 | 1 (sequential) |
| 5 | doc-007 | 1 (sequential) |

---

## Per-Wave CSV (Temporary)

Extra columns added by Wave Engine:

| Column | Type | Description |
|--------|------|-------------|
| `prev_context` | string | Aggregated findings + Wave Summary + Relevant Discoveries |

### prev_context Assembly

```javascript
// Assumes contextFrom is the task's context_from column split on ';',
// and tasksById (illustrative name) maps task ID → task row from tasks.csv.
const prevContext =
  // 1. Findings from context_from tasks
  contextFrom.map(id => tasksById.get(id).findings).join('\n\n') +
  // 2. Wave Summary (only when wave > 1)
  (wave > 1 ? '\n\n## Previous Wave Summary\n' + waveSummary : '') +
  // 3. Discoveries (filtered by relevance)
  '\n\n## Relevant Discoveries\n' + relevantDiscoveries;
```

---

## Output Schema (Agent Report)

```json
{
  "id": "doc-003",
  "status": "completed",
  "findings": "Identified 4 core components: Parser, Analyzer, Generator, Exporter. Data flows left-to-right with feedback loop for error recovery. Main entry point is src/index.ts.",
  "doc_path": "docs/02-architecture/system-architecture.md",
  "key_discoveries": "[{\"name\":\"Parser\",\"type\":\"component\",\"file\":\"src/parser/index.ts\",\"description\":\"Transforms input to AST\"}]",
  "error": ""
}
```

---

## Validation Rules

| Rule | Check | Error |
|------|-------|-------|
| Unique IDs | No duplicate `id` values | "Duplicate task ID: {id}" |
| Valid deps | All dep IDs exist in task list | "Unknown dependency: {dep_id}" |
| No self-deps | Task cannot depend on itself | "Self-dependency: {id}" |
| No cycles | Topological sort completes | "Circular dependency involving: {ids}" |
| Context valid | All context_from IDs in earlier or same wave | "Invalid context_from: {id}" |
| Valid doc_type | doc_type ∈ enum values | "Invalid doc_type: {type}" |
| Valid priority | priority ∈ {high,medium,low} | "Invalid priority: {priority}" |
| Status enum | status ∈ {pending,completed,failed,skipped} | "Invalid status: {status}" |

---

## Wave Summary Schema

Each wave generates a summary file: `wave-summaries/wave-{N}-summary.md`

```markdown
# Wave {N} Summary

**Completed Tasks**: {count}

## By Document Type

### {doc_type}

#### {task.title}
{task.findings (truncated to 300 chars)}

**Key Points**:
- {discovery.name}: {discovery.description}
...

## Context for Wave {N+1}

Next wave will focus on: {next_wave_task_titles}
```
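
---

The Kahn's BFS procedure from the Wave Computation section, together with the ID, dependency, and cycle checks from the Validation Rules table, can be sketched roughly as follows. This is a minimal illustration, not the Wave Engine's actual code: the function name `computeWaves` and the in-memory task shape (`{ id, deps }`, with `deps` as the semicolon-separated string from tasks.csv) are assumptions made for the example.

```javascript
// Minimal sketch: assign wave numbers with Kahn's algorithm, one BFS level per wave.
// Assumption: each task is { id, deps }, deps being the semicolon-separated
// string from tasks.csv ("" when the task has no dependencies).
function computeWaves(tasks) {
  const ids = new Set();
  for (const t of tasks) {
    if (ids.has(t.id)) throw new Error(`Duplicate task ID: ${t.id}`);
    ids.add(t.id);
  }

  const inDegree = new Map();   // task id -> number of unmet dependencies
  const dependents = new Map(); // task id -> ids of tasks that depend on it
  for (const id of ids) {
    inDegree.set(id, 0);
    dependents.set(id, []);
  }

  for (const t of tasks) {
    const deps = (t.deps || '').split(';').map(s => s.trim()).filter(Boolean);
    for (const dep of new Set(deps)) { // Set ignores repeated IDs in one deps cell
      if (dep === t.id) throw new Error(`Self-dependency: ${t.id}`);
      if (!ids.has(dep)) throw new Error(`Unknown dependency: ${dep}`);
      dependents.get(dep).push(t.id);
      inDegree.set(t.id, inDegree.get(t.id) + 1);
    }
  }

  const waveOf = new Map();
  let current = [...ids].filter(id => inDegree.get(id) === 0); // Wave 1
  let wave = 1;
  while (current.length > 0) {
    const next = [];
    for (const id of current) {
      waveOf.set(id, wave);
      for (const child of dependents.get(id)) {
        inDegree.set(child, inDegree.get(child) - 1);
        if (inDegree.get(child) === 0) next.push(child);
      }
    }
    current = next;
    wave += 1;
  }

  // Tasks never assigned a wave are on a cycle or downstream of one.
  if (waveOf.size !== ids.size) {
    const stuck = [...ids].filter(id => !waveOf.has(id));
    throw new Error(`Circular dependency involving: ${stuck.join(', ')}`);
  }
  return tasks.map(t => ({ ...t, wave: waveOf.get(t.id) }));
}

// Usage with the deps column from the example CSV.
const example = [
  { id: 'doc-001', deps: '' },
  { id: 'doc-002', deps: '' },
  { id: 'doc-003', deps: 'doc-001' },
  { id: 'doc-004', deps: 'doc-003' },
  { id: 'doc-005', deps: 'doc-003' },
  { id: 'doc-006', deps: 'doc-004;doc-005' },
  { id: 'doc-007', deps: 'doc-006' },
];
const waves = computeWaves(example).map(t => `${t.id}:${t.wave}`);
console.log(waves.join(' '));
// prints "doc-001:1 doc-002:1 doc-003:2 doc-004:3 doc-005:3 doc-006:4 doc-007:5"
```

Processing one whole level at a time, rather than one task at a time, is what makes the BFS depth equal to the wave number. The `context_from`, `doc_type`, `priority`, and `status` checks from the Validation Rules table are left out to keep the sketch short.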