feat: Implement DeepWiki documentation generation tools

- Added `__init__.py` in `codexlens/tools` for documentation generation.
- Created `deepwiki_generator.py` to handle symbol extraction and markdown generation.
- Introduced `MockMarkdownGenerator` for testing purposes.
- Implemented `DeepWikiGenerator` class for managing documentation generation and file processing.
- Added unit tests for `DeepWikiStore` to ensure proper functionality and error handling.
- Created tests for DeepWiki TypeScript type matching.
catlog22
2026-03-05 18:30:56 +08:00
parent 0bfae3fd1a
commit fb4f6e718e
62 changed files with 7500 additions and 68 deletions


@@ -185,6 +185,17 @@ if (!autoMode) {
header: "Focus",
multiSelect: true,
options: seedAnalysis.dimensions.map(d => ({ label: d, description: `Explore ${d} in depth` }))
},
{
question: "What type of specification is this?",
header: "Spec Type",
multiSelect: false,
options: [
{ label: "Service (Recommended)", description: "Long-running service with lifecycle, state machine, observability" },
{ label: "API", description: "REST/GraphQL API with endpoints, auth, rate limiting" },
{ label: "Library/SDK", description: "Reusable package with public API surface, examples" },
{ label: "Platform", description: "Multi-component system, uses Service profile" }
]
}
]
});
@@ -192,6 +203,7 @@ if (!autoMode) {
// Auto mode defaults
depth = "standard";
focusAreas = seedAnalysis.dimensions;
specType = "service"; // default for auto mode
}
```
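The translation from the selected option label (e.g. "Service (Recommended)") to the `spec_type` slug stored in config is implied but not shown. A minimal sketch, with the mapping assumed from the option list above:

```javascript
// Map an AskUserQuestion option label to the spec_type slug persisted in
// spec-config.json. Labels mirror the "Spec Type" options defined above;
// the mapping itself is an assumption, not shown in the original flow.
function labelToSpecType(label) {
  const map = {
    "Service (Recommended)": "service",
    "API": "api",
    "Library/SDK": "library",
    "Platform": "platform", // Platform reuses the Service profile downstream
  };
  return map[label] || "service"; // fall back to the auto-mode default
}
```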
@@ -209,6 +221,9 @@ const specConfig = {
focus_areas: focusAreas,
seed_analysis: seedAnalysis,
has_codebase: hasCodebase,
spec_type: specType, // "service" | "api" | "library" | "platform"
iteration_count: 0,
iteration_history: [],
phasesCompleted: [
{
phase: 1,


@@ -227,6 +227,33 @@ specConfig.phasesCompleted.push({
Write(`${workDir}/spec-config.json`, JSON.stringify(specConfig, null, 2));
```
### Step 5.5: Generate glossary.json
```javascript
// Extract terminology from product brief and CLI analysis
// Generate structured glossary for cross-document consistency
const glossary = {
session_id: specConfig.session_id,
terms: [
// Extract from product brief content:
// - Key domain nouns from problem statement
// - User persona names
// - Technical terms from multi-perspective synthesis
// Each term should have:
// { term: "...", definition: "...", aliases: [], first_defined_in: "product-brief.md", category: "core|technical|business" }
]
};
Write(`${workDir}/glossary.json`, JSON.stringify(glossary, null, 2));
```
**Glossary Injection**: In all subsequent phase prompts, inject the following into the CONTEXT section:
```
TERMINOLOGY GLOSSARY (use these terms consistently):
${JSON.stringify(glossary.terms, null, 2)}
```
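The injection snippet above can be produced by a small helper so every phase builds it identically. A sketch, assuming the term shape defined in Step 5.5 (the helper name is hypothetical):

```javascript
// Build the CONTEXT injection block from glossary.json.
// Term shape per Step 5.5: { term, definition, aliases, first_defined_in, category }.
// Returns "" when there is no glossary, so callers can interpolate unconditionally.
function buildGlossaryInjection(glossary) {
  if (!glossary || !glossary.terms || glossary.terms.length === 0) return "";
  return [
    "TERMINOLOGY GLOSSARY (use these terms consistently):",
    JSON.stringify(glossary.terms, null, 2),
  ].join("\n");
}
```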
## Output
- **File**: `product-brief.md`


@@ -60,6 +60,11 @@ TASK:
- Should: Important but has workaround
- Could: Nice-to-have, enhances experience
- Won't: Explicitly deferred
- Use RFC 2119 keywords (MUST, SHOULD, MAY, MUST NOT, SHOULD NOT) to define behavioral constraints for each requirement. Example: 'The system MUST return a 401 response within 100ms for invalid tokens.'
- For each core domain entity referenced in requirements, define its data model: fields, types, constraints, and relationships to other entities
- Maintain terminology consistency with the glossary below:
TERMINOLOGY GLOSSARY:
\${glossary ? JSON.stringify(glossary.terms, null, 2) : 'N/A - generate terms inline'}
MODE: analysis
EXPECTED: Structured requirements with: ID, title, description, user story, acceptance criteria, priority, traceability to goals


@@ -33,6 +33,19 @@ if (specConfig.has_codebase) {
discoveryContext = JSON.parse(Read(`${workDir}/discovery-context.json`));
} catch (e) { /* no context */ }
}
// Load glossary for terminology consistency
let glossary = null;
try {
glossary = JSON.parse(Read(`${workDir}/glossary.json`));
} catch (e) { /* proceed without */ }
// Load spec type profile for specialized sections
const specType = specConfig.spec_type || 'service';
let profile = null;
try {
profile = Read(`templates/profiles/${specType}-profile.md`);
} catch (e) { /* use base template only */ }
```
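The try/catch-with-fallback pattern repeats for every optional input (discovery context, glossary, profile). It can be factored into one helper; a sketch in which `read` stands in for the host `Read()` tool API (the helper name is hypothetical):

```javascript
// Generic optional-load helper: return a parsed value, or a fallback instead
// of throwing, for inputs that may legitimately be absent.
function tryLoad(read, path, { json = false, fallback = null } = {}) {
  try {
    const raw = read(path);
    return json ? JSON.parse(raw) : raw;
  } catch (e) {
    return fallback; // missing file or parse error: proceed without it
  }
}
// Usage mirroring Step 1:
// const glossary = tryLoad(Read, `${workDir}/glossary.json`, { json: true });
// const profile  = tryLoad(Read, `templates/profiles/${specType}-profile.md`);
```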
### Step 2: Architecture Analysis via Gemini CLI
@@ -66,6 +79,28 @@ TASK:
- Identify security architecture: auth, authorization, data protection
- List API endpoints (high-level)
${discoveryContext ? '- Map new components to existing codebase modules' : ''}
- For each core entity with a lifecycle, create an ASCII state machine diagram showing:
- All states and transitions
- Trigger events for each transition
- Side effects of transitions
- Error states and recovery paths
- Define a Configuration Model: list all configurable fields with name, type, default value, constraint, and description
- Define Error Handling strategy:
- Classify errors (transient/permanent/degraded)
- Per-component error behavior using RFC 2119 keywords
- Recovery mechanisms
- Define Observability requirements:
- Key metrics (name, type: counter/gauge/histogram, labels)
- Structured log format and key log events
- Health check endpoints
\${profile ? \`
SPEC TYPE PROFILE REQUIREMENTS (\${specType}):
\${profile}
\` : ''}
\${glossary ? \`
TERMINOLOGY GLOSSARY (use consistently):
\${JSON.stringify(glossary.terms, null, 2)}
\` : ''}
MODE: analysis
EXPECTED: Complete architecture with: style justification, component diagram, tech stack table, ADRs, data model, security controls, API overview


@@ -26,6 +26,11 @@ const specConfig = JSON.parse(Read(`${workDir}/spec-config.json`));
const productBrief = Read(`${workDir}/product-brief.md`);
const requirements = Read(`${workDir}/requirements.md`);
const architecture = Read(`${workDir}/architecture.md`);
let glossary = null;
try {
glossary = JSON.parse(Read(`${workDir}/glossary.json`));
} catch (e) { /* proceed without */ }
```
### Step 2: Epic Decomposition via Gemini CLI
@@ -69,10 +74,11 @@ TASK:
MODE: analysis
EXPECTED: Structured output with: Epic list (ID, title, priority, MVP flag), Stories per Epic (ID, user story, AC, size, trace), dependency Mermaid diagram, execution order, MVP definition
CONSTRAINTS:
- Every Must-have requirement must appear in at least one Story
- Stories must be small enough to implement independently (no XL stories in MVP)
- Dependencies should be minimized across Epics
\${glossary ? \`- Maintain terminology consistency with glossary: \${glossary.terms.map(t => t.term).join(', ')}\` : ''}
" --tool gemini --mode analysis`,
run_in_background: true
});


@@ -0,0 +1,144 @@
# Phase 6.5: Auto-Fix
Automatically repair specification issues identified in Phase 6 Readiness Check.
## Objective
- Parse `readiness-report.md` to extract Error and Warning items
- Group issues by originating Phase (2-5)
- Re-generate affected sections with error context injected into CLI prompts
- Re-run Phase 6 validation after fixes
## Input
- Dependency: `{workDir}/readiness-report.md` (Phase 6 output)
- Config: `{workDir}/spec-config.json` (with iteration_count)
- All Phase 2-5 outputs
## Execution Steps
### Step 1: Parse Readiness Report
```javascript
const readinessReport = Read(`${workDir}/readiness-report.md`);
const specConfig = JSON.parse(Read(`${workDir}/spec-config.json`));
// Load glossary for terminology consistency during fixes
let glossary = null;
try {
glossary = JSON.parse(Read(`${workDir}/glossary.json`));
} catch (e) { /* proceed without */ }
// Extract issues from readiness report
// Parse Error and Warning severity items
// Group by originating phase:
// Phase 2 issues: vision, problem statement, scope, personas
// Phase 3 issues: requirements, acceptance criteria, priority, traceability
// Phase 4 issues: architecture, ADRs, tech stack, data model, state machine
// Phase 5 issues: epics, stories, dependencies, MVP scope
const issuesByPhase = {
2: [], // product brief issues
3: [], // requirements issues
4: [], // architecture issues
5: [] // epics issues
};
// Parse structured issues from report
// Each issue: { severity: "Error"|"Warning", description: "...", location: "file:section" }
// Map phase numbers to output files
const phaseOutputFile = {
2: 'product-brief.md',
3: 'requirements/_index.md',
4: 'architecture/_index.md',
5: 'epics/_index.md'
};
```
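The parsing step above is left as comments. One possible sketch, assuming issues appear in the report as lines like `- [Error] description (at file:section)` under a `Phase N` heading — both assumptions about the report format, to be adapted to the actual `readiness-report.md` layout:

```javascript
// Parse "[Error|Warning] ... (at file:section)" lines from the readiness
// report and group them by originating phase (2-5). The line format is an
// assumption; adjust the regexes to the real report layout.
function parseIssuesByPhase(report) {
  const issuesByPhase = { 2: [], 3: [], 4: [], 5: [] };
  let currentPhase = null;
  for (const line of report.split("\n")) {
    const phaseMatch = line.match(/Phase\s+([2-5])\b/);
    if (phaseMatch) currentPhase = Number(phaseMatch[1]);
    const issueMatch = line.match(/\[(Error|Warning)\]\s+(.+?)\s+\(at\s+([^)]+)\)/);
    if (issueMatch && currentPhase) {
      issuesByPhase[currentPhase].push({
        severity: issueMatch[1],
        description: issueMatch[2],
        location: issueMatch[3],
      });
    }
  }
  return issuesByPhase;
}
```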
### Step 2: Fix Affected Phases (Sequential)
For each phase with issues (in order 2 -> 3 -> 4 -> 5):
```javascript
for (const [phase, issues] of Object.entries(issuesByPhase)) {
if (issues.length === 0) continue;
const errorContext = issues.map(i => `[${i.severity}] ${i.description} (at ${i.location})`).join('\n');
// Read current phase output
const currentOutput = Read(`${workDir}/${phaseOutputFile[phase]}`);
Bash({
command: `ccw cli -p "PURPOSE: Fix specification issues identified in readiness check for Phase ${phase}.
Success: All listed issues resolved while maintaining consistency with other documents.
CURRENT DOCUMENT:
${currentOutput.slice(0, 5000)}
ISSUES TO FIX:
${errorContext}
${glossary ? `GLOSSARY (maintain consistency):
${JSON.stringify(glossary.terms, null, 2)}` : ''}
TASK:
- Address each listed issue specifically
- Maintain all existing content that is not flagged
- Ensure terminology consistency with glossary
- Preserve YAML frontmatter and cross-references
- Use RFC 2119 keywords for behavioral requirements
- Increment document version number
MODE: analysis
EXPECTED: Corrected document content addressing all listed issues
CONSTRAINTS: Minimal changes - only fix flagged issues, do not restructure unflagged sections
" --tool gemini --mode analysis`,
run_in_background: true
});
// Wait for result, apply fixes to document
// Update document version in frontmatter
}
```
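The "increment document version" step can be a small frontmatter rewrite. A sketch, assuming a `version: major.minor` field in the YAML frontmatter (the frontmatter schema itself is an assumption):

```javascript
// Bump the minor version in a document's YAML frontmatter,
// e.g. "version: 1.2" -> "version: 1.3". Assumes a plain
// "version: major.minor" field on its own line.
function bumpFrontmatterVersion(doc) {
  return doc.replace(/^version:\s*(\d+)\.(\d+)\s*$/m, (_, major, minor) =>
    `version: ${major}.${Number(minor) + 1}`
  );
}
```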
### Step 3: Update State
```javascript
specConfig.phasesCompleted.push({
phase: 6.5,
name: "auto-fix",
iteration: specConfig.iteration_count,
phases_fixed: Object.keys(issuesByPhase).filter(p => issuesByPhase[p].length > 0),
completed_at: new Date().toISOString()
});
Write(`${workDir}/spec-config.json`, JSON.stringify(specConfig, null, 2));
```
### Step 4: Re-run Phase 6 Validation
```javascript
// Re-execute Phase 6: Readiness Check
// This creates a new readiness-report.md
// If still Fail and iteration_count < 2: loop back to Step 1
// If Pass or iteration_count >= 2: proceed to handoff
```
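The re-validation loop described above can be sketched as explicit control flow, with `runReadinessCheck` and `runAutoFix` as hypothetical stand-ins for Phase 6 and Phase 6.5:

```javascript
// Fix/validate loop with a hard cap of 2 iterations, mirroring Step 4:
// keep fixing while the readiness verdict is "Fail" and the cap is not hit.
function iterateUntilReady(runReadinessCheck, runAutoFix, maxIterations = 2) {
  let iteration = 0;
  let report = runReadinessCheck();           // initial Phase 6 result
  while (report.verdict === "Fail" && iteration < maxIterations) {
    iteration += 1;
    runAutoFix(report);                       // Phase 6.5 on the failing report
    report = runReadinessCheck();             // fresh readiness-report.md
  }
  return { report, iteration };               // Pass, or Fail with cap reached
}
```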
## Output
- **Updated**: Phase 2-5 documents (only affected ones)
- **Updated**: `spec-config.json` (iteration tracking)
- **Triggers**: Phase 6 re-validation
## Quality Checklist
- [ ] All Error-severity issues addressed
- [ ] Warning-severity issues attempted (best effort)
- [ ] Document versions incremented for modified files
- [ ] Terminology consistency maintained
- [ ] Cross-references still valid after fixes
- [ ] Iteration count not exceeded (max 2)
## Next Phase
Re-run [Phase 6: Readiness Check](06-readiness-check.md) to validate fixes.


@@ -70,8 +70,12 @@ Perform 4-dimension validation:
2. CONSISTENCY (25%):
- Terminology uniform across documents?
- Terminology glossary compliance: all core terms used consistently per glossary.json definitions?
- No synonym drift (e.g., "user" vs "client" vs "consumer" for same concept)?
- User personas consistent?
- Scope consistent (PRD does not exceed brief)?
- Scope containment: PRD requirements do not exceed product brief's defined scope?
- Non-Goals respected: no requirement or story contradicts explicit Non-Goals?
- Tech stack references match between architecture and epics?
- Score 0-100 with inconsistencies listed
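The synonym-drift check can be mechanized against the `aliases` field in `glossary.json`; a sketch (the drift heuristic — flagging any alias occurrence where the canonical term is expected — is an assumption):

```javascript
// Flag occurrences of a term's aliases in a document: per the glossary,
// documents should use the canonical term only, so any alias hit is
// reported as potential synonym drift.
function findSynonymDrift(docText, glossaryTerms) {
  const drift = [];
  for (const { term, aliases = [] } of glossaryTerms) {
    for (const alias of aliases) {
      const aliasRe = new RegExp(`\\b${alias}\\b`, "i");
      if (aliasRe.test(docText)) {
        drift.push({ term, alias }); // alias used where canonical term expected
      }
    }
  }
  return drift;
}
```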
@@ -223,6 +227,10 @@ AskUserQuestion({
{
label: "Create Issues",
description: "Generate issues for each Epic via /issue:new"
},
{
label: "Iterate & improve",
description: "Re-run failed phases based on readiness report issues (max 2 iterations)"
}
]
}
@@ -386,6 +394,29 @@ if (selection === "Create Issues") {
}
// If user selects "Other": Export only or return to specific phase
if (selection === "Iterate & improve") {
// Check iteration count
if (specConfig.iteration_count >= 2) {
// Max iterations reached, force handoff
// Present handoff options again without iterate
return;
}
// Update iteration tracking
specConfig.iteration_count = (specConfig.iteration_count || 0) + 1;
specConfig.iteration_history.push({
iteration: specConfig.iteration_count,
timestamp: new Date().toISOString(),
readiness_score: overallScore,
errors_found: errorCount,
phases_to_fix: affectedPhases
});
Write(`${workDir}/spec-config.json`, JSON.stringify(specConfig, null, 2));
// Proceed to Phase 6.5: Auto-Fix
// Read phases/06-5-auto-fix.md and execute
}
```
#### Helper Functions Reference (pseudocode)