# Phase 1: Code Discovery & Scoping

## Objective

Discover and categorize all code files within the specified scope, preparing them for security analysis and best practices review.

## Input

- **User Arguments**:
  - `--scope`: Directory or file patterns (default: entire project)
  - `--languages`: Specific languages to review (e.g., typescript, python, java)
  - `--exclude`: Patterns to exclude (e.g., test files, node_modules)
- **Configuration**: `.code-reviewer.json` (if it exists)

## Process

### Step 1: Load Configuration

```javascript
// Check for project-level configuration
const configPath = path.join(projectRoot, '.code-reviewer.json');
const config = fileExists(configPath)
  ? JSON.parse(readFile(configPath))
  : getDefaultConfig();

// Merge user arguments with config (user arguments take precedence)
const scope = args.scope || config.scope.include;
const exclude = args.exclude || config.scope.exclude;
const languages = args.languages || config.languages || 'auto';
```

### Step 2: Discover Files

Use MCP tools for efficient file discovery:

```javascript
// Use smart_search for file discovery
const files = await mcp__ccw_tools__smart_search({
  action: "find_files",
  pattern: scope,
  includeHidden: false
});

// Apply exclusion patterns
// Declared with `let` because the large-codebase handler may narrow the list later
let filteredFiles = files.filter(file => {
  return !exclude.some(pattern => minimatch(file, pattern));
});
```

### Step 3: Categorize Files

Categorize files by:

- **Language/Framework**: TypeScript, Python, Java, Go, etc.
- **File Type**: Source, config, test, build
- **Priority**: Critical (auth, payment), High (API), Medium (utils), Low (docs)

```javascript
const inventory = {
  critical: {
    auth: ['src/auth/login.ts', 'src/auth/jwt.ts'],
    payment: ['src/payment/stripe.ts'],
  },
  high: {
    api: ['src/api/users.ts', 'src/api/orders.ts'],
    database: ['src/db/queries.ts'],
  },
  medium: {
    utils: ['src/utils/validator.ts'],
    services: ['src/services/*.ts'],
  },
  low: {
    types: ['src/types/*.ts'],
  }
};
```

### Step 4: Extract Metadata

For each file that survived the exclusion filter, extract:

- **Lines of Code (LOC)**
- **Complexity Indicators**: Function count, class count
- **Dependencies**: Import statements
- **Framework Detection**: Express, React, Django, etc.

```javascript
// Map over filteredFiles (not files) so excluded paths never reach the inventory
const metadata = filteredFiles.map(file => ({
  path: file,
  language: detectLanguage(file),
  loc: countLines(file),
  complexity: estimateComplexity(file),
  framework: detectFramework(file),
  priority: categorizePriority(file)
}));
```

## Output

### File Inventory

Save to `.code-review/inventory.json`:

```json
{
  "scan_date": "2024-01-15T10:30:00Z",
  "total_files": 247,
  "by_language": {
    "typescript": 185,
    "python": 42,
    "javascript": 15,
    "go": 5
  },
  "by_priority": {
    "critical": 12,
    "high": 45,
    "medium": 120,
    "low": 70
  },
  "files": [
    {
      "path": "src/auth/login.ts",
      "language": "typescript",
      "loc": 245,
      "functions": 8,
      "classes": 2,
      "priority": "critical",
      "framework": "express",
      "dependencies": ["bcrypt", "jsonwebtoken", "express"]
    }
  ]
}
```

### Summary Report

```markdown
## Code Discovery Summary

**Scope**: src/**/*
**Total Files**: 247
**Languages**: TypeScript (75%), Python (17%), JavaScript (6%), Go (2%)

### Priority Distribution

- Critical: 12 files (authentication, payment processing)
- High: 45 files (API endpoints, database queries)
- Medium: 120 files (utilities, services)
- Low: 70 files (types, configs)

### Key Areas Identified

1. **Authentication Module** (src/auth/) - 12 files, 2,400 LOC
2. **Payment Processing** (src/payment/) - 5 files, 1,200 LOC
3. **API Layer** (src/api/) - 35 files, 5,600 LOC
4. **Database Layer** (src/db/) - 8 files, 1,800 LOC

**Next Phase**: Security Analysis on Critical + High priority files
```

## State Management

Save the phase state so the workflow can be resumed:

```json
{
  "phase": "01-code-discovery",
  "status": "completed",
  "timestamp": "2024-01-15T10:35:00Z",
  "output": {
    "inventory_path": ".code-review/inventory.json",
    "total_files": 247,
    "critical_files": 12,
    "high_files": 45
  }
}
```

## Agent Instructions

```markdown
You are in Phase 1 of the Code Review workflow. Your task is to discover and categorize code files.

**Instructions**:
1. Use mcp__ccw_tools__smart_search with action="find_files" to discover files
2. Apply exclusion patterns from config or arguments
3. Categorize files by language, type, and priority
4. Extract basic metadata (LOC, complexity indicators)
5. Save inventory to .code-review/inventory.json
6. Generate summary report
7. Proceed to Phase 2 with critical + high priority files

**Tools Available**:
- mcp__ccw_tools__smart_search (file discovery)
- Read (read configuration and sample files)
- Write (save inventory and reports)

**Output Requirements**:
- inventory.json with complete file list and metadata
- Summary markdown report
- State file for phase tracking
```

## Error Handling

### No Files Found

```javascript
if (filteredFiles.length === 0) {
  throw new Error(`No files found matching scope: ${scope}

Suggestions:
- Check if scope pattern is correct
- Verify exclude patterns are not too broad
- Ensure project has code files in specified scope
`);
}
```

### Large Codebase

```javascript
if (filteredFiles.length > 1000) {
  console.warn(`⚠️ Large codebase detected (${filteredFiles.length} files)`);
  console.log('Consider using --scope to review in batches');

  // Offer to focus on critical/high priority only
  const answer = await askUser("Review critical/high priority files only?");
  if (answer === 'yes') {
    // Entries are plain paths at this point, so classify each one before filtering
    filteredFiles = filteredFiles.filter(f => {
      const priority = categorizePriority(f);
      return priority === 'critical' || priority === 'high';
    });
  }
}
```

## Validation

Before proceeding to Phase 2:

- ✅ Inventory file created
- ✅ At least one file categorized as critical or high priority
- ✅ Metadata extracted for all files
- ✅ Summary report generated
- ✅ State saved for resume capability

## Next Phase

**Phase 2: Security Analysis** - Analyze critical and high priority files for security vulnerabilities using OWASP Top 10 and CWE Top 25 checks.
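
The workflow above calls `detectLanguage` and `categorizePriority` without defining them. As a rough illustration only, the sketch below shows one way such helpers might look, along with a hypothetical `summarize` function that produces the `by_language` and `by_priority` counts used in the inventory. The extension map and the directory-based priority rules are assumptions made for this example, not part of the workflow specification; a real project would likely tune them or load them from `.code-reviewer.json`.

```javascript
// Illustrative sketch: the rules below are assumptions, not the workflow's actual logic.
const path = require('path');

// Extension-to-language map (a small, hypothetical subset).
const LANGUAGE_BY_EXTENSION = {
  '.ts': 'typescript',
  '.tsx': 'typescript',
  '.js': 'javascript',
  '.jsx': 'javascript',
  '.py': 'python',
  '.java': 'java',
  '.go': 'go',
};

// Ordered rules: the first matching pattern wins, so critical areas are checked first.
const PRIORITY_RULES = [
  { priority: 'critical', pattern: /(^|\/)(auth|payment)\// },
  { priority: 'high', pattern: /(^|\/)(api|db)\// },
  { priority: 'medium', pattern: /(^|\/)(utils|services)\// },
];

// Infer the language from the file extension; unknown extensions map to 'unknown'.
function detectLanguage(file) {
  return LANGUAGE_BY_EXTENSION[path.extname(file).toLowerCase()] || 'unknown';
}

// Classify a path by the directory it lives in; anything unmatched is 'low'.
function categorizePriority(file) {
  const normalized = file.replace(/\\/g, '/'); // tolerate Windows separators
  const rule = PRIORITY_RULES.find(r => r.pattern.test(normalized));
  return rule ? rule.priority : 'low';
}

// Build the aggregate counts stored in the inventory file.
function summarize(files) {
  const summary = { total_files: files.length, by_language: {}, by_priority: {} };
  for (const file of files) {
    const language = detectLanguage(file);
    const priority = categorizePriority(file);
    summary.by_language[language] = (summary.by_language[language] || 0) + 1;
    summary.by_priority[priority] = (summary.by_priority[priority] || 0) + 1;
  }
  return summary;
}
```

Keeping the rules in an ordered array means a path such as `src/api/auth/token.ts` is classified by the first pattern it matches (here, critical), which keeps the precedence explicit and easy to adjust.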