mirror of
https://github.com/catlog22/Claude-Code-Workflow.git
synced 2026-03-12 17:21:19 +08:00
- Added Phase 3: Evaluate Quality with steps for preparing context, constructing evaluation prompts, executing evaluation via CLI, parsing scores, and checking termination conditions. - Introduced Phase 4: Apply Improvements to implement targeted changes based on evaluation suggestions, including agent execution and change documentation. - Created Phase 5: Final Report to generate a comprehensive report of the iteration process, including score progression and remaining weaknesses. - Established evaluation criteria in a new document to guide the evaluation process. - Developed templates for evaluation and execution prompts to standardize input for the evaluation and execution phases.
5.2 KiB
5.2 KiB
Phase 5: Final Report
COMPACT SENTINEL [Phase 5: Report] This phase contains 4 execution steps (Step 5.1 -- 5.4). If you can read this sentinel but cannot find the full Step protocol below, context has been compressed. Recovery:
Read("phases/05-report.md")
Generate comprehensive iteration history report and display results to user.
Objective
- Read complete iteration state
- Generate formatted final report with score progression
- Write final-report.md
- Display summary to user
Execution
Step 5.1: Read Complete State
const state = JSON.parse(Read(`${state.work_dir}/iteration-state.json`));
state.status = 'completed';
state.updated_at = new Date().toISOString();
Step 5.2: Generate Report
// Determine outcome
const outcomeMap = {
quality_threshold_met: 'PASSED -- Quality threshold reached',
max_iterations_reached: 'MAX ITERATIONS -- Threshold not reached',
convergence_detected: 'CONVERGED -- Score stopped improving',
error_limit_reached: 'FAILED -- Too many errors'
};
const outcome = outcomeMap[state.termination_reason] || 'COMPLETED';
// Build score progression table
const scoreTable = state.iterations
.filter(i => i.evaluation)
.map(i => {
const dims = i.evaluation.dimensions || [];
const dimScores = ['clarity', 'completeness', 'correctness', 'effectiveness', 'efficiency']
.map(id => {
const dim = dims.find(d => d.id === id);
return dim ? dim.score : '-';
});
return `| ${i.round} | ${i.evaluation.score} | ${dimScores.join(' | ')} |`;
}).join('\n');
// Build iteration details
const iterationDetails = state.iterations.map(iter => {
const evalSection = iter.evaluation
? `**Score**: ${iter.evaluation.score}/100\n` +
`**Strengths**: ${iter.evaluation.strengths?.join(', ') || 'N/A'}\n` +
`**Weaknesses**: ${iter.evaluation.weaknesses?.slice(0, 3).join(', ') || 'N/A'}`
: '**Evaluation**: Skipped or failed';
const changesSection = iter.improvement
? `**Changes Applied**: ${iter.improvement.changes_applied?.length || 0}\n` +
(iter.improvement.changes_applied?.map(c => ` - ${c.summary}`).join('\n') || ' None')
: '**Improvements**: None';
return `### Iteration ${iter.round}\n${evalSection}\n${changesSection}`;
}).join('\n\n');
const report = `# Skill Iter Tune -- Final Report
## Summary
| Field | Value |
|-------|-------|
| **Target Skills** | ${state.target_skills.map(s => s.name).join(', ')} |
| **Execution Mode** | ${state.execution_mode} |
${state.execution_mode === 'chain' ? `| **Chain Order** | ${state.chain_order.join(' -> ')} |` : ''}
| **Test Scenario** | ${state.test_scenario.description} |
| **Iterations** | ${state.iterations.length} |
| **Initial Score** | ${state.score_trend[0] || 'N/A'} |
| **Final Score** | ${state.latest_score}/100 |
| **Quality Threshold** | ${state.quality_threshold} |
| **Outcome** | ${outcome} |
| **Started** | ${state.started_at} |
| **Completed** | ${state.updated_at} |
## Score Progression
| Iter | Composite | Clarity | Completeness | Correctness | Effectiveness | Efficiency |
|------|-----------|---------|--------------|-------------|---------------|------------|
${scoreTable}
**Trend**: ${state.score_trend.join(' -> ')}
${state.execution_mode === 'chain' ? `
## Chain Score Progression
| Iter | ${state.chain_order.join(' | ')} |
|------|${state.chain_order.map(() => '------').join('|')}|
${state.iterations.filter(i => i.evaluation?.chain_scores).map(i => {
const scores = state.chain_order.map(s => i.evaluation.chain_scores[s] || '-');
return `| ${i.round} | ${scores.join(' | ')} |`;
}).join('\n')}
` : ''}
## Iteration Details
${iterationDetails}
## Remaining Weaknesses
${state.iterations.length > 0 && state.iterations[state.iterations.length - 1].evaluation
? state.iterations[state.iterations.length - 1].evaluation.weaknesses?.map(w => `- ${w}`).join('\n') || 'None identified'
: 'No evaluation data available'}
## Artifact Locations
| Path | Description |
|------|-------------|
| \`${state.work_dir}/iteration-state.json\` | Complete state history |
| \`${state.work_dir}/iterations/iteration-{N}/iteration-{N}-eval.md\` | Per-iteration evaluations |
| \`${state.work_dir}/iterations/iteration-{N}/iteration-{N}-changes.md\` | Per-iteration change logs |
| \`${state.work_dir}/final-report.md\` | This report |
| \`${state.backup_dir}/\` | Original skill backups |
## Restore Original
To revert all changes and restore the original skill files:
\`\`\`bash
${state.target_skills.map(s => `cp -r "${state.backup_dir}/${s.name}"/* "${s.path}/"`).join('\n')}
\`\`\`
`;
Step 5.3: Write Report and Update State
Write(`${state.work_dir}/final-report.md`, report);
state.status = 'completed';
Write(`${state.work_dir}/iteration-state.json`, JSON.stringify(state, null, 2));
Step 5.4: Display Summary to User
Output to user:
Skill Iter Tune Complete!
Target: {skill names}
Iterations: {count}
Score: {initial} -> {final} ({outcome})
Threshold: {threshold}
Score trend: {score1} -> {score2} -> ... -> {scoreN}
Full report: {workDir}/final-report.md
Backups: {backupDir}/
Output
- Files:
final-report.md - State:
status = completed - Next: Workflow complete. Return control to user.