mirror of
https://github.com/catlog22/Claude-Code-Workflow.git
synced 2026-03-05 16:13:08 +08:00
Backend:
- Fix readLegacyFiles to handle { name, prefix }[] role format
- Add roles backfill in getEffectiveTeamMeta when meta.json exists
- Ensure pipeline_stages and roles flow correctly to API response
Team Skills:
- Add pipeline metadata initialization to all 16 team skill coordinator roles
- Each skill now reports pipeline_stages and roles to meta.json at session init
Documentation:
- Update command references and component documentation
- Add numerical-analysis-workflow skill spec
- Sync zh/en translations for commands and components
6.9 KiB
6.9 KiB
Quality Standards for Numerical Analysis Workflow
Quality assessment criteria for NADW analysis reports.
When to Use
| Phase | Usage | Section |
|---|---|---|
| Phase 2 (Execution) | Guide agent analysis quality | All dimensions |
| Phase 3 (Aggregation) | Score generated reports | Quality Gates |
Quality Dimensions
1. Mathematical Rigor (30%)
| Score | Criteria |
|---|---|
| 100% | All formulas correct, properly derived, LaTeX well-formatted, error bounds proven |
| 80% | Formulas correct, some derivation steps skipped, bounds stated without full proof |
| 60% | Key formulas present, some notation inconsistencies, bounds estimated |
| 40% | Formulas incomplete or contain errors |
| 0% | No mathematical content |
Checklist:
- Governing equations identified and written in LaTeX
- Weak forms correctly derived from strong forms
- Convergence order stated with conditions
- Error bounds provided (a priori or a posteriori)
- CFL/stability conditions explicitly stated
- Condition numbers estimated for key matrices
- Complexity bounds (time and space) determined
- LaTeX notation consistent throughout all documents
2. Code-Theory Mapping (25%)
| Score | Criteria |
|---|---|
| 100% | Every algorithm mapped to code with file:line references, data structures justified |
| 80% | Major algorithms mapped, most references accurate |
| 60% | Key mappings present, some code references missing |
| 40% | Superficial mapping, few code references |
| 0% | No code-theory connection |
Checklist:
- Each numerical method traced to implementing function/module
- Data structures justified against algorithm requirements
- Sparse matrix format matched to access patterns
- Time integration scheme identified in code
- Boundary condition implementation verified
- Solver configuration traced to convergence requirements
- Preconditioner choice justified
3. Numerical Quality Assessment (25%)
| Score | Criteria |
|---|---|
| 100% | Stability fully analyzed, precision risks cataloged, all edge cases covered |
| 80% | Stability assessed, major precision risks found, common edge cases covered |
| 60% | Basic stability check, some precision risks, incomplete edge cases |
| 40% | Superficial stability mention, few precision issues found |
| 0% | No numerical quality analysis |
Checklist:
- Condition numbers estimated for key operations
- Catastrophic cancellation risks identified with file:line
- Accumulation error potential assessed
- Float precision choices justified (float32 vs float64)
- Edge cases cataloged (singularities, degenerate inputs)
- Overflow/underflow risks identified
- Mixed-precision operations flagged
4. Cross-Phase Coherence (20%)
| Score | Criteria |
|---|---|
| 100% | All 6 phases connected, findings build on each other, no contradictions |
| 80% | Most phases connected, minor gaps in context propagation |
| 60% | Key connections present, some phases isolated |
| 40% | Limited cross-referencing between phases |
| 0% | Phases completely isolated |
Checklist:
- Wave 2 formulas reference Wave 1 governing equations
- Wave 3 algorithms justified by Wave 2 theory
- Wave 4 implementation verified against Wave 3 pseudocode
- Wave 5 optimization targets from Wave 3 performance model
- Wave 5 precision requirements from Wave 2/3 analysis
- Wave 6 test plan covers findings from all prior waves
- Wave 6 benchmarks compare against Wave 3 predictions
- No contradictory findings between phases
- Discoveries board used for cross-track sharing
Quality Gates (Per-Wave)
| Wave | Phase | Gate Criteria | Required Tracks |
|---|---|---|---|
| 1 | Global Survey | Core model identified + architecture mapped + ≥1 KPI | 3/3 completed |
| 2 | Theory | Key formulas LaTeX'd + convergence stated + complexity determined | 3/3 completed |
| 3 | Algorithm | Pseudocode produced + stability assessed + performance predicted | ≥2/3 completed |
| 4 | Module | Code-algorithm mapping + data structures reviewed + APIs documented | ≥2/3 completed |
| 5 | Local | Hotspots identified + edge cases cataloged + precision risks flagged | ≥2/3 completed |
| 6 | Integration | Test plan complete + benchmarks planned + QA report synthesized | 3/3 completed |
Overall Quality Gates
| Gate | Threshold | Action |
|---|---|---|
| PASS | >= 80% across all dimensions | Report ready for delivery |
| REVIEW | 70-79% in any dimension | Flag dimension for improvement, user decides |
| FAIL | < 70% in any dimension | Block delivery, identify gaps, suggest re-analysis |
Issue Classification
Errors (Must Fix)
- Missing governing equation identification (Wave 1)
- LaTeX formulas with mathematical errors (Wave 2)
- Algorithm pseudocode that doesn't match convergence requirements (Wave 3)
- Code references to non-existent files/functions (Wave 4)
- Unidentified catastrophic cancellation in critical path (Wave 5)
- Test plan that doesn't cover identified stability issues (Wave 6)
- Contradictory findings between phases
- Missing context propagation (later phase ignores earlier findings)
Warnings (Should Fix)
- Formulas without derivation steps
- Convergence bounds stated without proof or reference
- Missing edge case for known singularity
- Performance model without memory bandwidth consideration
- Data structure choice not justified
- Test plan without manufactured solution verification
- Benchmark without theoretical baseline comparison
Notes (Nice to Have)
- Additional bibliography references
- Alternative algorithm comparisons
- Extended precision sensitivity analysis
- Scaling prediction beyond current problem size
- Code style or naming convention suggestions
Severity Levels for Findings
| Severity | Definition | Example |
|---|---|---|
| Critical | Incorrect results or numerical failure | Wrong boundary condition → divergent solution |
| High | Significant accuracy or performance degradation | Condition number 10^15 → double precision insufficient |
| Medium | Suboptimal but functional | O(N^2) where O(N log N) is possible |
| Low | Minor improvement opportunity | Unnecessary array copy in non-critical path |
Document Quality Metrics
| Metric | Target | Measurement |
|---|---|---|
| Formula coverage | ≥ 90% of core equations in LaTeX | Count identified vs documented |
| Code reference density | ≥ 1 file:line per finding | Count references per finding |
| Cross-phase references | ≥ 3 per document (Waves 3-6) | Count cross-references |
| Severity distribution | ≥ 1 per severity level | Count per level |
| Discovery board contributions | ≥ 2 per track | Count NDJSON entries per worker |
| Perspective package | Present in every document | Boolean per document |