mirror of
https://github.com/catlog22/Claude-Code-Workflow.git
synced 2026-03-05 16:13:08 +08:00
- Introduced agent instruction template for task assignments in numerical analysis.
- Defined CSV schema for tasks, including input, computed, and output columns.
- Specified analysis dimensions across six phases of the workflow.
- Established phase topology for the diamond deep tree structure of the workflow.
- Outlined quality standards for assessing analysis reports, including criteria and quality gates.
6.9 KiB
Quality Standards for Numerical Analysis Workflow
Quality assessment criteria for NADW analysis reports.
When to Use
| Phase | Usage | Section |
|---|---|---|
| Phase 2 (Execution) | Guide agent analysis quality | All dimensions |
| Phase 3 (Aggregation) | Score generated reports | Quality Gates |
Quality Dimensions
1. Mathematical Rigor (30%)
| Score | Criteria |
|---|---|
| 100% | All formulas correct, properly derived, LaTeX well-formatted, error bounds proven |
| 80% | Formulas correct, some derivation steps skipped, bounds stated without full proof |
| 60% | Key formulas present, some notation inconsistencies, bounds estimated |
| 40% | Formulas incomplete or contain errors |
| 0% | No mathematical content |
Checklist:
- Governing equations identified and written in LaTeX
- Weak forms correctly derived from strong forms
- Convergence order stated with conditions
- Error bounds provided (a priori or a posteriori)
- CFL/stability conditions explicitly stated
- Condition numbers estimated for key matrices
- Complexity bounds (time and space) determined
- LaTeX notation consistent throughout all documents
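As an illustration of the level of detail the checklist asks for, a convergence statement with its conditions and a CFL restriction might be written as below. The two model problems (a Poisson problem discretized with conforming degree-p finite elements, and first-order upwind advection with forward Euler) are assumptions chosen for illustration and are not taken from any analyzed codebase.

```latex
% A priori error bound, assuming u \in H^{p+1}(\Omega) and a shape-regular mesh family of size h
\| u - u_h \|_{H^1(\Omega)} \le C \, h^{p} \, | u |_{H^{p+1}(\Omega)}

% CFL condition for first-order upwind applied to u_t + a u_x = 0 with forward Euler
\frac{|a| \, \Delta t}{\Delta x} \le 1
```

Stating the regularity assumption next to the bound is what separates "convergence order stated with conditions" from a bare order claim.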
2. Code-Theory Mapping (25%)
| Score | Criteria |
|---|---|
| 100% | Every algorithm mapped to code with file:line references, data structures justified |
| 80% | Major algorithms mapped, most references accurate |
| 60% | Key mappings present, some code references missing |
| 40% | Superficial mapping, few code references |
| 0% | No code-theory connection |
Checklist:
- Each numerical method traced to implementing function/module
- Data structures justified against algorithm requirements
- Sparse matrix format matched to access patterns
- Time integration scheme identified in code
- Boundary condition implementation verified
- Solver configuration traced to convergence requirements
- Preconditioner choice justified
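The `file:line` references this dimension relies on can be checked mechanically. The following is a minimal sketch, assuming references are written as `relative/path.ext:123`; the function name `reference_resolves` and that reference format are hypothetical and not defined by the workflow itself.

```python
import re
from pathlib import Path

# Assumed reference format: "path/to/file.py:123" (hypothetical convention)
REF_PATTERN = re.compile(r"^(?P<path>.+):(?P<line>\d+)$")


def reference_resolves(ref: str, root: Path) -> bool:
    """Return True if `ref` names an existing file under `root` and a line inside it."""
    match = REF_PATTERN.match(ref)
    if match is None:
        return False  # no ":<line>" suffix, so not a file:line reference
    target = root / match.group("path")
    if not target.is_file():
        return False
    line_no = int(match.group("line"))
    with target.open(encoding="utf-8", errors="replace") as handle:
        line_count = sum(1 for _ in handle)
    return 1 <= line_no <= line_count
```

A check like this only confirms that the location exists. Whether the code at that location actually implements the named method still needs review.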
3. Numerical Quality Assessment (25%)
| Score | Criteria |
|---|---|
| 100% | Stability fully analyzed, precision risks cataloged, all edge cases covered |
| 80% | Stability assessed, major precision risks found, common edge cases covered |
| 60% | Basic stability check, some precision risks, incomplete edge cases |
| 40% | Superficial stability mention, few precision issues found |
| 0% | No numerical quality analysis |
Checklist:
- Condition numbers estimated for key operations
- Catastrophic cancellation risks identified with file:line
- Accumulation error potential assessed
- Float precision choices justified (float32 vs float64)
- Edge cases cataloged (singularities, degenerate inputs)
- Overflow/underflow risks identified
- Mixed-precision operations flagged
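To make the cancellation item concrete, here is a small sketch of the kind of pattern a reviewer would flag. The expression `1 - cos(x)` for small `x` is an assumed example, not one drawn from a specific analyzed code. The stable variant uses the identity 1 - cos(x) = 2 sin²(x/2).

```python
import math


def one_minus_cos_naive(x: float) -> float:
    # Subtracts two nearly equal numbers when x is small: cancellation risk
    return 1.0 - math.cos(x)


def one_minus_cos_stable(x: float) -> float:
    # Algebraically identical, but avoids the subtraction entirely
    s = math.sin(0.5 * x)
    return 2.0 * s * s


x = 1e-8
reference = 0.5 * x * x  # leading Taylor term, accurate to ~1e-17 relative for this x
naive_rel_err = abs(one_minus_cos_naive(x) - reference) / reference
stable_rel_err = abs(one_minus_cos_stable(x) - reference) / reference
print(f"naive relative error:  {naive_rel_err:.3e}")
print(f"stable relative error: {stable_rel_err:.3e}")
```

In IEEE 754 double precision the naive form is expected to lose essentially all significant digits at this `x`, while the rewritten form stays close to machine precision.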
4. Cross-Phase Coherence (20%)
| Score | Criteria |
|---|---|
| 100% | All 6 phases connected, findings build on each other, no contradictions |
| 80% | Most phases connected, minor gaps in context propagation |
| 60% | Key connections present, some phases isolated |
| 40% | Limited cross-referencing between phases |
| 0% | Phases completely isolated |
Checklist:
- Wave 2 formulas reference Wave 1 governing equations
- Wave 3 algorithms justified by Wave 2 theory
- Wave 4 implementation verified against Wave 3 pseudocode
- Wave 5 optimization targets derived from Wave 3 performance model
- Wave 5 precision requirements derived from Wave 2/3 analysis
- Wave 6 test plan covers findings from all prior waves
- Wave 6 benchmarks compare against Wave 3 predictions
- No contradictory findings between phases
- Discoveries board used for cross-track sharing
Quality Gates (Per-Wave)
| Wave | Phase | Gate Criteria | Required Tracks |
|---|---|---|---|
| 1 | Global Survey | Core model identified + architecture mapped + ≥1 KPI | 3/3 completed |
| 2 | Theory | Key formulas LaTeX'd + convergence stated + complexity determined | 3/3 completed |
| 3 | Algorithm | Pseudocode produced + stability assessed + performance predicted | ≥2/3 completed |
| 4 | Module | Code-algorithm mapping + data structures reviewed + APIs documented | ≥2/3 completed |
| 5 | Local | Hotspots identified + edge cases cataloged + precision risks flagged | ≥2/3 completed |
| 6 | Integration | Test plan complete + benchmarks planned + QA report synthesized | 3/3 completed |
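The per-wave gates can be sketched as a simple check. This is a sketch under stated assumptions: each wave has three tracks, the `+`-joined gate criteria must all hold, and "≥2/3" means at least two completed tracks. The names `REQUIRED_TRACKS` and `wave_gate_passes` are hypothetical.

```python
# Minimum number of completed tracks per wave, transcribed from the table above
REQUIRED_TRACKS = {1: 3, 2: 3, 3: 2, 4: 2, 5: 2, 6: 3}


def wave_gate_passes(wave: int, completed_tracks: int, criteria_met: list[bool]) -> bool:
    """Return True if enough tracks completed and every gate criterion holds."""
    if wave not in REQUIRED_TRACKS:
        raise ValueError(f"unknown wave: {wave}")
    enough_tracks = completed_tracks >= REQUIRED_TRACKS[wave]
    # Assumption: the criteria in the table are combined with logical AND
    return enough_tracks and all(criteria_met)
```

For example, Wave 3 with two completed tracks and all three criteria satisfied would pass under this reading, while Wave 1 with two completed tracks would not.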
Overall Quality Gates
| Gate | Threshold | Action |
|---|---|---|
| PASS | >= 80% in every dimension | Report ready for delivery |
| REVIEW | 70% to < 80% in at least one dimension, none below 70% | Flag dimension for improvement, user decides |
| FAIL | < 70% in any dimension | Block delivery, identify gaps, suggest re-analysis |
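One way to apply the dimension weights and these thresholds is sketched below. Two points are assumptions rather than something the tables state: FAIL takes precedence over REVIEW when both would apply, and the gate is decided by the lowest dimension score, with the weighted total reported separately. The dictionary keys and function names are hypothetical.

```python
# Dimension weights from the "Quality Dimensions" headings (they sum to 1.0)
WEIGHTS = {
    "mathematical_rigor": 0.30,
    "code_theory_mapping": 0.25,
    "numerical_quality": 0.25,
    "cross_phase_coherence": 0.20,
}


def weighted_score(scores: dict[str, float]) -> float:
    """Weighted overall score on the same 0-100 scale as the inputs."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def overall_gate(scores: dict[str, float]) -> str:
    """Classify a report as PASS, REVIEW, or FAIL from its weakest dimension."""
    lowest = min(scores[name] for name in WEIGHTS)
    if lowest < 70:
        return "FAIL"    # assumed to override REVIEW
    if lowest < 80:
        return "REVIEW"
    return "PASS"


example = {
    "mathematical_rigor": 90,
    "code_theory_mapping": 85,
    "numerical_quality": 80,
    "cross_phase_coherence": 80,
}
print(overall_gate(example), round(weighted_score(example), 2))
```

Deciding the gate from the minimum rather than the weighted total means a strong dimension cannot mask a weak one, which matches the "in any dimension" wording of the table.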
Issue Classification
Errors (Must Fix)
- Missing governing equation identification (Wave 1)
- LaTeX formulas with mathematical errors (Wave 2)
- Algorithm pseudocode that doesn't match convergence requirements (Wave 3)
- Code references to non-existent files/functions (Wave 4)
- Unidentified catastrophic cancellation in critical path (Wave 5)
- Test plan that doesn't cover identified stability issues (Wave 6)
- Contradictory findings between phases
- Missing context propagation (later phase ignores earlier findings)
Warnings (Should Fix)
- Formulas without derivation steps
- Convergence bounds stated without proof or reference
- Missing edge case for known singularity
- Performance model without memory bandwidth consideration
- Data structure choice not justified
- Test plan without manufactured solution verification
- Benchmark without theoretical baseline comparison
Notes (Nice to Have)
- Additional bibliography references
- Alternative algorithm comparisons
- Extended precision sensitivity analysis
- Scaling prediction beyond current problem size
- Code style or naming convention suggestions
Severity Levels for Findings
| Severity | Definition | Example |
|---|---|---|
| Critical | Incorrect results or numerical failure | Wrong boundary condition → divergent solution |
| High | Significant accuracy or performance degradation | Condition number 10^15 → double precision insufficient |
| Medium | Suboptimal but functional | O(N^2) where O(N log N) is possible |
| Low | Minor improvement opportunity | Unnecessary array copy in non-critical path |
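The "High" example can be checked with a common rule of thumb: solving a problem with condition number κ loses roughly log10(κ) decimal digits. The sketch below applies that heuristic. It is an estimate rather than a rigorous bound, and the function name is hypothetical.

```python
import math
import sys


def estimated_correct_digits(condition_number: float,
                             eps: float = sys.float_info.epsilon) -> float:
    """Heuristic: decimal digits carried by the format minus log10(condition number)."""
    return -math.log10(eps) - math.log10(condition_number)


# float64 machine epsilon is the default; 2**-23 is the float32 machine epsilon
print(f"float64, cond=1e15: {estimated_correct_digits(1e15):.2f} digits")
print(f"float32, cond=1e15: {estimated_correct_digits(1e15, eps=2.0**-23):.2f} digits")
```

With a condition number of 10^15, the estimate leaves less than one trustworthy digit in double precision, which is why the table classifies that case as High severity.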
Document Quality Metrics
| Metric | Target | Measurement |
|---|---|---|
| Formula coverage | ≥ 90% of core equations in LaTeX | Count identified vs documented |
| Code reference density | ≥ 1 file:line per finding | Count references per finding |
| Cross-phase references | ≥ 3 per document (Waves 3-6) | Count cross-references |
| Severity distribution | ≥ 1 per severity level | Count per level |
| Discoveries board contributions | ≥ 2 per track | Count NDJSON entries per worker |
| Perspective package | Present in every document | Boolean per document |
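Two of these measurements lend themselves to simple counting. The sketch below assumes references look like `path/file.ext:123` and that each NDJSON entry carries a `worker` field. Both the regular expression and the field name are assumptions, since the table does not define the entry schema.

```python
import json
import re
from collections import Counter

# Assumed file:line shape, e.g. "src/solver.py:42" (hypothetical convention)
FILE_LINE = re.compile(r"\b[\w./-]+\.\w+:\d+\b")


def reference_density(findings: list[str]) -> list[int]:
    """Number of file:line references found in each finding's text."""
    return [len(FILE_LINE.findall(text)) for text in findings]


def contributions_per_worker(ndjson_text: str, key: str = "worker") -> Counter:
    """Count NDJSON entries per worker; `key` is an assumed field name."""
    counts: Counter = Counter()
    for line in ndjson_text.splitlines():
        if line.strip():  # skip blank lines between entries
            counts[json.loads(line)[key]] += 1
    return counts
```

A finding meets the code reference density target when its count is at least 1. A track meets the board target when its worker has at least 2 entries.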