Claude-Code-Workflow/.codex/skill_lib/numerical-analysis-workflow/specs/quality-standards.md
catlog22 ab9b8ecbc0 Add comprehensive documentation for Numerical Analysis Workflow
- Introduced agent instruction template for task assignments in numerical analysis.
- Defined CSV schema for tasks, including input, computed, and output columns.
- Specified analysis dimensions across six phases of the workflow.
- Established phase topology for the diamond deep tree structure of the workflow.
- Outlined quality standards for assessing analysis reports, including criteria and quality gates.
2026-03-04 15:08:17 +08:00

Quality Standards for Numerical Analysis Workflow

Quality assessment criteria for NADW analysis reports.

When to Use

| Phase | Usage | Section |
|-------|-------|---------|
| Phase 2 (Execution) | Guide agent analysis quality | All dimensions |
| Phase 3 (Aggregation) | Score generated reports | Quality Gates |

Quality Dimensions

1. Mathematical Rigor (30%)

| Score | Criteria |
|-------|----------|
| 100% | All formulas correct, properly derived, LaTeX well-formatted, error bounds proven |
| 80% | Formulas correct, some derivation steps skipped, bounds stated without full proof |
| 60% | Key formulas present, some notation inconsistencies, bounds estimated |
| 40% | Formulas incomplete or contain errors |
| 0% | No mathematical content |

Checklist:

  • Governing equations identified and written in LaTeX
  • Weak forms correctly derived from strong forms
  • Convergence order stated with conditions
  • Error bounds provided (a priori or a posteriori)
  • CFL/stability conditions explicitly stated
  • Condition numbers estimated for key matrices
  • Complexity bounds (time and space) determined
  • LaTeX notation consistent throughout all documents
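As an illustration of what "convergence order stated with conditions" and "CFL/stability conditions explicitly stated" could look like in a report, here is a sketch using two textbook results. The choice of a degree-$p$ finite element discretization and a first-order upwind scheme is an assumption made for the example; the workflow does not prescribe either.

```latex
% A priori estimate: order p in the H^1 norm, valid under the stated conditions.
\[
  \| u - u_h \|_{H^1(\Omega)} \le C\, h^{p}\, | u |_{H^{p+1}(\Omega)},
  \qquad u \in H^{p+1}(\Omega),
\]
% Conditions: shape-regular mesh of size h, conforming elements of degree p,
% constant C independent of h.

% CFL condition for explicit first-order upwind advection with speed a.
\[
  \frac{|a|\, \Delta t}{\Delta x} \le 1 .
\]
```

The point is that each bound appears together with the regularity, mesh, and scheme assumptions under which it holds, so a reviewer can check the stated order against the code.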

2. Code-Theory Mapping (25%)

| Score | Criteria |
|-------|----------|
| 100% | Every algorithm mapped to code with file:line references, data structures justified |
| 80% | Major algorithms mapped, most references accurate |
| 60% | Key mappings present, some code references missing |
| 40% | Superficial mapping, few code references |
| 0% | No code-theory connection |

Checklist:

  • Each numerical method traced to implementing function/module
  • Data structures justified against algorithm requirements
  • Sparse matrix format matched to access patterns
  • Time integration scheme identified in code
  • Boundary condition implementation verified
  • Solver configuration traced to convergence requirements
  • Preconditioner choice justified

3. Numerical Quality Assessment (25%)

| Score | Criteria |
|-------|----------|
| 100% | Stability fully analyzed, precision risks cataloged, all edge cases covered |
| 80% | Stability assessed, major precision risks found, common edge cases covered |
| 60% | Basic stability check, some precision risks, incomplete edge cases |
| 40% | Superficial stability mention, few precision issues found |
| 0% | No numerical quality analysis |

Checklist:

  • Condition numbers estimated for key operations
  • Catastrophic cancellation risks identified with file:line
  • Accumulation error potential assessed
  • Float precision choices justified (float32 vs float64)
  • Edge cases cataloged (singularities, degenerate inputs)
  • Overflow/underflow risks identified
  • Mixed-precision operations flagged
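To make the "catastrophic cancellation" item concrete, the following minimal sketch shows the kind of pattern a reviewer might flag. The functions `one_minus_cos_naive` and `one_minus_cos_stable` are hypothetical examples, not taken from any analyzed codebase.

```python
import math

def one_minus_cos_naive(x: float) -> float:
    # Subtracts two nearly equal numbers when x is small: cancellation risk.
    return 1.0 - math.cos(x)

def one_minus_cos_stable(x: float) -> float:
    # Uses the identity 1 - cos(x) = 2 sin^2(x/2), which avoids the subtraction.
    s = math.sin(x / 2.0)
    return 2.0 * s * s

x = 1e-9
# Leading Taylor term of 1 - cos(x); the next term (~x^4/24) is negligible here.
reference = 0.5 * x * x
naive_rel_err = abs(one_minus_cos_naive(x) - reference) / reference
stable_rel_err = abs(one_minus_cos_stable(x) - reference) / reference
print(f"naive relative error:  {naive_rel_err:.3e}")
print(f"stable relative error: {stable_rel_err:.3e}")
```

In float64 the naive form loses essentially all significant digits for this input, while the rewritten form stays accurate to near machine precision. A finding of this type would cite the offending expression by file:line and suggest the stable reformulation.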

4. Cross-Phase Coherence (20%)

| Score | Criteria |
|-------|----------|
| 100% | All 6 phases connected, findings build on each other, no contradictions |
| 80% | Most phases connected, minor gaps in context propagation |
| 60% | Key connections present, some phases isolated |
| 40% | Limited cross-referencing between phases |
| 0% | Phases completely isolated |

Checklist:

  • Wave 2 formulas reference Wave 1 governing equations
  • Wave 3 algorithms justified by Wave 2 theory
  • Wave 4 implementation verified against Wave 3 pseudocode
  • Wave 5 optimization targets derived from Wave 3 performance model
  • Wave 5 precision requirements derived from Wave 2/3 analysis
  • Wave 6 test plan covers findings from all prior waves
  • Wave 6 benchmarks compare against Wave 3 predictions
  • No contradictory findings between phases
  • Discoveries board used for cross-track sharing

Quality Gates (Per-Wave)

| Wave | Phase | Gate Criteria | Required Tracks |
|------|-------|---------------|-----------------|
| 1 | Global Survey | Core model identified + architecture mapped + ≥1 KPI | 3/3 completed |
| 2 | Theory | Key formulas LaTeX'd + convergence stated + complexity determined | 3/3 completed |
| 3 | Algorithm | Pseudocode produced + stability assessed + performance predicted | ≥2/3 completed |
| 4 | Module | Code-algorithm mapping + data structures reviewed + APIs documented | ≥2/3 completed |
| 5 | Local | Hotspots identified + edge cases cataloged + precision risks flagged | ≥2/3 completed |
| 6 | Integration | Test plan complete + benchmarks planned + QA report synthesized | 3/3 completed |
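The per-wave gates above can be checked mechanically. The sketch below is one possible encoding, assuming that the "+" in Gate Criteria means every listed criterion must hold and that each wave has exactly three tracks. The names `MIN_COMPLETED_TRACKS` and `wave_gate_passes` are hypothetical.

```python
# Minimum number of completed tracks (out of 3) per wave, from the Required Tracks column.
MIN_COMPLETED_TRACKS = {1: 3, 2: 3, 3: 2, 4: 2, 5: 2, 6: 3}

def wave_gate_passes(wave: int, completed_tracks: int, criteria_met: list[bool]) -> bool:
    """Return True when enough tracks completed and every gate criterion holds."""
    if wave not in MIN_COMPLETED_TRACKS:
        raise ValueError(f"unknown wave: {wave}")
    enough_tracks = completed_tracks >= MIN_COMPLETED_TRACKS[wave]
    # Assumption: criteria joined by "+" are all mandatory.
    return enough_tracks and all(criteria_met)

# Wave 3 tolerates one incomplete track; Wave 1 does not.
print(wave_gate_passes(3, 2, [True, True, True]))
print(wave_gate_passes(1, 2, [True, True, True]))
```

Waves 1, 2, and 6 require all tracks because later waves and the final report depend on them in full, while Waves 3 to 5 allow one track to be missing.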

Overall Quality Gates

| Gate | Threshold | Action |
|------|-----------|--------|
| PASS | ≥ 80% in every dimension | Report ready for delivery |
| REVIEW | ≥ 70% and < 80% in any dimension, none below 70% | Flag dimension for improvement, user decides |
| FAIL | < 70% in any dimension | Block delivery, identify gaps, suggest re-analysis |
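A minimal sketch of how the overall gate could be evaluated follows. It assumes scores are percentages per dimension, that FAIL takes precedence over REVIEW when both apply, and that the dimension weights (30/25/25/20) are combined into a single weighted score for reporting only. The dictionary keys and function names are hypothetical.

```python
# Dimension weights in percent, taken from the Quality Dimensions headings.
WEIGHTS = {
    "mathematical_rigor": 30,
    "code_theory_mapping": 25,
    "numerical_quality": 25,
    "cross_phase_coherence": 20,
}

def weighted_score(scores: dict[str, float]) -> float:
    """Weighted average of the per-dimension scores (assumed use of the weights)."""
    return sum(WEIGHTS[d] * scores[d] for d in WEIGHTS) / 100

def overall_gate(scores: dict[str, float]) -> str:
    """Apply the thresholds to the weakest dimension; FAIL wins over REVIEW."""
    lowest = min(scores[d] for d in WEIGHTS)
    if lowest < 70:
        return "FAIL"
    if lowest < 80:
        return "REVIEW"
    return "PASS"

scores = {
    "mathematical_rigor": 100,
    "code_theory_mapping": 80,
    "numerical_quality": 80,
    "cross_phase_coherence": 60,
}
print(weighted_score(scores), overall_gate(scores))
```

In this example the weighted score is above 80, yet the gate is FAIL because one dimension sits below 70%. The thresholds apply per dimension, so a strong average cannot mask a weak dimension.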

Issue Classification

Errors (Must Fix)

  • Missing governing equation identification (Wave 1)
  • LaTeX formulas with mathematical errors (Wave 2)
  • Algorithm pseudocode that doesn't match convergence requirements (Wave 3)
  • Code references to non-existent files/functions (Wave 4)
  • Unidentified catastrophic cancellation in critical path (Wave 5)
  • Test plan that doesn't cover identified stability issues (Wave 6)
  • Contradictory findings between phases
  • Missing context propagation (later phase ignores earlier findings)

Warnings (Should Fix)

  • Formulas without derivation steps
  • Convergence bounds stated without proof or reference
  • Missing edge case for known singularity
  • Performance model without memory bandwidth consideration
  • Data structure choice not justified
  • Test plan without manufactured solution verification
  • Benchmark without theoretical baseline comparison

Notes (Nice to Have)

  • Additional bibliography references
  • Alternative algorithm comparisons
  • Extended precision sensitivity analysis
  • Scaling prediction beyond current problem size
  • Code style or naming convention suggestions

Severity Levels for Findings

| Severity | Definition | Example |
|----------|------------|---------|
| Critical | Incorrect results or numerical failure | Wrong boundary condition → divergent solution |
| High | Significant accuracy or performance degradation | Condition number 10^15 → double precision insufficient |
| Medium | Suboptimal but functional | O(N^2) where O(N log N) is possible |
| Low | Minor improvement opportunity | Unnecessary array copy in non-critical path |
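The High example can be motivated with a common rule of thumb: solving a problem with condition number κ may lose about log10(κ) significant decimal digits. The sketch below applies that heuristic; it is an estimate, not a rigorous bound, and `estimated_digits_left` is a hypothetical helper.

```python
import math

def estimated_digits_left(condition_number: float, mantissa_bits: int = 53) -> float:
    """Rule-of-thumb count of trustworthy decimal digits for a given condition number.

    mantissa_bits is 53 for IEEE 754 float64 and 24 for float32.
    """
    available = mantissa_bits * math.log10(2.0)   # about 15.95 digits for float64
    lost = math.log10(condition_number)           # heuristic loss from conditioning
    return available - lost

print(round(estimated_digits_left(1e15), 2))      # float64
print(round(estimated_digits_left(1e15, 24), 2))  # float32
```

For a condition number of 10^15, float64 is left with roughly one reliable digit and float32 with none, which is why such a finding is rated High rather than Medium.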

Document Quality Metrics

| Metric | Target | Measurement |
|--------|--------|-------------|
| Formula coverage | ≥ 90% of core equations in LaTeX | Count identified vs documented |
| Code reference density | ≥ 1 file:line per finding | Count references per finding |
| Cross-phase references | ≥ 3 per document (Waves 3-6) | Count cross-references |
| Severity distribution | ≥ 1 finding per severity level | Count findings per level |
| Discovery board contributions | ≥ 2 per track | Count NDJSON entries per worker |
| Perspective package | Present in every document | Boolean per document |
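The "Code reference density" metric lends itself to an automated check. The sketch below assumes references look like `path/to/file.ext:123`; the regular expression, the function names, and the sample findings (including the file path) are all hypothetical.

```python
import re

# Assumed reference shape: a path with a file extension, a colon, then a line number.
FILE_LINE_REF = re.compile(r"[\w./-]+\.[A-Za-z]\w*:\d+")

def count_references(finding: str) -> int:
    """Count file:line references in one finding's text."""
    return len(FILE_LINE_REF.findall(finding))

def findings_below_target(findings: list[str], minimum: int = 1) -> list[str]:
    """Return the findings that carry fewer than `minimum` file:line references."""
    return [f for f in findings if count_references(f) < minimum]

findings = [
    "Cancellation risk in src/solver/residual.py:118 when the update is small",
    "Accumulation error in the time loop",
]
missing = findings_below_target(findings)
print(missing)
```

Here the second finding would be reported because it names no location. A real check would need to match whatever reference format the reports actually use.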