Entropyk/_bmad-output/implementation-artifacts/4-1-solver-trait-abstraction.md

292 lines
13 KiB
Markdown

# Story 4.1: Solver Trait Abstraction
Status: done
<!-- Note: Validation is optional. Run validate-create-story for quality check before dev-story. -->
## Story
As a numerical developer,
I want a generic Solver trait,
so that strategies are interchangeable.
## Acceptance Criteria
1. **Solver Trait Defined** (AC: #1)
- Given a system of equations represented by `System`
- When implementing a solver strategy
- Then it must implement the common `Solver` trait
- And the trait provides `solve()` and `with_timeout()` methods
- And the trait is object-safe for dynamic dispatch
2. **Zero-Cost Abstraction via Enum Dispatch** (AC: #2)
- Given multiple solver strategies (Newton-Raphson, Sequential Substitution)
- When selecting a strategy at runtime
- Then an enum `SolverStrategy` dispatches to the correct implementation
- And there is no vtable overhead (monomorphization via enum)
- And the pattern matches the architecture decision for static polymorphism
3. **Timeout Support** (AC: #3)
- Given a solver with a configured timeout
- When the solver exceeds the time budget
- Then it stops immediately and returns `SolverError::Timeout`
- And the timeout is configurable via `with_timeout(Duration)`
4. **Error Handling** (AC: #4)
- Given a solver that fails to converge
- When checking the result
- Then it returns `SolverError::NonConvergence` with iteration count and final residual
- And all error variants are documented and follow the `thiserror` pattern
## Tasks / Subtasks
- [x] Define `Solver` trait in `crates/solver/src/solver.rs` (AC: #1)
- [x] Create new module `solver.rs` with `Solver` trait
- [x] Define `solve(&mut self, system: &mut System) -> Result<ConvergedState, SolverError>`
- [x] Define `with_timeout(self, timeout: Duration) -> Self` (builder pattern)
- [x] Ensure trait is object-safe (no generic methods, no `Self` in return types)
- [x] Add rustdoc with KaTeX equations for convergence criteria
- [x] Define `SolverError` enum (AC: #4)
- [x] Add `NonConvergence { iterations: usize, final_residual: f64 }`
- [x] Add `Timeout { timeout_ms: u64 }`
- [x] Add `Divergence { reason: String }`
- [x] Add `InvalidSystem { message: String }`
- [x] Use `thiserror::Error` derive
- [x] Define `ConvergedState` struct (AC: #1)
- [x] Store final state vector `Vec<f64>`
- [x] Store iteration count `usize`
- [x] Store final residual norm `f64`
- [x] Store convergence status `ConvergenceStatus` enum
- [x] Define `SolverStrategy` enum (AC: #2)
- [x] `NewtonRaphson(NewtonConfig)` variant
- [x] `SequentialSubstitution(PicardConfig)` variant
- [x] Implement `Solver` for `SolverStrategy` with match dispatch
- [x] Add `Default` impl returning Newton-Raphson
- [x] Define configuration structs (AC: #2)
- [x] `NewtonConfig` with max_iterations, tolerance, line_search flag
- [x] `PicardConfig` with max_iterations, tolerance, relaxation_factor
- [x] Both implement `Default` with sensible defaults (100 iterations, 1e-6 tolerance)
- [x] Add timeout infrastructure (AC: #3)
- [x] Add `timeout: Option<Duration>` to config structs
- [x] Add `with_timeout()` builder method
- [x] Note: Actual timeout enforcement will be in Story 4.2/4.3; this story only defines the API
- [x] Update `crates/solver/src/lib.rs` (AC: #1)
- [x] Add `pub mod solver;`
- [x] Re-export `Solver`, `SolverError`, `SolverStrategy`, `ConvergedState`
- [x] Re-export `NewtonConfig`, `PicardConfig`
- [x] Tests (AC: #1, #2, #3, #4)
- [x] Test `Solver` trait object safety (`Box<dyn Solver>` compiles)
- [x] Test `SolverStrategy::default()` returns Newton-Raphson
- [x] Test `with_timeout()` returns modified config
- [x] Test error variants have correct Display messages
- [x] Test `ConvergedState` fields are accessible
## Dev Notes
### Epic Context
**Epic 4: Intelligent Solver Engine** — Solve any system with < 1s guarantee, Newton-Raphson Sequential Substitution fallback.
**Story Dependencies:**
- Epic 1 (Component trait) done; `Component` trait provides `compute_residuals`, `jacobian_entries`
- Epic 2 (Fluid properties) done; fluid backends available
- Epic 3 (System topology) done; `System` struct with graph, state vector, residual/Jacobian assembly
- Story 4.2 (Newton-Raphson) will implement `NewtonRaphson` solver
- Story 4.3 (Sequential Substitution) will implement `SequentialSubstitution` solver
- Story 4.4 (Intelligent Fallback) will use `SolverStrategy` enum for auto-switching
**FRs covered:** FR14 (Newton-Raphson), FR15 (Sequential Substitution), FR17 (timeout), FR18 (best state on timeout)
### Architecture Context
**Technical Stack:**
- Rust, `thiserror` for error handling, `tracing` for observability
- No new external crates; use `std::time::Duration` for timeout
- `nalgebra` will be used in Story 4.2 for linear algebra (not this story)
**Code Structure:**
- `crates/solver/src/solver.rs` new file for `Solver` trait, `SolverError`, `SolverStrategy`, configs
- `crates/solver/src/lib.rs` re-exports
- `crates/solver/src/system.rs` existing `System` struct (no changes needed)
**Relevant Architecture Decisions:**
- **Solver Architecture:** Trait-based static polymorphism with enum dispatch [Source: architecture.md]
- **Zero-cost abstraction:** Enum dispatch avoids vtable overhead while allowing runtime selection
- **Error Handling:** Centralized error enum with `thiserror` [Source: architecture.md]
- **No panic policy:** All errors return `Result<T, SolverError>`
### Developer Context
**Existing Implementation:**
- **System struct** (`crates/solver/src/system.rs`):
- `compute_residuals(&self, state: &StateSlice, residuals: &mut ResidualVector)`
- `assemble_jacobian(&self, state: &StateSlice, jacobian: &mut JacobianBuilder)`
- `state_vector_len()`, `edge_count()`, `node_count()`
- `finalize()` must be called before solving
- **Component trait** (`crates/components/src/lib.rs`):
- `compute_residuals`, `jacobian_entries`, `n_equations`, `get_ports`
- Object-safe, used via `Box<dyn Component>`
- **JacobianBuilder** (`crates/components/src/lib.rs`):
- `add_entry(row, col, value)`, `entries()`, `clear()`
- **TopologyError** (`crates/solver/src/error.rs`):
- Pattern for error enum with `thiserror`
**Design Decisions:**
1. **Trait vs Enum:** The architecture specifies enum dispatch for zero-cost abstraction. The `Solver` trait defines the interface, and `SolverStrategy` enum provides the dispatch mechanism. Both are needed.
2. **Object Safety:** The `Solver` trait must be object-safe to allow `Box<dyn Solver>` for advanced use cases (e.g., user-provided custom solvers). This means:
- No generic methods
- No `Self` in return types
- No methods that take `self` by value (use `&self` or `&mut self`)
3. **Timeout API:** This story defines the `with_timeout()` API. Actual enforcement requires `std::time::Instant` checks in the solve loop (Story 4.2/4.3).
4. **ConvergedState vs SystemState:** `ConvergedState` is the result type returned by solvers, containing metadata. `SystemState` (alias for `Vec<f64>`) is the state vector used during solving.
### Technical Requirements
**Solver Trait:**
```rust
pub trait Solver {
/// Solve the system of equations.
///
/// # Errors
///
/// Returns `SolverError::NonConvergence` if max iterations exceeded.
/// Returns `SolverError::Timeout` if time budget exceeded.
fn solve(&mut self, system: &mut System) -> Result<ConvergedState, SolverError>;
/// Set a timeout for the solver.
///
/// If the solver exceeds this duration, it returns `SolverError::Timeout`.
fn with_timeout(self, timeout: Duration) -> Self;
}
```
**SolverStrategy Enum:**
```rust
pub enum SolverStrategy {
NewtonRaphson(NewtonConfig),
SequentialSubstitution(PicardConfig),
}
impl Solver for SolverStrategy {
fn solve(&mut self, system: &mut System) -> Result<ConvergedState, SolverError> {
match self {
Self::NewtonRaphson(cfg) => cfg.solve(system), // Story 4.2
Self::SequentialSubstitution(cfg) => cfg.solve(system), // Story 4.3
}
}
// ...
}
```
**Error Handling:**
- All errors use `thiserror::Error` derive
- Error messages are clear and actionable
- No panics in solver code paths
### Architecture Compliance
- **NewType pattern:** Use `Pressure`, `Temperature` from core where applicable (not directly in Solver trait, but in convergence criteria)
- **No bare f64** in public API where physical meaning exists
- **tracing:** Add `tracing::info!` for solver start/end, `tracing::debug!` for iterations
- **Result<T, E>:** All fallible operations return `Result`
- **approx:** Use for convergence checks in implementations (Story 4.2/4.3)
### Library/Framework Requirements
- **std::time::Duration** — timeout configuration
- **thiserror** — error enum derive (already in solver Cargo.toml)
- **tracing** — structured logging (already used in solver)
### File Structure Requirements
**New files:**
- `crates/solver/src/solver.rs` — Solver trait, SolverError, SolverStrategy, configs, ConvergedState
**Modified files:**
- `crates/solver/src/lib.rs` — add `pub mod solver;` and re-exports
**Tests:**
- Unit tests in `solver.rs` module (trait object safety, enum dispatch, error messages)
- Integration tests will be in Story 4.2/4.3 when actual solvers are implemented
### Testing Requirements
- **Trait object safety:** `let solver: Box<dyn Solver> = Box::new(NewtonConfig::default());` compiles
- **Enum dispatch:** `SolverStrategy::default().solve(&mut system)` dispatches correctly
- **Error Display:** All error variants have meaningful messages
- **Timeout builder:** `config.with_timeout(Duration::from_millis(100))` returns modified config
- **Default configs:** `NewtonConfig::default()` and `PicardConfig::default()` provide sensible values
### Previous Story Intelligence (3.5)
- Zero-flow regularization uses `MIN_MASS_FLOW_REGULARIZATION_KG_S` from core
- Components handle `OperationalState::Off` with zero mass flow
- Solver must handle systems with Off components (residuals/Jacobian already finite)
- Test pattern: `assert!(residuals.iter().all(|r| r.is_finite()))`
### Git Intelligence
Recent commits show:
- Epic 3 completion (multi-circuit, thermal coupling, zero-flow)
- Component framework mature (Compressor, HeatExchanger, Pipe, etc.)
- Fluid backends ready (CoolProp, Tabular, Cache)
- Ready for solver implementation
### Project Context Reference
- **FR14:** [Source: epics.md — System can solve equations using Newton-Raphson method]
- **FR15:** [Source: epics.md — System can solve equations using Sequential Substitution (Picard) method]
- **FR17:** [Source: epics.md — Solver respects configurable time budget (timeout)]
- **FR18:** [Source: epics.md — On timeout, solver returns best known state with NonConverged status]
- **Solver Architecture:** [Source: architecture.md — Trait-based static polymorphism with enum dispatch]
- **Error Handling:** [Source: architecture.md — Centralized error enum with thiserror]
- **Component Trait:** [Source: crates/components/src/lib.rs — Object-safe trait pattern]
### Story Completion Status
- **Status:** review
- **Completion note:** Story context created. Solver trait, SolverError, SolverStrategy enum, and config structs implemented. Actual solver algorithms in Stories 4.2 and 4.3.
## Change Log
- 2026-02-18: Story 4.1 created from create-story workflow. Epic 4 kickoff. Ready for dev.
- 2026-02-18: Story 4.1 implemented. All tasks complete. 74 tests pass (63 unit + 4 integration + 7 doc-tests). Status → review.
- 2026-02-18: Code review completed. Fixed: (1) doc-test `rust,ignore``rust,no_run` (now compiles), (2) added 3 dispatch tests for `SolverStrategy::solve()`, (3) fixed broken doc link in error.rs. 78 tests pass (66 unit + 4 integration + 8 doc-tests). Status → done.
## Dev Agent Record
### Agent Model Used
claude-sonnet-4-5 (via Cline)
### Debug Log References
- Doc-test failures on first run: `with_timeout` not in scope in doc examples. Fixed by adding `use entropyk_solver::solver::Solver;` to both doc examples.
### Completion Notes List
- ✅ Created `crates/solver/src/solver.rs` with `Solver` trait, `SolverError` (4 variants), `ConvergedState`, `ConvergenceStatus`, `SolverStrategy` enum, `NewtonConfig`, `PicardConfig`
-`Solver` trait is object-safe: `solve(&mut self, system: &mut System)` uses `&mut self`; `with_timeout` is `where Self: Sized` so it is excluded from the vtable
-`SolverStrategy` enum provides zero-cost static dispatch via `match` (no vtable)
-`SolverStrategy::default()` returns `NewtonRaphson(NewtonConfig::default())`
-`with_timeout()` builder pattern implemented on all three types (`NewtonConfig`, `PicardConfig`, `SolverStrategy`)
-`SolverError` uses `thiserror::Error` derive with 4 variants: `NonConvergence`, `Timeout`, `Divergence`, `InvalidSystem`
-`NewtonConfig` and `PicardConfig` stubs return `SolverError::InvalidSystem` (full implementation in Stories 4.2/4.3)
-`tracing::info!` added to solver dispatch and stub implementations
- ✅ Updated `crates/solver/src/lib.rs` with `pub mod solver;` and all re-exports
- ✅ 19 unit tests covering all ACs; 74 total tests pass with 0 regressions
- ✅ Code review: 3 issues fixed (doc-test, dispatch tests, doc link); 78 tests now pass
### File List
- `crates/solver/src/solver.rs` (new)
- `crates/solver/src/lib.rs` (modified)
- `crates/solver/src/error.rs` (modified — doc link fix)