Signals
Model Instability

Multiple implementations of the same model produce inconsistent results, undermining confidence in reported figures.

Divergence in calculation outputs caused by untracked changes in data, logic, or implementation. The numbers still come out; they just stop being trustworthy.

How it starts

A calculation is originally built by one team and validated against known inputs. Over time, a second team builds their own version because the original is too slow, too rigid, or inaccessible. A third team pulls results from both and averages them. Input data sources shift. A library is upgraded. A rounding convention changes. None of these changes are individually significant. Together, they produce divergence that surfaces as unexplained variance in reports.

What it looks like

Symptoms that indicate model instability is active.

  • Two reports covering the same portfolio show different P&L, risk, or exposure figures.
  • Frequent manual overrides are needed to reconcile outputs before they reach decision-makers.
  • Recalculation of a prior period produces different results without any known change in inputs.
  • Teams lack confidence in reported numbers and build shadow calculations to cross-check.
  • Model changes cannot be traced to specific commits, versions, or approval events.
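When parallel implementations exist, the symptoms above can be caught by an automated cross-check instead of manual overrides or silent averaging. A minimal sketch of such a check, assuming two hypothetical exposure implementations and an illustrative tolerance (none of these names come from any specific system):

```python
def exposure_v1(positions, prices):
    # Original implementation: signed sum of position * price.
    return sum(positions[k] * prices[k] for k in positions)

def exposure_v2(positions, prices):
    # Second team's reimplementation; it should agree with v1 exactly.
    total = 0.0
    for key, qty in positions.items():
        total += qty * prices[key]
    return total

def cross_check(positions, prices, tolerance=1e-9):
    """Run both implementations and surface divergence, rather than
    averaging the two numbers and hiding it."""
    a = exposure_v1(positions, prices)
    b = exposure_v2(positions, prices)
    return {"v1": a, "v2": b, "diverged": abs(a - b) > tolerance}
```

A check like this, run on every report, turns "teams lack confidence in reported numbers" into a concrete, alertable event.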

Why it matters

When reported figures cannot be reproduced or explained, decision-makers lose confidence in the infrastructure. Capital allocation, hedging, and regulatory reporting all depend on numbers that are deterministic and auditable. Model instability erodes this foundation. In a stress event, it becomes impossible to distinguish between a real change in exposure and an artifact of calculation drift.

How we address it

We establish a single, versioned source of truth for calculation logic. Every model has an explicit definition, a version identifier, and a reproducible build. Inputs are captured at execution time so that any result can be replayed and explained. Validation layers compare outputs across versions and flag divergence before it reaches downstream consumers. The goal is not to prevent models from changing; it is to make every change visible, intentional, and traceable.
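One way to make this concrete is to wrap each model run in a record that carries a version identifier, the captured inputs, and an input hash, so any figure can later be replayed. A minimal sketch under those assumptions (the model, version string, and record shape are all illustrative):

```python
import hashlib
import json

MODEL_VERSION = "pnl-attribution/2.3.1"  # explicit, versioned model identity

def attribute_pnl(positions, prices):
    # Illustrative model: per-position P&L = quantity * (close - open).
    return {k: positions[k] * (prices[k]["close"] - prices[k]["open"])
            for k in positions}

def run_and_capture(positions, prices):
    """Run the model and capture everything needed to replay the result."""
    inputs = {"positions": positions, "prices": prices}
    return {
        "model_version": MODEL_VERSION,
        "inputs": inputs,
        # Hash of canonicalized inputs makes silent input drift detectable.
        "input_hash": hashlib.sha256(
            json.dumps(inputs, sort_keys=True).encode()).hexdigest(),
        "outputs": attribute_pnl(positions, prices),
    }

def replay(record):
    """Re-run the captured inputs and report any divergence from the
    stored outputs. An empty dict means the result reproduced exactly."""
    fresh = attribute_pnl(**record["inputs"])
    return {k: (record["outputs"][k], v)
            for k, v in fresh.items() if v != record["outputs"][k]}
```

In a real system the record would be persisted and the replay pinned to the recorded `model_version`; a non-empty replay diff is exactly the "recalculation of a prior period produces different results" symptom, caught mechanically.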

Where we've seen this

We encountered model instability across all three proof-of-work engagements. At BatteryOS, parallel implementations of the same optimization logic produced divergent results. In our ETRM infrastructure work, P&L attribution models had been forked across desks without version control. At Greenflash, calculation logic had drifted between real-time and batch paths. Our Calculations mandate exists because of how consistently this signal appears.