Evaluation & Quality
How senior engineers prove that math-heavy contracts are correct. Fuzz, invariants, differential checks against Python references, fixed-point tolerances, and the lemma that runs through everything: always round in the protocol's favor.
Testing math-heavy contracts
You can't unit-test a CFMM into correctness. There are too many states. The discipline is layered:
- Unit tests for individual helpers (TickMath, FullMath, SqrtPriceMath).
- Property-based fuzz for individual math operations — "for any (a, b, c), mulDiv(a, b, c) equals floor(a·b/c)."
- Stateful invariant tests at the pool level — "after any sequence of swaps and LP actions, K never decreases except via fee mint."
- Differential tests against a Python or Vyper reference.
- Symbolic checks for short, gnarly functions (Halmos / hevm).
- Echidna on the assembled system.
- Mainnet fork tests against real tokens to catch real-world quirks.
Foundry fuzz + invariant testing
Foundry's invariant runner is the gold standard. Two concepts:
- Target selectors — restrict the runner to specific functions on specific contracts. Without this, it tries to call
setOwnerand trivially breaks invariants. - Ghost variables — counters / accumulators maintained by the test handler alongside the system. Lets you assert "the sum of user deposits equals the pool's TVL."
// Handler — restricts what the fuzzer can call and tracks ghost state
contract PoolHandler is Test {
Pool public pool;
uint256 public ghostInTotal;
uint256 public ghostOutTotal;
constructor(Pool _p) { pool = _p; }
function swap0For1(uint256 amountInSeed) external {
uint256 amountIn = bound(amountInSeed, 1, 1e24);
deal(address(token0), msg.sender, amountIn);
token0.approve(address(pool), amountIn);
uint256 out = pool.swap(amountIn, 0, msg.sender);
ghostInTotal += amountIn;
ghostOutTotal += out;
}
}
// Invariant test
contract PoolInvariants is Test {
Pool pool;
PoolHandler handler;
function setUp() public { /* deploy, fund, wire handler */ }
function invariant_K_only_increases() public {
uint256 k = pool.reserve0() * pool.reserve1();
assertGe(k, lastSeenK); // tracked across runs
}
}Foundry config knobs to know:
[invariant]
runs = 1000
depth = 50
fail_on_revert = false
call_override = false
depth is the number of calls per run. Bigger = more thorough = slower. Real DEX codebases run nightly with runs = 100_000, depth = 200.
Differential testing
The single most powerful technique for math contracts: implement the algorithm twice (once in Python, once in Solidity) and assert agreement under fuzz.
function testFuzz_getAmountOut_matchesPython(uint256 in_, uint256 rIn, uint256 rOut) public {
in_ = bound(in_, 1, 1e24);
rIn = bound(rIn, 1e6, 1e30);
rOut = bound(rOut, 1e6, 1e30);
uint256 solOut = library_.getAmountOut(in_, rIn, rOut);
string[] memory cmds = new string[](5);
cmds[0] = "python3";
cmds[1] = "test/refs/get_amount_out.py";
cmds[2] = vm.toString(in_);
cmds[3] = vm.toString(rIn);
cmds[4] = vm.toString(rOut);
bytes memory raw = vm.ffi(cmds);
uint256 pyOut = abi.decode(raw, (uint256));
assertEq(solOut, pyOut); // exact, no eps — CPMM is integer math
}For Newton-based math (StableSwap), you'll need a tolerance — typically 1 wei, or 1 ULP, depending on the operation. Document the tolerance and justify it.
Fixed-point assertion patterns
Assertions in fixed-point math need eps tolerances. Common patterns:
// Absolute tolerance — assert within `eps` wei
function assertApproxAbs(uint256 a, uint256 b, uint256 eps) internal {
uint256 diff = a > b ? a - b : b - a;
assertLe(diff, eps, "approxAbs failed");
}
// Relative tolerance — assert within `bps` basis points
function assertApproxRel(uint256 a, uint256 b, uint256 bps) internal {
if (b == 0) { assertEq(a, 0); return; }
uint256 diff = a > b ? a - b : b - a;
assertLe(diff * 10_000 / b, bps, "approxRel failed");
}
// PRBMath helper — for SD59x18 / UD60x18 values
function assertApproxFixed(UD60x18 a, UD60x18 b, UD60x18 eps) internal {
UD60x18 diff = a.gt(b) ? a.sub(b) : b.sub(a);
assertTrue(diff.lte(eps));
}Foundry's std-assertions ships assertApproxEqAbs and assertApproxEqRel — use those by default; write your own only when the operation needs special tolerance accounting.
Symbolic testing with Halmos
Halmos (a16z/crytic) is a symbolic execution engine that runs your Foundry tests with symbolic inputs instead of concrete fuzz seeds. It either proves your assertion holds for all inputs or produces a counterexample.
When to use:
- Small, gnarly pure functions (sqrt, tick conversions, fee splits).
- Properties you suspect are subtly false at the edges.
- Not for full pool state — too many paths.
halmos --function "check_" --solver-timeout-assertion 60000By convention, Halmos picks up functions prefixed check_. The same code Foundry runs as a fuzz test, Halmos can run as a proof attempt.
Echidna
Trail of Bits' coverage-guided property fuzzer. Two big advantages over Foundry's:
- Coverage-guided — explores new branches more aggressively.
- Multi-actor — first-class support for multiple
msg.senders, which matters for AMM scenarios.
Mature DEX projects often run both — Foundry for development feedback loop, Echidna for overnight property sweeps.
The "always round in protocol favor" lemma
The single most important rule in DEX math. Every division could round up or down by one ULP. The protocol must round the way that can't be exploited.
| Operation | Round which way | Why |
|---|---|---|
| Compute amount out (given amount in) | Down | User gets at most the formula result; never more |
| Compute amount in (given amount out) | Up | User pays at least the formula result; never less |
| Mint LP shares (deposit) | Down | Depositor gets at most the formula's worth of shares |
| Burn LP shares (withdraw) | Down | Withdrawer gets at most the formula's worth of assets |
| Fee accrual to position | Down | Position owner can claim no more than earned |
| Liquidation collateral seize | Up (in favor of the protocol seizing more) | Borrower can't escape with extra collateral |
Test this explicitly:
function invariant_outputRoundedDown() public {
uint256 idealOut = (amountIn * 997 * reserveOut) / ((reserveIn * 1000) + (amountIn * 997));
assertLe(actualOut, idealOut); // actual must NEVER exceed ideal
}Property-based math invariants
The invariants every CFMM should pass:
- K never decreases (or its analogue — D for StableSwap, the v3 invariant within range).
- Swap monotonicity — larger input ⇒ larger or equal output (no negative slippage).
- No-arbitrage — round-tripping (swap A→B then B→A) loses value equal to fees + rounding, never gains.
- Conservation — sum of all LP claims + protocol fee accruals ≤ pool's actual token balances.
- Position fees converge — for a position that hasn't been touched, claimable fees only grow (or stay flat on out-of-range periods).
- Solvency — total LP-redeemable value ≤ pool TVL at all times.
- Tick crossing correctness — fees-owed and feesGrowthInside accounting net to zero across the full price range.
Write an invariant test that does N random swaps in both directions and asserts that the second leg of an A→B→A round-trip produces strictly less than the original input. That single invariant catches most fee-on-transfer integration bugs.