Section B · Technical Core

Error Handling & Failure Modes

A taxonomy of the bugs that kill lending protocols. For each: how it looks, how to detect, how to design it out.

A taxonomy of lending bugs

The post-mortems for lending protocols across the last few cycles cluster into a recognizable shape. A senior engineer treats this list as a reflex — every PR is read through it.

| Class | Symptom | Severity |
| --- | --- | --- |
| Bad-debt accrual | Protocol becomes insolvent silently | Critical |
| Share inflation (donation) | First depositor steals from later depositors | High |
| Rounding in user favor | Small per-tx drain, eventually catastrophic | High |
| Oracle staleness | Stale price enables free borrow / wrong liquidation | Critical |
| Reentrancy in callbacks | State invariant violated during external call | Critical |
| MEV / frontrun on liquidation | Liquidators race to extract; protocol still ok but bad UX | Medium |
| IRM extremes | Rate overflow / underflow at U near 0 or 100% | High |
| Governance attack | Voter manipulation; timelock-bypass; flash governance | Critical |
| ERC-20 quirks | USDT (no return value), fee-on-transfer, rebasing | High |
| Read-only reentrancy | Stale view fools an external integrator | High |

Bad-debt accrual

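What it looks like: a position falls underwater faster than liquidators can act. Once debt exceeds collateral × LLTV the position is liquidatable; once debt exceeds the collateral's full value, no liquidation can make the protocol whole. The protocol now has more borrowed than backed.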

Causes:

  • Oracle price gap (a discrete jump exceeds the safety margin between max borrow LTV and LLTV).
  • Collateral becomes illiquid (no one will repay debt to receive seizable collateral).
  • Liquidator MEV / gas spike makes liquidation unprofitable.
  • Oracle outage prevents liquidation.

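Detection: off-chain monitoring of every position's health; an invariant test asserting debt ≤ collateral value for each position, not only sum-of-debts ≤ sum-of-collateral-values (an aggregate can hide individual underwater positions).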

Design-out:

  • Choose a max borrow LTV strictly less than LLTV — the gap is the safety margin for price moves between updates.
  • Add pre-liquidations to start closing positions before LLTV.
  • Socialize bad debt immediately when collateral hits 0 (don't let it accumulate silently).
  • For high-vol collaterals, prefer Dutch-auction liquidations that adapt the incentive.
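
The difference between a position that is merely liquidatable and one that carries bad debt can be sketched as follows. This is a minimal illustration under stated assumptions, not any protocol's actual code: `HealthLib`, `WAD`, and both function names are hypothetical, all values are assumed to be denominated in the loan asset, and LLTV is assumed to be scaled by 1e18.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical helper; all values are denominated in the loan asset.
library HealthLib {
    uint256 internal constant WAD = 1e18; // assumed scale for LLTV (1e18 = 100%)

    /// A position is healthy while debt stays within collateralValue * lltv.
    function isHealthy(uint256 debt, uint256 collateralValue, uint256 lltv)
        internal
        pure
        returns (bool)
    {
        return debt <= (collateralValue * lltv) / WAD;
    }

    /// Bad debt is the part of the debt that seizing all collateral cannot cover.
    function badDebt(uint256 debt, uint256 collateralValue)
        internal
        pure
        returns (uint256)
    {
        return debt > collateralValue ? debt - collateralValue : 0;
    }
}
```

With an LLTV of 0.8e18 and collateral worth 100, a debt of 85 is unhealthy but carries no bad debt, while a debt of 110 carries 10 of bad debt. An invariant test could assert that `badDebt` is zero for every open position after each fuzzed action.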

Share-inflation / the "donation" attack

A famous ERC-4626 / Compound v2 class of bug.

What it looks like:

  1. Vault is empty (totalAssets = 0, totalShares = 0).
  2. Attacker deposits 1 wei. Mints 1 share.
  3. Attacker transfers (donates) 1e18 of the underlying directly to the vault. totalAssets is now 1e18 + 1, totalShares is still 1.
  4. Victim deposits 1.5e18. With shares = assets × totalShares / totalAssets, victim gets 1.5e18 × 1 / (1e18 + 1) = 1 share (rounded down).
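  5. Attacker and victim now hold 1 share each, so each share is worth (2.5e18 + 1) / 2 ≈ 1.25e18. The attacker redeems their share for ~1.25e18 after spending ~1e18: a profit of ~0.25e18, taken directly from the victim, whose 1.5e18 deposit is now worth ~1.25e18. A donation larger than the victim's deposit rounds the victim's shares down to 0 and takes the entire deposit.
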
// Mitigation 1: virtual shares (OpenZeppelin v5)
function _convertToShares(uint256 assets, Math.Rounding rounding) internal view returns (uint256) {
    return assets.mulDiv(totalSupply() + 10 ** _decimalsOffset(), totalAssets() + 1, rounding);
}

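// Mitigation 2: dead shares. Lock a chunk of shares at a burn address on the first deposit.
// DEAD_ADDRESS is an illustrative burn address (e.g. address(0xdead));
// OpenZeppelin's ERC20 _mint reverts when the receiver is address(0).
function _deposit(uint256 assets, address receiver) internal {
    uint256 shares = previewDeposit(assets);
    if (totalSupply() == 0) {
        // The first deposit must be large enough to cover the locked shares
        require(shares > DEAD_SHARES, "first deposit too small");
        // Send DEAD_SHARES to the burn address, the rest to receiver
        _mint(DEAD_ADDRESS, DEAD_SHARES);
        shares -= DEAD_SHARES;
    }
    _mint(receiver, shares);
}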

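// Mitigation 3 (Morpho Blue): virtual shares. Shares are 1e6× the assets at parity.
// Morpho Blue's SharesMathLib names these VIRTUAL_SHARES (1e6) and VIRTUAL_ASSETS (1);
// SHARES_OFFSET is used here as shorthand.
uint256 public constant SHARES_OFFSET = 1e6;
// First supply: 1 wei assets => 1e6 shares. Donating 1 wei doesn't move the ratio meaningfully.
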
Senior reflex

Any time you see shares = assets × totalShares / totalAssets, you should automatically check: what happens on first deposit? What if totalAssets is donated? If your mental model doesn't immediately give you the answer, you have a bug.
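
To make the reflex concrete, the naive conversion and an offset-based one can be put side by side. This is a sketch only: `ShareMathSketch` and its function names are hypothetical, and the constants 1e6 and 1 are assumed values in the spirit of the virtual-shares mitigations above.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical library contrasting the two conversions; not taken from any specific vault.
library ShareMathSketch {
    uint256 internal constant VIRTUAL_SHARES = 1e6; // assumed offset
    uint256 internal constant VIRTUAL_ASSETS = 1;   // assumed offset

    /// Naive conversion: an empty vault mints 1:1, otherwise pro rata, rounded down.
    function naiveShares(uint256 assets, uint256 totalShares, uint256 totalAssets)
        internal
        pure
        returns (uint256)
    {
        if (totalShares == 0) return assets;
        return (assets * totalShares) / totalAssets;
    }

    /// Offset conversion: virtual shares and assets are added to both totals,
    /// so the vault never behaves as if it were empty.
    function offsetShares(uint256 assets, uint256 totalShares, uint256 totalAssets)
        internal
        pure
        returns (uint256)
    {
        return (assets * (totalShares + VIRTUAL_SHARES)) / (totalAssets + VIRTUAL_ASSETS);
    }
}
```

With the walkthrough's numbers (totalShares = 1, totalAssets = 1e18 + 1, deposit of 1.5e18), `naiveShares` returns 1. Under the offset scheme the attacker's 1 wei would have minted 1e6 shares, so the same deposit mints roughly 3e6 shares and the victim keeps almost all of their value, while a sizeable part of the donation accrues to the virtual shares and cannot be recovered by the attacker.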

Rounding direction

Every share / asset conversion rounds some way. The rule is: always in the protocol's favor.

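| Operation | Conversion | Round shares | Round assets |
| --- | --- | --- | --- |
| Deposit | assets→shares | Down | - |
| Withdraw | assets→shares | Up | - |
| Mint | shares→assets | - | Up |
| Redeem | shares→assets | - | Down |
| Borrow | assets→shares | Up | - |
| Repay | shares→assets | - | Up |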

Mnemonic: "users pay up to one wei more and receive up to one wei less than the exact amount; the protocol keeps the difference." Because every rounding error lands on the protocol's side, no sequence of conversions can extract value through rounding alone.
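
A minimal sketch of the four vault conversions, following the ERC-4626 rounding rules, might look like this. `RoundingSketch` and its function names are hypothetical, the totals are assumed to be non-zero, and overflow of `x * y` simply reverts under Solidity 0.8 checked arithmetic (a production library would use a full-precision mulDiv).

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical helpers; assumes totalShares and totalAssets are non-zero.
library RoundingSketch {
    function mulDivDown(uint256 x, uint256 y, uint256 d) internal pure returns (uint256) {
        return (x * y) / d;
    }

    function mulDivUp(uint256 x, uint256 y, uint256 d) internal pure returns (uint256) {
        return (x * y + (d - 1)) / d;
    }

    /// Deposit: shares minted to the user, rounded down.
    function sharesForDeposit(uint256 assets, uint256 totalShares, uint256 totalAssets)
        internal pure returns (uint256)
    {
        return mulDivDown(assets, totalShares, totalAssets);
    }

    /// Mint: assets the user must pay for a fixed number of shares, rounded up.
    function assetsForMint(uint256 shares, uint256 totalShares, uint256 totalAssets)
        internal pure returns (uint256)
    {
        return mulDivUp(shares, totalAssets, totalShares);
    }

    /// Withdraw: shares burned for a fixed amount of assets, rounded up.
    function sharesForWithdraw(uint256 assets, uint256 totalShares, uint256 totalAssets)
        internal pure returns (uint256)
    {
        return mulDivUp(assets, totalShares, totalAssets);
    }

    /// Redeem: assets paid out for a fixed number of shares, rounded down.
    function assetsForRedeem(uint256 shares, uint256 totalShares, uint256 totalAssets)
        internal pure returns (uint256)
    {
        return mulDivDown(shares, totalAssets, totalShares);
    }
}
```

Naming each helper after the operation rather than the direction makes it harder to pick the wrong rounding at a call site.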

Oracle staleness

Covered in depth in chapter 05; key bug class summary:

  • No staleness check: oracle returns a value but it's hours old. Borrower opens an undercollateralized position. Liquidatable when oracle updates, but protocol already lost.
  • Checking only updatedAt, not answer: some feeds return 0 on error. Always check answer > 0 and block.timestamp - updatedAt <= MAX_STALENESS.
  • Ignoring L2 sequencer: on L2, sequencer outages stall the price feed. Use Chainlink's L2 Sequencer Uptime feed and apply a grace period.
  • Ignoring confidence intervals: on Pyth, a wide conf means the publisher network is uncertain about the price. Reject prices with conf / price > threshold.
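
The Chainlink-related checks above can be sketched as a single guarded read. This is an illustration under assumptions, not a drop-in oracle: `GuardedOracleSketch`, the error names, and the one-hour values for `MAX_STALENESS` and `GRACE_PERIOD` are hypothetical, and the interface is a trimmed copy of Chainlink's `AggregatorV3Interface`.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Trimmed copy of Chainlink's AggregatorV3Interface (only the function used here).
interface IAggregatorV3 {
    function latestRoundData()
        external
        view
        returns (uint80 roundId, int256 answer, uint256 startedAt, uint256 updatedAt, uint80 answeredInRound);
}

contract GuardedOracleSketch {
    error SequencerDown();
    error GracePeriodNotOver(uint256 upSince);
    error NonPositivePrice(int256 answer);
    error StalePrice(uint256 updatedAt);

    IAggregatorV3 public immutable priceFeed;
    IAggregatorV3 public immutable sequencerUptimeFeed; // only meaningful on L2
    uint256 public constant MAX_STALENESS = 1 hours;    // assumed; tune to the feed's heartbeat
    uint256 public constant GRACE_PERIOD = 1 hours;     // assumed

    constructor(IAggregatorV3 priceFeed_, IAggregatorV3 sequencerUptimeFeed_) {
        priceFeed = priceFeed_;
        sequencerUptimeFeed = sequencerUptimeFeed_;
    }

    function price() external view returns (uint256) {
        // Uptime feed: answer 0 = sequencer up, 1 = down; startedAt = time of the last status change.
        (, int256 status, uint256 upSince,,) = sequencerUptimeFeed.latestRoundData();
        if (status != 0) revert SequencerDown();
        if (block.timestamp - upSince <= GRACE_PERIOD) revert GracePeriodNotOver(upSince);

        (, int256 answer,, uint256 updatedAt,) = priceFeed.latestRoundData();
        if (answer <= 0) revert NonPositivePrice(answer);
        if (block.timestamp - updatedAt > MAX_STALENESS) revert StalePrice(updatedAt);
        return uint256(answer);
    }
}
```

Reverting is the conservative choice for borrows; whether liquidations should also revert on a stale price is a separate design decision, since blocking them is itself a cause of bad debt.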

Reentrancy in callbacks

Most modern lending bugs in this class are not classic reentrancy — those are caught by even basic review. The subtler ones:

  • Read-only reentrancy. Function A is mid-execution; it calls out to user X; user X calls into another protocol, which reads this contract's state through a view getter. The getter returns a half-updated state that no reentrancy modifier protects.
  • Cross-function reentrancy. Function A makes an external call. The recipient calls function B, which observes A's not-yet-committed state.
  • Callback-with-invariant-after. The pattern is sound when invariants are checked after the callback. The bug is when one branch (e.g., a fallback path) skips the post-callback invariant check.
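
// VULNERABLE: pricePerShare() can be read mid-update during the callback.
// Assumes OpenZeppelin's IERC20, SafeERC20 and ReentrancyGuard are imported; ICallback is illustrative.
contract Vault is ReentrancyGuard {
    using SafeERC20 for IERC20;

    IERC20 public immutable asset;
    uint256 public totalAssets;
    uint256 public totalShares;
    mapping(address => uint256) public sharesOf;

    constructor(IERC20 asset_) { asset = asset_; }

    function withdraw(uint256 amount, bytes calldata data) external nonReentrant {
        // Shares to burn, rounded up in the protocol's favor
        uint256 shares = (amount * totalShares + totalAssets - 1) / totalAssets;
        totalAssets -= amount;             // assets side committed before the call
        asset.safeTransfer(msg.sender, amount);
        if (data.length != 0) ICallback(msg.sender).onWithdraw(data);
        // BUG: shares are burned only after the callback, so during onWithdraw
        // totalAssets is already reduced while totalShares is not.
        sharesOf[msg.sender] -= shares;
        totalShares -= shares;
    }

    // Not covered by nonReentrant: an integrator that calls this from inside
    // onWithdraw sees an artificially low price.
    function pricePerShare() external view returns (uint256) {
        return totalAssets * 1e18 / totalShares;
    }
}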

The senior fix: any view function that other protocols might read as an oracle should either (a) revert during the callback window via a transient lock, or (b) be guaranteed-consistent.
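
Option (a) can be sketched with a lock that view functions also consult. This is a minimal illustration: `ViewGuard`, `nonReentrantView`, and the error names are hypothetical, and the lock uses ordinary storage for portability. On chains and compiler versions that support transient storage, the same flag could live there instead.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical guard: state-changing functions take the lock,
/// oracle-like views refuse to answer while it is held.
abstract contract ViewGuard {
    uint256 private constant _NOT_ENTERED = 1;
    uint256 private constant _ENTERED = 2;
    uint256 private _status = _NOT_ENTERED;

    error ReentrantCall();
    error ReadDuringUpdate();

    modifier nonReentrant() {
        if (_status == _ENTERED) revert ReentrantCall();
        _status = _ENTERED;
        _;
        _status = _NOT_ENTERED;
    }

    /// Apply to any view that other protocols may treat as a price source.
    modifier nonReentrantView() {
        if (_status == _ENTERED) revert ReadDuringUpdate();
        _;
    }
}
```

A vault inheriting this would declare `function pricePerShare() external view nonReentrantView returns (uint256)`, so an integrator reading the price from inside a callback gets a revert instead of a half-updated value.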

MEV & frontrunning of liquidations

Not strictly a bug — but a systemic design concern. When a position becomes liquidatable, every searcher in the mempool races. The most common effects:

  • The liquidation incentive is competed away in priority fees; borrowers get punished, but searchers and validators capture the rents.
  • Sandwich attacks on the price update that makes a position liquidatable.
  • "Just-in-time" liquidity around oracles to maximize an arbitrage on a stale price.

Mitigations:

  • Dutch auctions price-discover the liquidation incentive (LI): the incentive starts low and rises, so the winning searcher takes the smallest incentive that makes the liquidation worthwhile.
  • Pre-liquidations move part of the close to a curated keeper, off the public mempool.
  • Use commit-reveal or batch-auction primitives for price-sensitive transitions.
  • For borrowers, offer opt-in pre-liquidations with a smaller, predictable haircut.
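
One way to picture the Dutch-auction mitigation is an incentive that ramps up with the time a position has been liquidatable. The sketch below assumes a linear ramp; `DutchIncentiveSketch`, its parameters, and the linear shape are all illustrative choices rather than a description of any deployed protocol.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical linear Dutch auction for the liquidation incentive.
library DutchIncentiveSketch {
    /// @param liquidatableSince timestamp at which the position became liquidatable
    /// @param nowTs current timestamp (block.timestamp at the call site)
    /// @param duration seconds over which the incentive ramps to its maximum
    /// @param maxIncentive incentive cap, e.g. 0.10e18 for 10% in 1e18 scale
    function incentive(
        uint256 liquidatableSince,
        uint256 nowTs,
        uint256 duration,
        uint256 maxIncentive
    ) internal pure returns (uint256) {
        if (nowTs <= liquidatableSince) return 0;
        uint256 elapsed = nowTs - liquidatableSince;
        if (elapsed >= duration) return maxIncentive; // also covers duration == 0
        return (maxIncentive * elapsed) / duration;
    }
}
```

With a 10% cap and a 600-second ramp, a liquidation 150 seconds in pays a 2.5% incentive. The first searcher for whom that covers gas and risk wins, and the rest of the incentive stays with the borrower.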

IRM math at extreme utilization

Common failure modes:

  • U = 100%: a divide-by-zero if your formula divides by 1 - U. Cap U or guard the divide.
  • U > 100% (briefly during accrual): some accumulator updates can temporarily push U over 1 if borrow grows faster than supply. The IRM must not revert or produce nonsense in that window.
  • Rate overflow. An adaptive IRM that ratchets up unboundedly will eventually overflow. Cap MAX_BORROW_RATE and clamp.
  • Rate underflow / zero rate. Some PID controllers can drive rate to zero or negative if utilization is very low for very long. Floor at MIN_BORROW_RATE.
  • Compounding precision. Discrete compounding over short intervals with very high rates can produce noticeable rounding drift. Use Taylor expansions or rpow-style integer compounding with care.
// rpow — exponentiate a per-second rate over dt seconds using exponentiation-by-squaring
// MakerDAO-style. Be very careful with overflow on intermediates.
function rpow(uint256 x, uint256 n, uint256 base) internal pure returns (uint256 z) {
    assembly {
        switch x case 0 { switch n case 0 { z := base } default { z := 0 } }
        default {
            switch mod(n, 2) case 0 { z := base } default { z := x }
            let half := div(base, 2)
            for { n := div(n, 2) } n { n := div(n, 2) } {
                let xx := mul(x, x)
                if iszero(eq(div(xx, x), x)) { revert(0, 0) }
                let xxRound := add(xx, half)
                if lt(xxRound, xx) { revert(0, 0) }
                x := div(xxRound, base)
                if mod(n, 2) {
                    let zx := mul(z, x)
                    if and(iszero(iszero(x)), iszero(eq(div(zx, x), z))) { revert(0, 0) }
                    let zxRound := add(zx, half)
                    if lt(zxRound, zx) { revert(0, 0) }
                    z := div(zxRound, base)
                }
            }
        }
    }
}
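
The utilization and rate guards from the list above can be sketched as two small helpers. Everything here is an assumption for illustration: `IrmGuardsSketch`, the per-second 1e18-scaled rate convention, and the concrete bounds (roughly 1% and 100% APR, non-compounded) are hypothetical values, not recommendations.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical guards for an interest-rate model; rates are per-second, scaled by 1e18.
library IrmGuardsSketch {
    uint256 internal constant WAD = 1e18;
    uint256 internal constant MIN_BORROW_RATE = 317_097_919;    // ~1% APR, assumed floor
    uint256 internal constant MAX_BORROW_RATE = 31_709_791_983; // ~100% APR, assumed cap

    /// Utilization in [0, WAD]: guards the empty market and caps transient U > 100%.
    function utilization(uint256 totalBorrow, uint256 totalSupply)
        internal
        pure
        returns (uint256 u)
    {
        if (totalSupply == 0) return totalBorrow == 0 ? 0 : WAD;
        u = (totalBorrow * WAD) / totalSupply;
        if (u > WAD) u = WAD;
    }

    /// Clamp whatever the curve or controller produced into the allowed band.
    function clampRate(uint256 rate) internal pure returns (uint256) {
        if (rate < MIN_BORROW_RATE) return MIN_BORROW_RATE;
        if (rate > MAX_BORROW_RATE) return MAX_BORROW_RATE;
        return rate;
    }
}
```

A formula that divides by 1 - U would additionally need its own guard, for example capping U slightly below WAD before the division.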

Governance attacks

Five flavors:

  • Flash governance. Attacker borrows governance tokens, votes, executes a malicious proposal in one tx. Mitigation: snapshot at a past block, not at execution time.
  • Timelock bypass. A proposal that calls a privileged function which is not behind the timelock. Mitigation: every privileged function is behind the same timelock.
  • Voter bribery. Bribe pools (e.g., Hidden Hand-style markets) let attackers rent voting power. Mitigation: long lock-ups, vote-escrowed tokens, or move risk decisions out of governance.
  • Multisig compromise. The "guardian" or "owner" multisig is the highest-value target. Mitigation: hardware-backed signers, geographic separation, social-recovery.
  • Parameter-griefing. A proposal sets LLTV / IRM to malicious values. Mitigation: parameter allow-lists; immutable per-market parameters.
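
The flash-governance mitigation, counting votes as of a past block, can be sketched as follows. `SnapshotVotingSketch` is a hypothetical toy, not a governor implementation; it assumes a token exposing `getPastVotes(account, timepoint)` in the style of OpenZeppelin's `IVotes` with a block-number clock.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Minimal interface assumed to match the getPastVotes function of OpenZeppelin's IVotes.
interface IVotesLike {
    function getPastVotes(address account, uint256 timepoint) external view returns (uint256);
}

/// Toy example: voting weight is read at a block fixed before voting starts.
contract SnapshotVotingSketch {
    error UnknownProposal(uint256 id);
    error AlreadyVoted(uint256 id, address voter);

    struct Proposal {
        uint256 snapshotBlock;
        uint256 forVotes;
    }

    IVotesLike public immutable token;
    uint256 public proposalCount;
    mapping(uint256 => Proposal) public proposals;
    mapping(uint256 => mapping(address => bool)) public hasVoted;

    constructor(IVotesLike token_) {
        token = token_;
    }

    function propose() external returns (uint256 id) {
        id = ++proposalCount;
        // Weight is fixed at a block that is already final when the proposal is created.
        proposals[id].snapshotBlock = block.number - 1;
    }

    function vote(uint256 id) external {
        Proposal storage p = proposals[id];
        if (p.snapshotBlock == 0) revert UnknownProposal(id);
        if (hasVoted[id][msg.sender]) revert AlreadyVoted(id, msg.sender);
        hasVoted[id][msg.sender] = true;
        // Tokens borrowed in the current transaction had no balance at snapshotBlock.
        p.forVotes += token.getPastVotes(msg.sender, p.snapshotBlock);
    }
}
```

Tokens acquired after the snapshot, including flash-borrowed ones, contribute zero weight, which is the property the mitigation relies on.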

ERC-20 quirks

| Quirk | Token | Failure mode | Fix |
| --- | --- | --- | --- |
| No return value | USDT | A transfer() call made through an interface declaring returns (bool) reverts, because USDT returns no data | Use SafeERC20 / safeTransfer |
| Fee-on-transfer | Some meme tokens, PAXG | Received amount < sent amount | Measure balance before/after; reject or handle |
| Rebasing | stETH, AMPL | Balance changes without transfer | Use wrapped variant (wstETH); record shares |
| Blocklist / pausable | USDC, USDT | Transfer reverts if sender/receiver is blocked | Have a withdrawal recovery path |
| Approve race | Old ERC-20 | Spender front-runs a change to an existing approval | Use increaseAllowance / Permit2 |
| Permit | EIP-2612 tokens | Front-runnable permit signatures grief approvals | Use try/catch around permit |
| Decimals weirdness | USDC (6), WBTC (8) | Hardcoded 18 decimals → 1e12 errors | Read decimals() or use ORACLE_SCALE |
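
The first two rows can be handled together at the point where tokens enter the protocol. The sketch below assumes OpenZeppelin Contracts is installed for `IERC20` and `SafeERC20`; `TokenIntakeSketch`, `pull`, and the error name are hypothetical, and rejecting fee-on-transfer tokens outright is just one of the two options the table mentions.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {IERC20} from "@openzeppelin/contracts/token/ERC20/IERC20.sol";
import {SafeERC20} from "@openzeppelin/contracts/token/ERC20/utils/SafeERC20.sol";

/// Hypothetical intake helper: tolerates missing return values, rejects transfer fees.
library TokenIntakeSketch {
    using SafeERC20 for IERC20;

    error FeeOnTransferNotSupported(uint256 expected, uint256 received);

    /// Pulls `amount` from `from` into the calling contract and verifies the full amount arrived.
    function pull(IERC20 token, address from, uint256 amount) internal returns (uint256 received) {
        uint256 balanceBefore = token.balanceOf(address(this));
        token.safeTransferFrom(from, address(this), amount); // works for USDT-style tokens
        received = token.balanceOf(address(this)) - balanceBefore;
        if (received != amount) revert FeeOnTransferNotSupported(amount, received);
    }
}
```

A protocol that wants to support fee-on-transfer tokens would instead credit `received` rather than `amount` and drop the revert.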

A pre-merge checklist

Read every PR through this filter
  • Does every external entry call accrueInterest first?
  • Does every share/asset conversion round in protocol favor?
  • Does the health check happen after any callback?
  • Are oracle reads guarded by staleness + positivity?
  • Are all external transfers wrapped in SafeERC20?
  • Are state mutations done before external calls (CEI)?
  • Are events emitted for every state change (for indexers / audit trail)?
  • Do custom errors include the offending values (for debugging)?
  • Do new functions live behind the same authorization gates as siblings?
  • Are there fuzz / invariant tests covering the new state transitions?
  • Was the storage layout (forge inspect <Contract> storage-layout) diffed against main?
  • Was the gas snapshot (forge snapshot) diffed against main?
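
Several checklist items (accrue first, effects before interactions, health check last, errors carrying values, events) can be seen in one skeleton. This is a sketch with hypothetical names throughout: `MarketSkeleton` and its internal hooks stand in for whatever the real market implements, and only the ordering is the point.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical skeleton showing the ordering the checklist asks for.
abstract contract MarketSkeleton {
    /// Custom error carries the offending values for debugging.
    error Unhealthy(address borrower, uint256 debtValue, uint256 maxDebtValue);

    event Borrow(address indexed borrower, uint256 assets);

    // Hooks a concrete market would implement (all assumed for illustration).
    function _accrueInterest() internal virtual;
    function _recordBorrow(address borrower, uint256 assets) internal virtual;
    function _send(address to, uint256 assets) internal virtual;
    function _debtAndLimit(address borrower)
        internal
        view
        virtual
        returns (uint256 debtValue, uint256 maxDebtValue);

    function borrow(uint256 assets) external {
        _accrueInterest();                  // 1. bring indexes up to date first
        _recordBorrow(msg.sender, assets);  // 2. effects: state mutated before any external call
        _send(msg.sender, assets);          // 3. interaction: token transfer or callback
        (uint256 debtValue, uint256 maxDebtValue) = _debtAndLimit(msg.sender);
        if (debtValue > maxDebtValue) {     // 4. health check after the external call
            revert Unhealthy(msg.sender, debtValue, maxDebtValue);
        }
        emit Borrow(msg.sender, assets);    // 5. event for indexers and audit trail
    }
}
```

Reviewing a new entry point against this shape makes a skipped step, such as a branch that returns before the health check, easier to spot.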