Data & Pipelines
On-chain data flows, indexers, off-chain monitoring, and where the protocol stops and the platform around it begins.
On-chain data flows
Protocol operators consume on-chain data through three primary channels:
- Events. Emitted by contracts; cheap to read; queryable via JSON-RPC eth_getLogs or via a subgraph / indexer (see the sketch after this list).
- State reads. eth_call against the latest (or a pinned) block; expensive at scale.
- Trace reconstruction. Re-execute past blocks for fine-grained, call-level data; very expensive, and requires a full archive node.
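A minimal sketch of the two cheap channels, using ethers v6. The address, the ABI fragment, and the block range are placeholders, not a real deployment:

// Sketch: events vs. state reads, with ethers v6.
import { ethers } from "ethers";

const provider = new ethers.JsonRpcProvider(process.env.RPC_URL);
const PROTOCOL_ADDRESS = "0x0000000000000000000000000000000000000000"; // placeholder

// Channel 1: events via eth_getLogs. Cheap; paginate by block range.
const supplyTopic = ethers.id("Supply(bytes32,address,address,uint256,uint256)");
const logs = await provider.getLogs({
  address: PROTOCOL_ADDRESS,
  topics: [supplyTopic],
  fromBlock: 19_000_000, // placeholder range
  toBlock: 19_000_100,
});

// Channel 2: a state read via eth_call. One round-trip per value read,
// which is why reading storage at every block does not scale.
const abi = ["function totalSupplyAssets(bytes32 id) view returns (uint256)"]; // hypothetical getter
const protocol = new ethers.Contract(PROTOCOL_ADDRESS, abi, provider);
const total = await protocol.totalSupplyAssets(logs[0].topics[1]);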
Senior reflex: every state change in your protocol must emit an event with all relevant fields. A LOG costs 375 gas base, plus 375 gas per indexed topic and 8 gas per byte of data. Skip events and your indexer has to read storage at every block, which is orders of magnitude worse.
event Supply(
Id indexed id,
address indexed caller,
address indexed onBehalf,
uint256 assets,
uint256 shares
);
// Indexed parameters become topics: filterable by exact match, at most 3 per event.
// Non-indexed parameters are ABI-encoded into the data field.
emit Supply(id, msg.sender, onBehalf, assets, shares);
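Off-chain, that topics/data split is exactly what a consumer decodes. A sketch with ethers v6, assuming Id is a bytes32-backed user-defined value type (so the ABI sees bytes32):

// Sketch: decoding a raw Supply log off-chain.
import { ethers } from "ethers";

declare const log: ethers.Log; // a raw log from eth_getLogs, e.g. the sketch above

const iface = new ethers.Interface([
  "event Supply(bytes32 indexed id, address indexed caller, address indexed onBehalf, uint256 assets, uint256 shares)",
]);

// topics[0] is the event-signature hash; topics[1..3] carry the indexed fields.
// The data field holds the ABI-encoded non-indexed fields (assets, shares).
const parsed = iface.parseLog({ topics: [...log.topics], data: log.data });
console.log(parsed?.args.onBehalf, parsed?.args.assets, parsed?.args.shares);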
Subgraph design
A subgraph (The Graph) is a hosted, schema-driven indexer. You write three things: schema, manifest (which contracts and events to track), and mappings (AssemblyScript handlers that translate events to entities).
# schema.graphql
type Market @entity {
id: Bytes! # market id
loanToken: Bytes!
collateralToken: Bytes!
oracle: Bytes!
irm: Bytes!
lltv: BigInt!
totalSupplyAssets: BigInt!
totalBorrowAssets: BigInt!
positions: [Position!]! @derivedFrom(field: "market")
}
type Position @entity {
id: ID! # market.id + "-" + user
market: Market!
user: Bytes!
supplyShares: BigInt!
borrowShares: BigInt!
collateral: BigInt!
}
type Liquidation @entity(immutable: true) {
id: Bytes!
market: Market!
borrower: Bytes!
liquidator: Bytes!
seizedCollateral: BigInt!
repaidAssets: BigInt!
timestamp: BigInt!
}
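The third piece, the manifest, wires the schema and mappings to a contract. A minimal sketch; the address, start block, and file paths are placeholders for a real deployment:

# subgraph.yaml
specVersion: 0.0.5
schema:
  file: ./schema.graphql
dataSources:
  - kind: ethereum
    name: Morpho
    network: mainnet
    source:
      address: "0x0000000000000000000000000000000000000000" # placeholder
      abi: Morpho
      startBlock: 18000000 # placeholder
    mapping:
      kind: ethereum/events
      apiVersion: 0.0.7
      language: wasm/assemblyscript
      file: ./src/mapping.ts
      entities: [Market, Position, Liquidation]
      abis:
        - name: Morpho
          file: ./abis/Morpho.json
      eventHandlers:
        - event: Supply(indexed bytes32,indexed address,indexed address,uint256,uint256)
          handler: handleSupply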
// src/mapping.ts (AssemblyScript)
import { BigInt } from "@graphprotocol/graph-ts";
// Generated bindings; paths follow the dataSource name in the manifest.
import { Supply as SupplyEvent } from "../generated/Morpho/Morpho";
import { Market, Position } from "../generated/schema";

export function handleSupply(event: SupplyEvent): void {
let market = Market.load(event.params.id);
if (market == null) return;
market.totalSupplyAssets = market.totalSupplyAssets.plus(event.params.assets);
market.save();
let posId = event.params.id.toHexString() + "-" + event.params.onBehalf.toHexString();
let pos = Position.load(posId);
if (pos == null) {
pos = new Position(posId);
pos.market = event.params.id;
pos.user = event.params.onBehalf;
pos.supplyShares = BigInt.zero();
pos.borrowShares = BigInt.zero();
pos.collateral = BigInt.zero();
}
pos.supplyShares = pos.supplyShares.plus(event.params.shares);
pos.save();
}
Things senior engineers think about:
- Derived fields (@derivedFrom) save storage but cost query time. Use them for one-to-many relationships (see the query sketch after this list).
- Immutable entities (@entity(immutable: true)) are cheaper to write; use them for append-only data like Liquidations and Trades.
- BigInt / BigDecimal are unavoidable; never use number for token values.
- Reorg handling is built in for subgraphs, but write idempotent handlers (don't add to derived fields that the subgraph already maintains).
- Indexing performance. Avoid contract view calls inside mappings if possible; they slow indexing dramatically. Prefer emitting all needed data in events.
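The query sketch referenced above: walking the @derivedFrom edge from markets into positions. Field names come from the schema earlier; pagination values are arbitrary.

# Top markets by supply, with their open borrow positions.
# The derived positions edge is resolved at query time, not stored.
{
  markets(first: 5, orderBy: totalSupplyAssets, orderDirection: desc) {
    id
    totalSupplyAssets
    totalBorrowAssets
    positions(first: 10, where: { borrowShares_gt: "0" }) {
      user
      borrowShares
      collateral
    }
  }
}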
Indexers — the modern landscape
| Tool | Shape | Strength | Weakness |
|---|---|---|---|
| The Graph (subgraph) | Hosted, AssemblyScript mappings | Battle-tested; large ecosystem | Slower indexing; cost on hosted plan |
| Goldsky | Hosted; subgraph + custom pipelines | Fast indexing; mirror to Postgres/S3 | Vendor lock-in for advanced features |
| Envio | TypeScript/ReScript handlers; fast | Sub-second indexing; multi-chain native | Newer; smaller community |
| Ponder | TypeScript-first, dev-friendly | Modern DX; type-safe schema | Self-host or hosted |
| Custom indexer | Roll your own from eth_getLogs | Full control | Reorg handling, scaling, retries are your problem |
Default to a hosted subgraph (The Graph or Goldsky) for protocol-wide analytics. Reach for Envio / Ponder when you need sub-second freshness (liquidation alerting). Build custom only when your data needs are exotic (re-traces, specific call-level data).
Off-chain monitoring
The on-call rotation needs eyes on the protocol at all times. Real-time monitoring tools:
| Tool | What it does |
|---|---|
| Tenderly | Tx simulation, alerts on events / function calls, mempool monitoring, debug traces |
| OpenZeppelin Defender | Sentinels (alerts), Autotasks (scheduled scripts), Relayers (signed-tx submission) |
| Phalcon (BlockSec) | Tx replay, security monitoring, exploit detection |
| Forta | Distributed detection bots; community + custom |
| Dune / Flipside | SQL on on-chain data; ad-hoc analytics dashboards |
| Custom (Subgraph + Slack) | Webhook on subgraph events; cheap and reliable |
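The last row is often the highest-leverage one. A minimal sketch, assuming the Liquidation entity from earlier, Node 18+ (global fetch), and placeholder URLs and threshold:

// Sketch: poll the subgraph for new liquidations, page Slack on big ones.
const SUBGRAPH_URL = process.env.SUBGRAPH_URL!;   // placeholder
const SLACK_WEBHOOK = process.env.SLACK_WEBHOOK!; // placeholder
const THRESHOLD = 10n ** 23n; // "$X notional" stand-in: 100k tokens at 18 decimals

let lastSeen = Math.floor(Date.now() / 1000);

async function poll(): Promise<void> {
  const query = `{
    liquidations(where: { timestamp_gt: ${lastSeen} }, orderBy: timestamp) {
      id borrower repaidAssets timestamp
    }
  }`;
  const res = await fetch(SUBGRAPH_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query }),
  });
  const { data } = await res.json();
  for (const liq of data.liquidations) {
    lastSeen = Math.max(lastSeen, Number(liq.timestamp));
    if (BigInt(liq.repaidAssets) > THRESHOLD) {
      await fetch(SLACK_WEBHOOK, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ text: `Large liquidation ${liq.id}, borrower ${liq.borrower}` }),
      });
    }
  }
}

setInterval(poll, 15_000); // subgraphs lag 1-3 blocks anyway; 15s polling is plenty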
Typical alerts for a lending protocol:
- Any liquidation event > $X notional → page on-call.
- Any market crossing 95% utilization → page risk.
- Oracle staleness > threshold → page protocol.
- Any pause-modifier triggered → page everyone.
- Any guardian-key signing event → notify security.
- Any deviation between subgraph state and on-chain state > epsilon → page indexer ops.
Position-health monitoring
For liquidations, the protocol team typically runs (or relies on) services that:
- Index every position and its current health factor.
- Subscribe to oracle price updates.
- For each price update, recompute the set of "now liquidatable" positions.
- Submit liquidation transactions, optionally via a flash loan.
// Pseudocode for a liquidation bot loop
async function loop() {
  oracleClient.on("PriceUpdate", async (market, newPrice) => {
    // Re-fetch on every price update; a one-time snapshot goes stale
    // as positions are opened, closed, and liquidated.
    const positions = await subgraph.getOpenPositions();
    const candidates = positions
      .filter(p => p.market === market)
      .filter(p => healthFactor(p, newPrice) < 1.0)
      .sort((a, b) => expectedProfit(b, newPrice) - expectedProfit(a, newPrice));
    for (const p of candidates) {
      const seize = optimalSeize(p, newPrice);
      const tx = await liquidator.populateTransaction.liquidate(p.borrower, seize, p.market);
      await flashbots.sendBundle([tx]); // private mempool; invisible to competing searchers
    }
  });
}
Most protocol teams do not run the liquidator bots themselves — third-party MEV searchers do that. But the protocol team should monitor whether liquidations are happening promptly. If positions remain underwater for blocks without being liquidated, the liquidation incentive may be too low or the collateral too illiquid.
Oracle & liquidation monitoring
Specific dashboards the protocol team checks daily:
- Oracle freshness. Time since last update per feed, compared against its heartbeat. Heat-mapped.
- Oracle deviation. Price delta between the primary feed and a cross-check (e.g., TWAP, secondary feed).
- L2 sequencer status. On Arbitrum, Optimism, Base: uptime via the Chainlink sequencer-uptime feed (see the sketch after this list).
- Liquidation latency. Time between a position becoming unhealthy (oracle update) and the first liquidation tx landing.
- Bad-debt watermark. Cumulative bad debt socialized across markets. Should be near-zero.
- Utilization heat-map. Each market's utilization, color-coded.
- Per-market PnL. Treasury fee accrual minus losses.
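Two of those checks in code. A sketch with ethers v6 and the standard Chainlink AggregatorV3Interface; feed addresses, heartbeats, and the grace period are per-deployment placeholders:

// Sketch: oracle freshness + L2 sequencer status via Chainlink feeds.
import { ethers } from "ethers";

const AGG_ABI = [
  "function latestRoundData() view returns (uint80 roundId, int256 answer, uint256 startedAt, uint256 updatedAt, uint80 answeredInRound)",
];
const provider = new ethers.JsonRpcProvider(process.env.RPC_URL);

// Freshness: stale if the feed's heartbeat window has elapsed.
async function feedIsFresh(feedAddr: string, heartbeatSec: number): Promise<boolean> {
  const feed = new ethers.Contract(feedAddr, AGG_ABI, provider);
  const { updatedAt } = await feed.latestRoundData();
  return Math.floor(Date.now() / 1000) - Number(updatedAt) <= heartbeatSec;
}

// Sequencer status: answer is 0 when the sequencer is up. Chainlink's docs
// recommend a grace period after recovery before trusting prices again.
async function sequencerOk(uptimeFeedAddr: string, graceSec = 3600): Promise<boolean> {
  const feed = new ethers.Contract(uptimeFeedAddr, AGG_ABI, provider);
  const { answer, startedAt } = await feed.latestRoundData();
  const upFor = Math.floor(Date.now() / 1000) - Number(startedAt);
  return answer === 0n && upFor > graceSec;
}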
Where the off-chain stack meets on-chain reality
A few subtleties the senior engineer carries in their head:
- Subgraph lag. Subgraphs are typically 1-3 blocks behind tip. For UI display this is fine; for liquidator decisions it is not. Use direct RPC reads for time-sensitive decisions.
- Reorgs. An event you saw 2 blocks ago might be re-orged out. Subgraphs handle this transparently but custom indexers must wait for finality (often 12+ blocks on L1) before acting irreversibly.
- Cross-chain consistency. A multi-chain protocol's state is N blockchains' states, asynchronous by design. Aggregate dashboards must clearly attribute per-chain.
- RPC node trust. A compromised RPC provider can serve stale or fake data. For critical systems, query multiple providers and require quorum (sketch after this list).
- Tx submission. Public mempool is a goldfish bowl — every searcher sees your bundle. Private relays (Flashbots, MEV-Share, Beaverbuild) cost nothing extra and protect against frontrunning.
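The quorum read from the list above, sketched with ethers v6; provider URLs and thresholds are placeholders:

// Sketch: require multiple RPC providers to agree before acting.
import { ethers } from "ethers";

const RPC_URLS = [process.env.RPC_A!, process.env.RPC_B!, process.env.RPC_C!];

async function quorumBlockNumber(minAnswers = 2, maxSpread = 2): Promise<number> {
  const results = await Promise.allSettled(
    RPC_URLS.map(u => new ethers.JsonRpcProvider(u).getBlockNumber()),
  );
  const heights = results
    .filter((r): r is PromiseFulfilledResult<number> => r.status === "fulfilled")
    .map(r => r.value);
  if (heights.length < minAnswers) throw new Error("quorum not reached");
  // Providers can legitimately differ by a block at the tip; beyond that,
  // someone is lagging badly or lying.
  const spread = Math.max(...heights) - Math.min(...heights);
  if (spread > maxSpread) throw new Error(`providers disagree by ${spread} blocks`);
  return Math.min(...heights); // act on the most conservative view
}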
If asked "how would you monitor your protocol?" the senior answer has three layers: (1) events flow into an indexer, (2) dashboards/alerts on top of that indexer, (3) parallel direct RPC reads for time-critical checks. Acknowledge that the indexer can lag, and have a fallback. That answer is right for ~90% of protocols.