Section D · Production

Data & Pipelines

On-chain data is half the user-visible product. This section covers the subgraphs, indexers, event schemas, and monitoring stack that a senior DEX engineer designs alongside the contracts themselves.

Why smart-contract engineers care about data

Three reasons:

  1. Event design is irreversible. Once shipped on immutable core, events are forever. Bad event design means bad indexing means bad UX.
  2. You will be on call for monitoring incidents. Knowing the data pipeline matters when something looks wrong at 3am.
  3. Front-ends and aggregators consume your event schema. Backwards-incompatible event changes break the ecosystem.

Subgraph design

The Graph is the canonical indexing layer. A subgraph defines entities (Pool, Position, Swap, etc.) and event handlers that mutate them.

Entity shape for a typical DEX subgraph:

# schema.graphql
type Factory @entity {
  id: ID!
  poolCount: BigInt!
  totalVolumeUSD: BigDecimal!
}

type Pool @entity {
  id: ID!                           # pool address
  token0: Token!
  token1: Token!
  feeTier: BigInt!
  liquidity: BigInt!
  sqrtPrice: BigInt!
  tick: BigInt
  volumeUSD: BigDecimal!
  totalValueLockedUSD: BigDecimal!
  feesUSD: BigDecimal!
}

type Position @entity {
  id: ID!                           # tokenId or composite
  owner: Bytes!
  pool: Pool!
  tickLower: BigInt!
  tickUpper: BigInt!
  liquidity: BigInt!
  collectedFeesToken0: BigDecimal!
  collectedFeesToken1: BigDecimal!
}

type Swap @entity {
  id: ID!                           # tx hash + log index
  pool: Pool!
  sender: Bytes!
  recipient: Bytes!
  amount0: BigDecimal!
  amount1: BigDecimal!
  amountUSD: BigDecimal!
  sqrtPriceX96: BigInt!
  tick: BigInt!
  timestamp: BigInt!
}
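
The handlers that keep these entities current are small state folds over events. A Python sketch of the Swap-handler logic (real mappings are AssemblyScript; the `store`, `event`, and `usd_price` shapes here are illustrative stand-ins for the indexer runtime):

```python
# Sketch of a Swap-event handler over in-memory entity tables. Real
# subgraph mappings are AssemblyScript; `store`, `event`, and `usd_price`
# are illustrative stand-ins for the indexer's runtime.

def handle_swap(event: dict, store: dict, usd_price: float) -> str:
    """Mutate Pool and Factory, then record an immutable Swap entity."""
    pool = store["Pool"][event["address"]]

    # Copy POST-swap state straight from the event: no storage reads.
    pool["liquidity"] = event["liquidity"]
    pool["sqrtPrice"] = event["sqrtPriceX96"]
    pool["tick"] = event["tick"]

    # Denormalize USD at write time so dashboards never compute at read time.
    amount_usd = abs(event["amount0"]) * usd_price
    pool["volumeUSD"] += amount_usd
    store["Factory"]["1"]["totalVolumeUSD"] += amount_usd

    # Deterministic ID: tx hash + log index.
    swap_id = f"{event['txHash']}-{event['logIndex']}"
    store["Swap"][swap_id] = {
        "id": swap_id,
        "pool": pool["id"],
        "amount0": event["amount0"],
        "amount1": event["amount1"],
        "amountUSD": amount_usd,
        "timestamp": event["timestamp"],
    }
    return swap_id
```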

Design rules:

  • Entity IDs must be deterministic. Don't use sequential counters — they break parallel indexers. Use addresses, txhash+logIndex, or composite keys.
  • Denormalize for query speed. Store USD values at write time; don't compute at read time.
  • Snapshot at intervals. Hourly and daily aggregates as separate entities. Don't query 1M swaps to compute volume.
  • Handle reorgs. The Graph does this for you up to a depth; don't make state changes the subgraph can't unwind.
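
The first rule is worth making concrete. A sketch of the two common deterministic key shapes (helper names are mine):

```python
def swap_id(tx_hash: str, log_index: int) -> str:
    # txhash + logIndex: unique per log, stable under re-indexing,
    # and independent of the order in which blocks are processed.
    return f"{tx_hash}-{log_index}"

def position_id(pool: str, owner: str, tick_lower: int, tick_upper: int) -> str:
    # Composite key for positions without NFT tokenIds (e.g. raw v3 core):
    # the tuple that uniquely identifies a position slot.
    return f"{pool}-{owner}-{tick_lower}-{tick_upper}"
```

A sequential counter would assign a different ID to the same swap depending on where indexing started; these keys do not.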

Event design at the core

Every state change on a core contract should emit an event. The schema should:

  1. Index the searchable fields. Put indexed on user addresses, token addresses, and pool IDs.
  2. Pack the rest as data. Up to 3 indexed args; the rest goes in data.
  3. Carry derived values when cheap. Emit the new sqrtPriceX96 and tick on every swap so indexers don't have to recompute.
  4. Be backwards-compatible across upgrades. Once shipped, don't change. Add new events; don't modify old ones.

// Canonical v3 swap event — note the careful index choice and the post-swap state
event Swap(
    address indexed sender,
    address indexed recipient,
    int256  amount0,
    int256  amount1,
    uint160 sqrtPriceX96,    // POST-swap
    uint128 liquidity,       // POST-swap
    int24   tick             // POST-swap
);

Why post-swap state? Because a subgraph that consumes this event in order can reconstruct the entire state of the pool from genesis without ever reading storage.
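
That claim is easy to demonstrate: fold ordered Swap events, and the last one seen is the pool's current state. A Python sketch (field names follow the event above):

```python
def replay(events: list[dict]) -> tuple[dict, int]:
    """Fold ordered Swap events into the pool's latest state.

    Because each event carries POST-swap values, the last event seen
    IS the current state: no eth_call, no storage reads.
    """
    state = {"sqrtPriceX96": None, "liquidity": None, "tick": None}
    volume0 = 0
    for ev in events:  # must be ordered by (block, logIndex)
        state["sqrtPriceX96"] = ev["sqrtPriceX96"]
        state["liquidity"] = ev["liquidity"]
        state["tick"] = ev["tick"]
        volume0 += abs(ev["amount0"])  # volume falls out of the same pass
    return state, volume0
```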

Volume / TVL / fee accounting

The three numbers every DEX dashboard shows:

  • Volume: sum of abs(amount0) (or amount1) across Swap events, converted to USD; aggregated hourly/daily.
  • TVL: sum of reserves across all pools × USD price per token. v3 needs LP positions aggregated; v2 just reads token.balanceOf(pool).
  • Fees: volume × fee tier, minus the protocol fee. Tracked per pool.
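
The fee line is plain arithmetic once volume is known. A sketch, with an illustrative 0.30% pool and a 1/4 protocol cut:

```python
def pool_fees_usd(volume_usd: float, fee_tier_bps: int,
                  protocol_cut: float = 0.0) -> tuple[float, float]:
    """Split pool fees into (LP fees, protocol fees).

    fee_tier_bps: fee in basis points (30 = a 0.30% pool).
    protocol_cut: fraction of fees diverted to the protocol (0 if switched off).
    """
    total = volume_usd * fee_tier_bps / 10_000
    protocol = total * protocol_cut
    return total - protocol, protocol

# $10M of volume on a 0.30% pool with a 1/4 protocol cut:
# total fees $30,000 -> $22,500 to LPs, $7,500 to the treasury.
lp_fees, protocol_fees = pool_fees_usd(10_000_000, 30, 0.25)
```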

Pricing tokens is its own problem. Strategies:

  • Find a path to USDC/USDT/DAI on the same chain; quote through that.
  • Use a TWAP from the highest-TVL pool involving the token.
  • Whitelist a stable set; price everything by routes to that set.
  • For long-tail tokens — accept that USD valuation is fuzzy.

Off-chain monitoring

What you actually watch in prod:

  • Position liquidity drift. If a known whale's position changes outside expected windows, alert.
  • Abnormal slippage events. A swap that consumed 10× the expected slippage suggests a thin pool or an attack.
  • Fee accumulation health. Fees should grow roughly with volume. A divergence means math is broken or an integrator is gaming.
  • Protocol-fee invariants. Treasury accruals match expected % of volume.
  • Oracle staleness. Last TWAP update vs current time. If observations stop, the oracle has frozen.
  • Hook failures (v4). Reverts inside hooks have downstream effects — alert.
  • L1 ↔ L2 deployment parity. Bytecode and selector tables should match across chains for the canonical deployments.
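
Several of these alerts reduce to threshold checks over indexed data. A sketch of two of them (thresholds are illustrative, not recommendations):

```python
def oracle_stale(last_observation_ts: int, now_ts: int,
                 max_age_s: int = 3600) -> bool:
    """Alert when the pool's last TWAP observation is older than max_age_s."""
    return (now_ts - last_observation_ts) > max_age_s

def fee_health_alert(fees_usd: float, volume_usd: float, fee_tier_bps: int,
                     tolerance: float = 0.10) -> bool:
    """Fees should track volume x fee tier; alert on divergence beyond tolerance."""
    if volume_usd == 0:
        return fees_usd != 0  # fees with zero volume is itself anomalous
    expected = volume_usd * fee_tier_bps / 10_000
    return abs(fees_usd - expected) / expected > tolerance
```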

Tools: Tenderly Alerts, OpenZeppelin Defender, Hypernative, Forta, plus in-house Prometheus scrapers fed by a custom indexer.

Aggregator integration

Aggregators consume your contracts. They expect:

  • Reliable quoter contracts. Off-chain pricing requires a view function that returns the exact swap result without execution.
  • Stable function signatures. The aggregator's integration breaks the day you ship a new selector.
  • Callback-friendly interfaces. Most aggregators call core directly via callbacks; periphery is bypassed.
  • Subgraph or REST availability. Aggregators use it to list and rank your pools.
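
Conceptually, a quoter is a pure function from pool state and input amount to output amount. For a v2-style constant-product pool that function fits in a few lines (a sketch; real quoters such as Uniswap's QuoterV2 simulate the full swap path rather than closed-form math):

```python
def quote_exact_input(reserve_in: int, reserve_out: int, amount_in: int,
                      fee_bps: int = 30) -> int:
    """Constant-product quote: output for a given input, fee deducted.

    Mirrors the x*y=k formula a v2-style pool executes on-chain, so an
    aggregator can price the hop off-chain without sending a transaction.
    """
    amount_in_with_fee = amount_in * (10_000 - fee_bps)
    numerator = amount_in_with_fee * reserve_out
    denominator = reserve_in * 10_000 + amount_in_with_fee
    return numerator // denominator
```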

How the major aggregators integrate:

  • 1inch: Pathfinder; off-chain routing, on-chain settler; calls pools directly.
  • 0x: RFQ + AMM hybrid; settler contract per chain.
  • ParaSwap: adapters per DEX type; routes can split across many pools.
  • KyberSwap: meta-aggregator with its own AMM pools as fallback.
  • CoW Swap: batch auctions; solvers compete; settles via its settlement contract.
  • Odos: SOR (smart order router); split-path optimization.
  • LiFi / Socket / Squid: cross-chain DEX aggregators; consume your deployments per chain.

Dune / Flipside / on-chain SQL

Mature engineers can write basic Dune queries to investigate incidents. Example: find the top 10 swappers in the last 24h:

-- Dune SQL (Trino dialect). The curated dex.trades table already
-- carries a per-trade amount_usd, so no join against raw event logs.
SELECT
  taker AS swapper,
  SUM(amount_usd) AS total_volume_usd
FROM dex.trades
WHERE blockchain = 'ethereum'
  AND project = 'uniswap'
  AND block_time >= now() - interval '24' hour
GROUP BY 1
ORDER BY 2 DESC
LIMIT 10

You're not expected to be a data engineer. But knowing how to pull a quick sanity check from chain data is a senior signal.

Observability checklist

For a new core deployment, ensure all of:

  • Subgraph spec drafted alongside contract spec.
  • Events designed before the first PR.
  • Indexer running on testnet before mainnet deploy.
  • Volume + TVL dashboards available at launch.
  • Alert rules wired to PagerDuty / Slack.
  • Anomaly thresholds set conservatively for the first month.
  • Runbooks written for the top 5 expected alert types.

Senior signal

When asked "how would you launch a new core release," a complete answer includes the data and monitoring pipeline, not just the contracts. Most candidates omit it.