Section A · Orient · Read first

Start Here

Interview prep for senior/staff Data Scientist loops where the work is shipping production models end-to-end — calibrated against two specific roles, one in fraud/identity and one in multimodal sensor AI.

The two roles, in plain English

This guide is calibrated against two open Data Scientist roles that live on the same spectrum but at different points:

  • Staff Data Scientist, Full Stack at a fraud / identity-verification platform (senior, 6+ yrs with a PhD or 8+ yrs with a Masters). Build fraud models end-to-end: data acquisition → featurization → labeling → training → experimentation → production → monitoring. Python 3, PostgreSQL, AWS (EC2, S3, RDS, Redshift). The JD explicitly says "unusual insights drive competitive advantage rather than optimization of new machine learning methodologies" — meaning: deep domain knowledge and inventive features beat sklearn novelty.
  • Data Scientist at a multimodal-sensor AI company (San Mateo, on-site, 4+ yrs). The product is a multimodal sensor-fusion platform — "Newton," a real-time multimodal LLM for physical-world AI. The role sits in GTM / Solutions Engineering: review customer assets (video, sensor streams), prepare time-series datasets, design prompts and n-shot examples, configure lens parameters, support Solutions Architects through customer evaluations. Python, Jupyter, signal processing, Encord-style labeling.

The Staff-DS role is "staff IC who owns a model domain end-to-end." The Solutions-Engineering DS role is "mid-level applied scientist who turns customer data into a working POC." Different seniorities, both full-stack applied: both expect you to go from raw data to a shipped artifact without a separate ML engineer to hand off to.

The convergence

Neither role rewards ML novelty for its own sake. The fraud JD says "deep domain understanding drives development"; the Solutions-Engineering JD asks for "iterative preprocessing cycles with evaluation and refinement". Both want someone who can be dropped on a messy data problem and produce a defensible, testable artifact without waiting for direction.

What the rounds typically test

Loops for full-stack / applied DS roles usually include:

  1. ML fundamentals — supervised models for tabular data, evaluation metrics (especially under class imbalance), calibration, model selection. Expect "given these tradeoffs, which model?" framing, not whiteboarding gradient descent (a minimal evaluation sketch follows this list).
  2. Feature engineering deep-dive — almost guaranteed at the fraud/identity company. Often a take-home or live session: "here's a fraud-ish dataset, what features do you build and why?" Domain-flavored at every step.
  3. Production / code quality — write production-ready Python with tests. The Staff-DS JD explicitly says "production code that can be relied on for real-time decision making." The Solutions-Engineering JD implies it via "executable iterative preprocessing cycles."
  4. Applied design — "Customer brings X video stream / fraud signal — walk us through how you'd build a model / prompt / pipeline." End-to-end ownership signal.
  5. Coding — Python data manipulation, light algorithms, occasionally a SQL screen.
  6. Behavioral — "tell me about a project you owned end-to-end," "describe a time domain knowledge changed your modeling approach."
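
The fastest differentiator in round 1 is quoting the right metric under imbalance. A minimal sketch of that evaluation habit, assuming scikit-learn, with a synthetic ~1%-positive dataset standing in for real fraud labels:

```python
# Hedged sketch: imbalance-aware evaluation of a binary classifier.
# The dataset is synthetic; swap in your real features and labels.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import average_precision_score, brier_score_loss, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=20_000, weights=[0.99], random_state=0)  # ~1% positives
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = GradientBoostingClassifier().fit(X_tr, y_tr)
p = clf.predict_proba(X_te)[:, 1]

print(f"ROC-AUC: {roc_auc_score(y_te, p):.3f}")            # ranking; flattering under imbalance
print(f"PR-AUC:  {average_precision_score(y_te, p):.3f}")  # sensitive to the rare class
print(f"Brier:   {brier_score_loss(y_te, p):.4f}")         # calibration; lower is better
```

The follow-up questions ("which number would you report, and is the score a usable probability?") are exactly what PR-AUC and the Brier score answer.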

This guide covers all six.

The folder, in reading order

The numbering follows the order you should read in. Five sections:

Section A — Orient (read first)

File · Why
01-the-roles · Decode what each role actually involves — the two role archetypes side by side
02-positioning-from-scratch · Mindset before content — leveraging adjacent experience without overclaiming

Section B — Core DS concepts (the technical core)

File · Why
03-ml-fundamentals · Supervised models for tabular data; staff-level model selection and evaluation
04-feature-engineering · "Inventive feature engineering" demystified — entity features, leakage, drift, domain insight
05-fraud-and-imbalanced-data · Class imbalance, cost-sensitive learning, threshold tuning, delayed labels. Fraud-domain-flavored.
06-prompt-engineering-applied · N-shot, prompt templates, lens configuration, prompt eval. Solutions-Engineering-flavored.
07-time-series-and-signals · Signal processing, time-series features, sensor data prep, video alignment
08-data-pipelines-applied · Data acquisition, labeling workflows, iterative preprocessing, feature pipelines
09-production-ml · Writing production code, tests, real-time inference, monitoring, retraining cadence

Section C — Coding (DSA)

File · Why
10-coding-fundamentals · Python patterns for full-stack DS — testability, vectorization, generators
11-coding-problems · Drillable Python problems with feature-engineering and evaluation flavor

Section D — Production / Cloud

File · Why
12-aws-data-stack · EC2, S3, RDS, Redshift, PostgreSQL — the standard fraud stack and how to discuss it
13-mlops-applied · CI/CD for models, shadow deploys, canaries, drift monitoring, retraining

Section E — Reference & Execution

File · Why
14-domain-context · Fraud & identity and multimodal sensor AI vocabulary
15-interview-questions · ~30 Q&A drill set with hide-show answers
16-day-of · Structural moves, recovery patterns, closing statement. Reread morning of.

Study schedule

If you have 7+ days

  • Day 1: 01, 02 (orient) → 03 (ML fundamentals)
  • Day 2: 04 (features), 05 (fraud/imbalanced)
  • Day 3: 06 (prompts), 07 (time-series & signals)
  • Day 4: 08 (pipelines), 09 (production ML)
  • Day 5: 10, 11 (coding drills on a timer)
  • Day 6: 12, 13 (AWS, MLOps), 14 (domain context)
  • Day 7: Drill 15. Reread 16. Sleep.

If you have 2–3 days

01, 02, 03 (ML), 04 (features), role-relevant 05 or 06, 09 (production ML), 11 (drill 4–5 problems), 15 (drill), 16. Skim everything else.

If you have < 24 hours

01, 02, 04 (features — central to both roles), 05 or 06 (whichever applies), 09 (production ML), 15 (drill all questions), 16. Skim 03, 07, 13 headings only.

The single most important reframe

Read this twice

Neither of these companies is hiring a researcher chasing SOTA. They're hiring someone who can extract competitive advantage from data and domain understanding and ship it as production code. The fraud JD puts it plainly: "unusual insights drive our competitive advantage rather than optimization of new machine learning methodologies."

What this means tactically:

  • When you get a modeling prompt, your first move is domain questions, not algorithm questions. "What does fraud actually look like in this data? Who labels it? How fresh are the labels? What's the cost of a false positive vs false negative for the customer?" That's how you signal you'll thrive here.
  • When you describe past work, lead with features and data decisions, not models. "I noticed that X co-occurred with Y in 90% of fraud cases — built a feature for that — moved AUC from 0.82 to 0.88." That's the staff-bar version of impact in this role family.
  • When you discuss production, treat it as table stakes. Tests, monitoring, drift detection, retraining cadence — these are part of the deliverable, not the SRE's problem.
  • When you get a vague open-ended prompt ("here's a sensor stream, customer wants to detect X"), don't reach for fancy models. Reach for iteration loops — what's the smallest preprocessing + model + eval you can ship today, and how would you improve it tomorrow? (A sketch of that first loop follows this list.)
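
That smallest loop is worth having in muscle memory. Here is a hypothetical day-one version; the synthetic windows, labels, and featurize are placeholders for the customer's actual stream:

```python
# Hedged sketch: smallest preprocessing + model + eval loop for
# "detect X in a sensor stream". Everything here is a placeholder baseline.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.model_selection import TimeSeriesSplit

def featurize(window: np.ndarray) -> np.ndarray:
    """Day-one features: summary stats per window. Swap in domain features tomorrow."""
    return np.array([window.mean(), window.std(), window.min(), window.max()])

# Stand-ins for real windowed sensor data and labels.
rng = np.random.default_rng(0)
windows = rng.normal(size=(500, 64))              # (n_windows, samples_per_window)
labels = (windows.max(axis=1) > 2.5).astype(int)  # pretend "event present" label

X = np.stack([featurize(w) for w in windows])
scores = []
for tr, te in TimeSeriesSplit(n_splits=4).split(X):  # respect time order; never shuffle
    model = LogisticRegression(class_weight="balanced").fit(X[tr], labels[tr])
    scores.append(average_precision_score(labels[te], model.predict_proba(X[te])[:, 1]))
print("PR-AUC per fold:", [round(s, 2) for s in scores])
```

The point is not the model; it is that the split respects time order and the eval is imbalance-aware, so tomorrow's iteration has a baseline it can trust.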

What winning looks like

You don't need to be the deepest researcher in the loop. You need to be the candidate who:

  1. Picks the right model for the job in under 30 seconds, and can articulate why (calibration vs ranking, latency budget, label availability, drift profile).
  2. Builds features that encode domain insight — and can talk about how they discovered the insight, not just the feature.
  3. Handles class imbalance / delayed labels / drifting distributions without panicking, because they've thought about these failure modes before.
  4. Writes clean Python with tests, types, and small functions — like an engineer would, not like a notebook would (sketch after this list).
  5. Reasons about production from day one — latency, monitoring, retraining triggers, on-call.
  6. Asks the right domain questions early, and uses the answers to shape the modeling approach.
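
For item 4, "like an engineer would" is concrete: typed signatures, one behavior per function, and a test that pins the tricky edge. A small illustrative sketch, assuming pandas and a pytest-style runner; txn_velocity and the column names are hypothetical:

```python
# Hedged sketch: a typed, tested feature function in the fraud idiom.
import pandas as pd

def txn_velocity(txns: pd.DataFrame, window: str = "1h") -> pd.Series:
    """Trailing count of transactions per entity; a classic velocity feature."""
    counts = (
        txns.set_index("ts")
            .sort_index()
            .groupby("entity_id")["amount"]
            .rolling(window)
            .count()
    )
    return counts.reset_index(level=0, drop=True)  # drop entity level, keep timestamps

def test_txn_velocity_counts_within_window() -> None:
    txns = pd.DataFrame({
        "entity_id": ["a", "a", "a"],
        "ts": pd.to_datetime(["2024-01-01 00:00", "2024-01-01 00:30", "2024-01-01 02:00"]),
        "amount": [10.0, 20.0, 30.0],
    })
    out = txn_velocity(txns)
    assert out.iloc[1] == 2.0  # second txn sees itself plus the one 30 min earlier
    assert out.iloc[2] == 1.0  # third txn is alone in its trailing hour
```

A reviewer can read that in a minute, and the test documents the window semantics better than a docstring would.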

If you can do those six things on demand, you're in.