Orchestration Patterns
Five architectural patterns for routing verification traffic to one or more vendors — single, waterfall, A/B, geo-routed, and decision-engine. Each comes with trade-offs, guidance on when to pick it, and a config-shape example you can adapt.
Build an abstraction layer first — before any pattern
Whichever pattern you adopt, do not let vendor SDKs leak into your product domain. The single most reused piece of advice in this whole playbook: wrap the vendor in your own interface from day one, even if you only have one vendor.
```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Protocol

class Decision(str, Enum):
    APPROVED = "approved"
    DECLINED = "declined"
    MANUAL_REVIEW = "manual_review"
    INDETERMINATE = "indeterminate"  # rare; vendor failure / error

@dataclass(frozen=True)
class VerificationRequest:
    user_id: str
    jurisdiction: str             # ISO 3166-1 alpha-2
    tier: str                     # "lite" | "full" | "kyb"
    requested_modules: list[str]  # ["doc", "biometric", "sanctions", "pep", "kyb"]
    locale: str
    return_url: str
    idempotency_key: str

@dataclass(frozen=True)
class VerificationResult:
    decision: Decision
    confidence: float      # 0.0–1.0
    vendor: str
    vendor_reference: str  # vendor's case ID for audit
    reasons: list[str]     # vendor codes mapped to your taxonomy
    raw_response: dict     # for audit; never surfaced to UI
    latency_ms: int
    cost_cents: int        # for unit-economics tracking
    expires_at: Optional[str] = None  # for re-verification

class IDVProvider(Protocol):
    """Vendor-agnostic interface. Every vendor adapter implements this."""
    name: str
    def start(self, req: VerificationRequest) -> str: ...  # returns session_id / URL
    def status(self, session_id: str) -> VerificationResult: ...
    def webhook(self, payload: dict, signature: str) -> VerificationResult: ...
```

The abstraction layer is what makes every other pattern in this chapter cheap. With it, swapping vendor B for vendor C is a one-week engineering project; without it, every pattern below becomes a three-month rebuild.
Pattern 1 — Single vendor
One provider. Simplest. Sufficient for many companies.
Pros
- Cheapest to integrate and operate.
- Best volume tier from a single vendor — the best unit economics on paper.
- Compliance posture is simple to explain to regulators.
- One support escalation path during incidents.
Cons
- No redundancy. Vendor outage = signup outage.
- No negotiating leverage at renewal — they know you're locked in.
- No A/B comparison to benchmark performance.
- Geographic gaps in vendor's coverage become your gaps.
Pick this when
Pre-product-market-fit; single jurisdiction; volume below ~250K checks/year. Plan to revisit at 500K+.
Pattern 2 — Waterfall (cascade)
Primary vendor handles everything. On rejection or indeterminate result, secondary vendor gets a second look. Reduces false-reject rate at the cost of unit economics and complexity.
Pros
- Higher overall completion rate; recovers users vendor A would reject.
- Some redundancy: if vendor A is degraded, you have an option.
- Cleaner regulator story than parallel/A/B ("we use vendor A; we use B as backup").
Cons
- You pay vendor A and vendor B for users that fall through.
- Latency stacks: the worst case is vendor A timeout + vendor B full flow.
- Vendor B is now in your data path — full DD applies to both vendors.
- Decision provenance can confuse auditors: clearly tag which vendor's evidence the final decision rests on.
Config example
```yaml
orchestration:
  pattern: waterfall
  steps:
    - vendor: vendor_a
      stop_on: [approved]
      fallback_on: [declined, indeterminate]
      timeout_ms: 30000
    - vendor: vendor_b
      stop_on: [approved, declined]
      fallback_on: [indeterminate]
      timeout_ms: 30000
  on_terminal_indeterminate: manual_review
  audit:
    record_all_vendor_responses: true
    final_decision_provenance: "last_terminal_vendor"
```

Pick this when
You have a measurable false-reject problem (≥3%) on your primary vendor. Or you operate in a jurisdiction where one vendor doesn't cover a meaningful sub-population (e.g., asylum-seeker documents). Don't adopt this before you've measured the false-reject; you'll just pay more.
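The cascade logic in that config reduces to a short loop. A minimal sketch, with vendors stubbed as a callable returning a decision string (names match the config; the function shape is an assumption, not a prescribed API) — note the return value carries provenance, because auditors will ask which vendor's evidence the final decision rests on:

```python
STEPS = [
    {"vendor": "vendor_a", "stop_on": {"approved"},
     "fallback_on": {"declined", "indeterminate"}},
    {"vendor": "vendor_b", "stop_on": {"approved", "declined"},
     "fallback_on": {"indeterminate"}},
]

def run_waterfall(call_vendor, steps=STEPS) -> tuple[str, str]:
    """Returns (final_decision, deciding_vendor) -- provenance for audit."""
    decision, vendor = "indeterminate", "none"
    for step in steps:
        decision, vendor = call_vendor(step["vendor"]), step["vendor"]
        if decision in step["stop_on"]:
            return decision, vendor  # terminal: stop cascading
        if decision not in step["fallback_on"]:
            return decision, vendor  # unexpected state: don't cascade blindly
    # All steps fell through: on_terminal_indeterminate -> manual_review
    return "manual_review", vendor
```

Usage: `run_waterfall(lambda v: {"vendor_a": "declined", "vendor_b": "approved"}[v])` returns `("approved", "vendor_b")` — the user vendor A rejected is recovered, and the record shows vendor B made the call.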
Pattern 3 — A/B (cohort split)
Each user is assigned to a vendor based on a stable hash. You measure both in production and use the data to make a final call — or to keep both active as a benchmarking arrangement.
Pros
- Real, head-to-head measurement on your population.
- Ongoing negotiating leverage — both vendors know they're being compared.
- Risk diversification: one vendor's outage takes 50% of users, not 100%.
- Easy fallback if a vendor degrades — flip the % to 0 in seconds.
Cons
- Worst volume tier — you split commitments across two vendors.
- Inconsistent user experience: assignment is sticky, so a user keeps their cohort's vendor in any later re-verification or tier-upgrade flow.
- More engineering: two integrations, two webhooks, two on-call escalation paths.
- Regulator may ask "why two vendors" — answer should be measurement / redundancy, not "we couldn't decide."
Config example
```yaml
orchestration:
  pattern: ab
  assignment:
    key: user_id
    sticky: true
  splits:
    - vendor: vendor_a
      weight: 50
    - vendor: vendor_b
      weight: 50
  emergency_override:
    enabled: true
    fallback_vendor: vendor_a
    trigger:
      vendor_b_error_rate_5m: ">= 0.05"
```

Pick this when
You're at a scale where 5–10% volume per arm produces stat-sig results in a 4-week window (~50K+ checks per arm). You want measurement-driven choice or ongoing pricing leverage from two real options.
To detect a 1-point completion-rate difference (e.g., 86% vs 87%) with p=0.05 and 80% power, you need ~25K verifications per arm. For a 0.5-point false-reject difference (e.g., 2.0% vs 2.5%), it's closer to ~50K per arm. Plan A/B duration around these, not vibes.
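The sticky assignment the config describes can be sketched with a stable hash. One assumption worth calling out: use a deterministic digest like MD5 rather than Python's built-in `hash()`, which is salted per process and would reshuffle cohorts on every deploy:

```python
import hashlib

SPLITS = [("vendor_a", 50), ("vendor_b", 50)]  # weights sum to 100

def assign_vendor(user_id: str, splits=SPLITS) -> str:
    # Deterministic bucket in [0, 100): same user_id -> same bucket, forever.
    bucket = int(hashlib.md5(user_id.encode()).hexdigest(), 16) % 100
    cumulative = 0
    for vendor, weight in splits:
        cumulative += weight
        if bucket < cumulative:
            return vendor
    return splits[-1][0]  # defensive: only reached if weights sum < 100
```

Flipping the emergency override is then just swapping `SPLITS` to `[("vendor_a", 100), ("vendor_b", 0)]` — existing sticky assignments for in-flight sessions aside, all new traffic moves in one config change.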
Pattern 4 — Geo-routing
Different vendors for different countries. Common when no single vendor wins on coverage everywhere — typically with an EM-focused vendor in LATAM / MENA / CIS and an incumbent in NA / EU.
Pros
- Best per-country coverage and completion rates.
- Can negotiate vendor-specific deals (volume tier + market focus).
- Aligns with regulator preferences in regulated jurisdictions.
Cons
- Operational complexity: every new vendor is a new full DD + integration.
- You may end up with 4+ vendors, which is painful at audit time.
- Edge cases at borders (a German user signing up from a US IP).
- Aggregate reporting requires normalization across vendors.
Config example
```json
{
  "orchestration": {
    "pattern": "geo_routing",
    "rules": [
      { "match": { "country": ["DE", "AT", "CH"] }, "vendor": "vendor_idnow" },
      { "match": { "country": ["MX", "BR", "CO", "AR"] }, "vendor": "vendor_incode" },
      { "match": { "country": ["IN"] }, "vendor": "vendor_sumsub" },
      { "match": { "country": ["US", "CA"] }, "vendor": "vendor_persona" },
      { "match": { "country": "*" }, "vendor": "vendor_persona" }
    ],
    "country_resolution": {
      "primary": "user_declared",
      "verify_against": ["ip_geolocation", "document_country"],
      "on_mismatch": "manual_review"
    }
  }
}
```

Pick this when
You serve 3+ jurisdictions with materially different document landscapes (e.g., DE/AT/CH video-ident plus EM countries) and your top vendor's coverage matrix has obvious gaps.
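The rule evaluation itself is a first-match scan. A minimal sketch over a trimmed version of the rule list above (first-match-wins is an assumption the config implies but doesn't state; whatever semantics you pick, document it, because rule order becomes load-bearing):

```python
RULES = [
    {"match": {"country": ["DE", "AT", "CH"]}, "vendor": "vendor_idnow"},
    {"match": {"country": ["MX", "BR", "CO", "AR"]}, "vendor": "vendor_incode"},
    {"match": {"country": "*"}, "vendor": "vendor_persona"},  # catch-all last
]

def route(country: str, rules=RULES) -> str:
    # First matching rule wins; order in the rule list is load-bearing.
    for rule in rules:
        match = rule["match"]["country"]
        if match == "*" or country in match:
            return rule["vendor"]
    raise ValueError(f"no rule matched {country!r} -- keep a '*' catch-all")
```

The border edge case from the cons list lives upstream of this function: `route()` should only ever see the country produced by `country_resolution`, after the user-declared / IP / document cross-check, never a raw IP lookup.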
Pattern 5 — Decision-engine (weighted scoring)
Each verification produces a score from multiple signals (vendor decision, vendor confidence, sanctions hit, internal risk model, device signals, behavioral analytics). A decision engine combines them into the final approve/decline/review verdict — sometimes using multiple vendors in parallel for the same user.
Pros
- Highest accuracy in mature operations; combines best-of-breed signals.
- Granular control: tune weights per cohort / jurisdiction / risk tier.
- Future-proof: adding a new signal is a weight change, not an architectural shift.
- Internal rules engine becomes the source of truth — vendors are subordinated.
Cons
- Highest complexity; requires a real risk / decisioning function to operate.
- Most expensive per check: each signal in the composite (doc, biometric, sanctions, device) is typically billed independently.
- Hardest to explain to regulators ("show me your decision logic"). You'll need versioned rules, model cards, full audit trails.
- Decision-engine ownership becomes a long-lived team commitment.
Config example (Python-shape)
```python
def decide(signals: dict, tier: str) -> Decision:
    """
    signals: {
        'vendor_a_decision': 'approved',
        'vendor_a_confidence': 0.92,
        'sanctions_hit': False,
        'pep_match': False,
        'fraud_score': 0.13,  # internal fraud model 0–1
        'device_risk': 'low',
        'doc_country_matches_ip': True,
    }
    """
    if signals['sanctions_hit']:
        return Decision.MANUAL_REVIEW  # mandatory human review on hits
    if signals['vendor_a_decision'] == 'declined':
        # Vendor A is authoritative on doc/biometric rejection
        return Decision.DECLINED
    # Composite score: 0–100, higher = more confident approval
    score = 0
    score += int(signals['vendor_a_confidence'] * 50)  # max 50
    score += 20 if signals['device_risk'] == 'low' else 0
    score += 15 if signals['doc_country_matches_ip'] else 0
    score -= int(signals['fraud_score'] * 30)  # penalty
    if signals['pep_match']:
        score -= 25
    thresholds = {
        'lite': {'approve': 60, 'review': 35},
        'full': {'approve': 75, 'review': 55},
        'kyb':  {'approve': 80, 'review': 60},
    }[tier]
    if score >= thresholds['approve']:
        return Decision.APPROVED
    if score >= thresholds['review']:
        return Decision.MANUAL_REVIEW
    return Decision.DECLINED
```

Pick this when
You're at $XXM+ revenue, have a dedicated risk function, run multiple jurisdictions, and your false-reject economics justify the engineering investment. Don't start here.
How to choose
| Pattern | Eng cost | Vendor cost | Resilience | Leverage | When to choose |
|---|---|---|---|---|---|
| Single | Low | Best | Low | None | Pre-PMF; <250K/yr; one jurisdiction |
| Waterfall | Medium | Worse (pay both) | Medium | Medium | Have measured false-reject >3% |
| A/B | Medium-High | Worse (split tier) | High | High (ongoing) | Volume supports stat-sig comparisons (≥50K/arm/month) |
| Geo-routing | High | Mixed | Medium-High | Medium | 3+ jurisdictions with no single coverage winner |
| Decision-engine | Very High | Highest | Very High | Very High | Mature risk function, >$XXM revenue |
Typical evolution path
Most companies pass through these in order:
- Year 0–1: Single vendor. Build the abstraction layer. Measure baseline.
- Year 1–2: Add a second vendor for either waterfall (if false-reject is the problem) or geo (if coverage is the problem). Many teams stop here.
- Year 2–3: A/B on top of geo, to keep both shortlist vendors honest and competitive.
- Year 3+: Build the decision engine. Vendors become signal providers, not decision-makers.
Trying to start at step 4 has burned a lot of teams: a decision engine without baseline data to tune its weights against is not measurably better than a single vendor.
Observability invariants — same for every pattern
Whichever pattern you adopt, these dashboards and audit invariants must exist before you flip the switch. They're what you use to debug, what regulators ask for, and what tells you when to switch vendors.
Per-verification audit record
Every verification — successful, failed, abandoned, retried — must produce one canonical event with the schema below. Stored append-only, retained per regulatory floor.
```json
{
  "verification_id": "ver_01HZX...",
  "user_id": "usr_01HZX...",
  "tier": "full",
  "jurisdiction": "DE",
  "orchestration_pattern": "geo_routing",
  "vendor_calls": [
    {
      "vendor": "vendor_idnow",
      "vendor_reference": "idnow-12345",
      "started_at": "2026-05-12T13:14:00Z",
      "completed_at": "2026-05-12T13:14:42Z",
      "latency_ms": 42000,
      "decision": "approved",
      "confidence": 0.94,
      "cost_cents": 185,
      "raw_response_ref": "s3://audit-raw/ver_01HZX/idnow-12345.json"
    }
  ],
  "internal_signals": {
    "fraud_score": 0.07,
    "device_risk": "low",
    "sanctions_hit": false
  },
  "final_decision": "approved",
  "final_decision_by": "decision_engine_v3.2.1",
  "decided_at": "2026-05-12T13:14:43Z",
  "audit_signature": "sha256:..."
}
```

Dashboards (minimum set)
- Funnel: started → submitted → decided → approved, by vendor, by country.
- Decision mix: approve / decline / manual-review / indeterminate, by vendor.
- Latency: p50 / p95 / p99 time-to-decision, by vendor, by decision type.
- Error rate: vendor 5xx / timeout / contract violation, by vendor, last 24h.
- Unit cost: blended cost-per-decision, by vendor, by decision type.
- Manual-review queue: depth, age, throughput, by vendor.
- Dispute rate: users who claim they were wrongly declined, by vendor.
The number-one cause of "we don't know why our completion rate dropped" is missing dashboards. Build them in the integration phase, not after the incident.
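One implementation note on the `audit_signature` field in the record above. A minimal sketch of sealing an append-only record with a content hash (the canonicalization scheme here — sorted keys, compact separators — is one reasonable choice, not a standard the record format mandates): the point is that key order must be fixed, or two logically identical records hash differently.

```python
import hashlib
import json

def sign_record(record: dict) -> str:
    # Canonical encoding: sorted keys + compact separators, so the same
    # logical record always serializes to the same bytes.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode()).hexdigest()
```

Store the signature alongside the record at write time; any later mutation of the stored event then fails re-verification, which is the property an append-only audit trail claim rests on.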