Under the Hood: Building a Low-Cost, Planet-Scale Compliance Pipeline on Azure
Reading time: 8 min
Guiding Principles
- Spend ≠ Success – Burn only what validates product-market fit.
- Containers Everywhere – Identical images lift straight from dev to AKS.
- Immutable Evidence – If it can be deleted, it can be subpoenaed.
Key Components & Cost Drivers
| # | Component | SKU / unit price | Cost @ 20k scans (USD) | Notes |
|---|---|---|---|---|
| 1 | API Management (Consumption) | $0.03/M calls | $0.60 | Rate-limits & keys |
| 2 | GPU Container App (T4) | pay-per-second | $0.04 | Scales → 0 |
| 3 | Postgres B1ms | 1 vCPU | $12.00 | RLS & partitioning |
| 4 | Event Hub Basic | $0.028/M events | $1.12 | Telemetry & triggers |
| | Total | | ≈ $17-18 | West Europe prices; the four rows above sum to $13.76 |
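The arithmetic behind the table can be sanity-checked in a few lines. The sketch below is not a billing model: it takes the per-row costs and unit prices from the table as given, sums them, and backs out the call and event volumes that the two metered rows imply. The names `ROWS`, `UNIT_PRICE_PER_MILLION`, and `implied_millions` are hypothetical and exist only for illustration.

```python
# Per-row costs copied from the table (USD at 20k scans).
ROWS = {
    "API Management (Consumption)": 0.60,
    "GPU Container App (T4)": 0.04,
    "Postgres B1ms": 12.00,
    "Event Hub Basic": 1.12,
}

# Unit prices from the table, in USD per million units.
UNIT_PRICE_PER_MILLION = {
    "API Management (Consumption)": 0.03,
    "Event Hub Basic": 0.028,
}


def implied_millions(cost: float, price_per_million: float) -> float:
    """Volume, in millions of units, that yields `cost` at the given unit price."""
    return cost / price_per_million


total = round(sum(ROWS.values()), 2)
print(f"Sum of listed rows: ${total:.2f}")

for name, price in UNIT_PRICE_PER_MILLION.items():
    volume = implied_millions(ROWS[name], price)
    print(f"{name}: ~{volume:.0f}M units implied")
```

At the listed unit prices, $0.60 and $1.12 correspond to about 20M API calls and 40M events, which is roughly 1,000 calls and 2,000 events per scan at 20k scans. Treat that fan-out as a figure to confirm against real traffic before relying on the estimate.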
Latency & Scale Benchmarks
- p95 latency: 780 ms for 5 MB JPEGs.
- Throughput: sustains 65 req/s before the GPU saturates; moving to AKS raises headroom 10×.
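These two numbers translate into rough capacity figures. The sketch below is a back-of-the-envelope estimate under two assumptions that the benchmark does not state: that every request at saturation carries a 5 MB payload, and that p95 latency can stand in for the mean in Little's law (L = λ · W), which typically overstates concurrency. The function names are hypothetical.

```python
def ingress_mb_per_s(req_per_s: float, payload_mb: float) -> float:
    """Upload bandwidth needed to sustain the given request rate."""
    return req_per_s * payload_mb


def in_flight_estimate(req_per_s: float, latency_s: float) -> float:
    """Little's law (L = lambda * W): requests in flight at a given latency."""
    return req_per_s * latency_s


SATURATION_RPS = 65    # from the benchmark
P95_LATENCY_S = 0.780  # from the benchmark, used as a pessimistic W
PAYLOAD_MB = 5         # assumed for every request

bandwidth = ingress_mb_per_s(SATURATION_RPS, PAYLOAD_MB)
in_flight = in_flight_estimate(SATURATION_RPS, P95_LATENCY_S)

print(f"Ingress at saturation: {bandwidth:.0f} MB/s")
print(f"Requests in flight (estimate): {in_flight:.1f}")
```

Under those assumptions, saturation means about 325 MB/s of ingress and around 50 requests in flight, so network and queue limits deserve a look alongside the GPU.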
Migration Path in One Terraform Diff
Switching from azurerm_container_app to azurerm_kubernetes_cluster_node_pool is a single variable toggle. Secrets remain in Key Vault, and the detectors keep running unchanged.
Lessons Learned
- GPU Spot Instances slash inference cost, but you must schedule a hard-shutdown cron job to avoid zombie charges.
- An in-process Caffeine cache beats Redis on cost when traffic is bursty (<10 req/min off-peak), because there is no separately billed cache instance sitting idle between bursts.
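The in-process cache idea is easy to illustrate. Caffeine is a Java library, and the sketch below is not its API: it is a minimal Python analogue of a size-bounded cache with per-entry expiry, using plain LRU eviction as a simplification. The class name `TTLCache`, its parameters, and the `scan:abc` key are all hypothetical.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable, Optional


class TTLCache:
    """Tiny in-process cache with a size bound and per-entry expiry."""

    def __init__(self, max_size: int, ttl_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_size = max_size
        self._ttl_s = ttl_s
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: "OrderedDict[Hashable, tuple]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        item = self._entries.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: Hashable, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl_s, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # evict least recently used


# Usage: cache a scan verdict for five minutes (hypothetical key and value).
cache = TTLCache(max_size=1000, ttl_s=300)
cache.put("scan:abc", {"verdict": "pass"})
print(cache.get("scan:abc"))
```

Because the cache lives inside the service process, it costs nothing while the container is scaled to zero. The trade-off is that it starts empty after every cold start and is not shared across replicas.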
Call to Action
Clone the public terraform-bootstrap repo and push your first compliance scan for pennies.