Teaching
I write about production AI architecture in three places. Notes are technical primers. Blog posts are longer pieces with real numbers. Open-source code is the working version of the patterns I write about.
This page organizes that body of work into learning paths. Each path is a curated reading order across notes, posts, and OSS, plus a structured course that's coming as the material gets deep enough to teach. If you're new to production AI, start at the top. If you're a senior engineer with a specific question, jump to the path that matches it.
How this works
Each path below pulls from material I've already shipped. Where a path's course hasn't launched yet, the underlying notes and posts still work as a self-guided reading list today. The structured course (lessons, exercises, code) lands once enough material accumulates to be worth charging for.
The paths are the same skill stack I'm climbing in public. Foundations first, then production architecture, then specialty depth. Skipping foundations shows up as a wall later.
Paths
AWS Cost Optimization for AI Workloads
For senior engineers and CTOs whose AWS bill is climbing faster than revenue. The seven levers in priority order, the math on each, and the production patterns that actually move the bill.
Voice AI Architecture on AWS Bedrock
For founders and engineers building voice products that need to feel natural. Latency budgets, model selection, multi-tenant patterns, and the failure modes that show up only in production.
- Voice AI on Bedrock at sub-300ms — what we tried and what worked (June 2026)
- Voice failure mode taxonomy from production (October 2026)
Multi-Tenant AI Architecture
For Series A-C startups whose product serves more than one customer from a shared inference backbone. Cost attribution, data isolation, tenant routing, and the patterns that don't break under load.
- Multi-tenant cost attribution: a reference architecture (July 2026)
- Cross-account AWS for AI startups (September 2026)
AI Security and Production Hardening
For engineers shipping AI to real users in environments where the cost of getting it wrong is real. Prompt injection, output sanitization, multi-tenant isolation, PII handling, IAM for AI services, and the research-side mechanisms (Constitutional AI, RLHF) that explain why models behave the way they do.
- Prompt injection in production: the three categories (planned)
- Output sanitization patterns for LLM responses (planned)
- Multi-tenant prompt isolation for Bedrock workloads (planned)
Production AI Architect — Beginner Path
For engineers new to production AI. Starts at the foundations and works up through the same skill stack the Karpathy and Hamel curricula assume. Built to be the path I wish I'd had when I started.
- Bedrock pricing model from scratch (planned)
- Token economics: input vs output, cached vs fresh (planned)
- The four Bedrock pricing modes and when each wins (planned)
LLM Inference Engineering — Intermediate Path
For senior engineers who want to understand what's happening below the API surface. Prefill vs decode, KV cache, continuous batching, quantization, the math that drives most cost decisions.
- Prefill vs decode and the cost asymmetry it creates (planned)
- KV cache mechanics for production engineers (planned)
- Continuous batching: what vLLM does and why it matters (planned)
Distributed AI Systems — Advanced Path
For staff and principal engineers building AI infrastructure at scale. Tensor parallelism, FSDP, the consensus papers underneath production systems, and the data systems patterns that compound across both classic SaaS and AI workloads.
- Designing Data-Intensive Applications, chapter by chapter (planned)
- Tensor parallelism explained for production engineers (planned)
Want to be notified when courses launch?
Follow on X or LinkedIn. Each course will get an announcement post when the first lesson ships. In the meantime, the underlying material (notes, posts, OSS) is already available at the links above.