ML Production Pipeline

The problem

Training a model is a milestone, not a finish line. Serving, logging predictions, and detecting when live data drifts from training distributions are what separate a notebook from a system.

What I built

An end-to-end ML production pipeline: scikit-learn training with MLflow tracking, FastAPI serving with Prometheus metrics, Redis-backed prediction logging, and a Go drift detector that computes PSI (Population Stability Index) and KS (Kolmogorov-Smirnov) statistics over sliding windows and sends webhook alerts when distributions shift.

Why I built this

I’m learning ML by deliberate choice, extending a platform and reliability background into model operations. The same questions apply: what breaks silently, what do you measure, and who gets paged when behavior changes?

What I learned

Python excels at training and serving ergonomics; Go fits streaming statistical work without GIL contention. MLflow gives experiment reproducibility; drift detection needs careful baseline selection and window sizing or you’ll alert on noise.

Repo

Full source and design notes are on GitHub.