Machine Learning Engineer

Building systems that think at scale

Senior ML engineer with 12+ years designing large-scale learning systems — from recommendation engines serving billions of daily predictions to production LLM pipelines. Passionate about turning research into real-world impact.

LLMs · Distributed Training · MLOps · Reinforcement Learning · Feature Stores · PyTorch
12+ Years in ML
40+ Publications
1B+ Daily Predictions
18 Talks Delivered
Selected Work

Scalable ML Systems

Production systems designed for reliability, throughput, and real-world impact — from infrastructure foundations to model-level innovations.

🧠
Distributed Feature Store

Designed and deployed a low-latency feature store handling 2M+ QPS across globally distributed data centers, cutting p99 inference latency by 42%.

Apache Flink · Redis · Kubernetes · gRPC
Case Study
🤖
LLM Fine-Tuning Pipeline

Built a scalable RLHF fine-tuning framework for domain-adapted LLMs, reducing alignment training cost by 60% while maintaining benchmark parity.

PyTorch FSDP · DeepSpeed · RLHF · PEFT
Paper
Real-Time Recommendation Engine

Led architecture for a two-tower retrieval + reranking system serving 500M+ daily active users, improving CTR by 18% through contextual bandits.

TF Serving · Faiss · Bandits · A/B Testing
Blog Post
📊
ML Observability Platform

Open-sourced an end-to-end model monitoring toolkit covering data drift, concept drift, and automated retraining triggers — adopted by 200+ teams.

Evidently AI · Prometheus · Airflow · MLflow
GitHub
Thought Leadership

Speaking & Publications

Sharing ideas at the intersection of systems engineering and machine learning research.

Conference Talks
NeurIPS 2024
Scaling RLHF Beyond Human Feedback
Main track, invited speaker
MLSys 2024
Feature Stores at Petabyte Scale
Systems track, 1,200+ attendees
KDD 2023
Contextual Bandits in Production
Industry day keynote
ICML 2022
Efficient Model Monitoring at Scale
Workshop on Responsible ML
Selected Publications
arXiv · 2024
Efficient RLHF via Synthetic Constitutional Feedback
Mercer et al. — 420+ citations
JMLR · 2023
Low-Latency Feature Serving with Consistency Guarantees
Mercer, Li, Anand — best paper nominee
NeurIPS · 2022
Drift-Adaptive Online Learning for Recommender Systems
Spotlight presentation
ICML · 2021
Scalable Gradient Checkpointing for Transformer Models
Mercer et al. — 900+ citations

Let's build something remarkable

Open to advisory roles, research collaborations, conference talks, and senior engineering opportunities. Always happy to connect.

Available for new engagements