
Instructor-Led Training
Model Optimization & Deployment – vLLM, TGI, Quantization
The definitive 3-day bootcamp that turns DevOps and platform engineers into masters of high-performance, low-cost LLM serving.
Model Optimization & Deployment – vLLM, TGI, Quantization Course Overview
Using your own fine-tuned private models (from AI-301), you will quantize, optimize, and deploy them to serve 100+ tokens/sec on a single GPU or 1,000+ concurrent users on modest hardware, using vLLM, Text Generation Inference (TGI), and the latest 2025 quantization techniques. Every student leaves with a production-grade inference service up and running.
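For a taste of the tooling covered, here is a minimal sketch of serving a quantized checkpoint with vLLM's offline Python API. The model name, quantization scheme, and parameter values are illustrative placeholders, not the exact course lab.

    from vllm import LLM, SamplingParams

    # Load an AWQ-quantized checkpoint; vLLM handles batching and the paged KV cache.
    # "your-org/your-finetuned-model-awq" is a placeholder for your own model.
    llm = LLM(
        model="your-org/your-finetuned-model-awq",
        quantization="awq",
        gpu_memory_utilization=0.90,  # fraction of GPU memory vLLM may reserve
    )

    params = SamplingParams(temperature=0.7, max_tokens=256)
    outputs = llm.generate(["Summarize why quantization cuts serving cost."], params)
    print(outputs[0].outputs[0].text)

In the course you extend this kind of setup into a full online serving deployment and compare it against TGI on throughput, latency, and cost.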