
Instructor-Led Training
Model Optimization & Deployment – vLLM, TGI, Quantization
The definitive 3-day bootcamp that turns DevOps and platform engineers into masters of high-performance, low-cost LLM serving.
Model Optimization & Deployment – vLLM, TGI, Quantization Course Overview
Using your own fine-tuned private models (from AI-301), you will quantize, optimize, and deploy them to serve 100+ tokens/sec on a single GPU or 1,000+ concurrent users on modest hardware, using vLLM, Text Generation Inference (TGI), and the latest 2025 quantization techniques. Every student leaves with a production-grade inference service up and running.
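For a taste of the tooling covered, here is a minimal sketch of serving a quantized checkpoint with vLLM's offline Python API. The model name, quantization scheme, and parameter values are illustrative placeholders, not the exact course lab.

    from vllm import LLM, SamplingParams

    # Load an AWQ-quantized checkpoint; vLLM handles batching and the paged KV cache.
    # "your-org/your-finetuned-model-awq" is a placeholder for your own model.
    llm = LLM(
        model="your-org/your-finetuned-model-awq",
        quantization="awq",
        gpu_memory_utilization=0.90,  # fraction of GPU memory vLLM may reserve
    )

    params = SamplingParams(temperature=0.7, max_tokens=256)
    outputs = llm.generate(["Summarize why quantization cuts serving cost."], params)
    print(outputs[0].outputs[0].text)

In the course you extend this kind of setup into a full online serving deployment and compare it against TGI on throughput, latency, and cost.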