1:47 · Make Large Language Models 4× Faster! Jacobi Forcing for Causal Parallel Decoding Explained · AITech_Trends · 14 views
9:39 · Faster LLMs: Accelerate Inference with Speculative Decoding · IBM Technology · 26.1K views
23:32 · Parallel Decoding: New Standard for Fast LLM Inference. Jacobi Iterations, Multi-Token Prediction. · Byte Goose AI. · 1.8K views
12:45 · Speculative Decoding & Inference Speed — 2-3x Faster LLMs With Zero Quality Loss · Jeff Heidelberger · 2 views
15:15 · How to make LLMs fast: KV Caching, Speculative Decoding, and Multi-Query Attention | Cursor Team · Lex Clips · 13.8K views
6:53 · How Speculative Decoding Makes LLMs 2.5x Faster (The Secret to Faster AI) · FranksWorld of AI · 165 views
11:18 · [2024 Best AI Paper] CLLMs: Consistency Large Language Models · Paper With Video · 46 views
13:32 · "Fast LLM Collaborative Decoding via Speculation" Explained (Manim Animation) | ICML 2025 · Jiale Fu · 116 views
6:00 · The Probability Bottleneck in Diffusion LLMs: Why Parallel Decoding Is Not Free · Xiaol.x · 48 views