[9:39] Faster LLMs: Accelerate Inference with Speculative Decoding (IBM Technology, 26.4K views)
[22:36] MASSIVELY speed up local AI models with Speculative Decoding in LM Studio (GosuCoder, 21.2K views)
[7:48] Why using a dumb language model can speed up a smarter one: Speculative Decoding [Lecture] (Jordan Boyd-Graber, 235 views)
[40:19] Speculation is all you need: Intro to Speculative Decoding for High Performance Inference (Modal, 884 views)
[4:53] What is Speculative Decoding? making LLMs faster (Data Science in your pocket, 60 views)
[15:15] How to make LLMs fast: KV Caching, Speculative Decoding, and Multi-Query Attention | Cursor Team (Lex Clips, 13.8K views)
[8:44] How to PROPERLY Use Speculative Decoding in LM Studio to DOUBLE Your AI Speed (AsapGuide, 3.9K views)
[7:40] Speculative Decoding: 3× Faster LLM Inference with Zero Quality Loss (Tales Of Tensors, 1.6K views)
[6:53] How Speculative Decoding Makes LLMs 2.5x Faster (The Secret to Faster AI) (FranksWorld of AI, 165 views)
[1:51] MTP Speculative Decoding Explained: How AI Models Generate Faster (Tyrel Barstow, 9 views)
[40:32] ML Performance Reading Group 23: DFlash: Block Diffusion for Flash Speculative Decoding (EleutherAI, 603 views)
[8:58] Speculative Decoding Part 1: Why and how can a smaller LLM accelerate a bigger LLM? (Rohithp, 106 views)
[11:34] Generate 10 Tokens At Once - Faster LLM INFERENCE - AdaSPEC - Speculative Decoding Improvement (Vuk Rosić, 525 views)