7:40 · Speculative Decoding: 3× Faster LLM Inference with Zero Quality Loss · Tales Of Tensors · 1.6K views
40:19 · Speculation is all you need: Intro to Speculative Decoding for High Performance Inference · Modal · 891 views
4:53 · What is Speculative Decoding? making LLMs faster · Data Science in your pocket · 60 views
12:25 · Speculative Decoding: Faster Inference for Transformers and LLMs · The Clue Matrix · 15 views
12:45 · Speculative Decoding & Inference Speed — 2-3x Faster LLMs With Zero Quality Loss · Jeff Heidelberger · 3 views
7:52 · Accelerating LLM Inference on TPUs via Diffusion Speculative Decoding · Knut Jägersberg · 11 views
10:14 · MLX India Community Meetup 1 | Boosting local model performance - Speculative decoding with DFlash · Conscious Engines · 96 views
11:04 · 11 - Finding collisions among thousands of objects blazing fast · Ten Minute Physics · 38.0K views