14:44 · Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding (MAI Paper Slop · 195 views
6:20 · I Tested the First Diffusion Reasoning LLM… It’s Insanely Fast · Skill Leap AI · 8.8K views
9:02 · DMax-Coder-16B: Diffusion LLM That Generates All Tokens at Once | Run Locally · Fahd Mirza · 4.6K views
9:06 · What is Prompt Caching? Optimize LLM Latency with AI Transformers · IBM Technology · 89.3K views
4:58 · What is vLLM? Efficient AI Inference for Large Language Models · IBM Technology · 83.2K views
9:09 · 3x Better DIFFUSION LLM - Loopholing Discrete Diffusion - Simple Trick · Vuk Rosić · 1.1K views
12:11 · LLM generates the ENTIRE output at once (world's first diffusion LLM) · Matthew Berman · 196.6K views
6:00 · The Probability Bottleneck in Diffusion LLMs: Why Parallel Decoding Is Not Free · Xiaol.x · 52 views
2:12 · Optimize, deploy, and benchmark an open-source LLM with vLLM · DeepLearningAI · 4.1K views
7:17 · DFlash Deep Dive: Block Diffusion Makes LLM Inference 6x Faster · Enchanted Storytime · 433 views