7:28 | DFlash: Block Diffusion for Flash Speculative Decoding, Doubles Token Per Second for Qwen 27b | Knut Jägersberg | 88 views
4:48 | GitHub - z-lab/dflash: DFlash: Block Diffusion for Flash Speculative Decoding | GitHub Daily Trend AI Podcast | 6 views
11:53 | Ep 34: Qwen3.6-27B paired with llama.cpp speculative decoding delivers 10x token speedups in real... | Nerra Network | 38 views
10:06 | DFlash Leaves Qwen Territory - Gemma 4 31B Now Runs 5x Faster with Speculative Decoding | Fahd Mirza | 5.5K views
9:01 | Running a 27B model at 130 tokens sec on a single GPU Locally with Luce DFlash | Fahd Mirza | 10.0K views
3:55 | Qwen3.5-27B inference speed increased by 5 times! DFlash diffusion model enables large local mode... | 鲲鹏Talk | 1.1K views
7:17 | DFlash Deep Dive: Block Diffusion Makes LLM Inference 6x Faster | Enchanted Storytime | 435 views
8:27 | 600 Toks/Second Gemma4-26B —The Setting That Actually Wins (vLLM + Dflash Speculative Decoding) | Tech-Practice | 4.0K views
15:31 | PFlash + Qwen3.6-27B-DFlash: 10x Faster Prefill on a Single GPU: Run Locally | Fahd Mirza | 9.4K views
17:20 | TripoSplat & QWEN w. Lora - Move To Any New Camera Angle Fast | Mark DK Berry | 982 views