3:58 · LLMSurgeon: Decoding the Secret Recipes of Big Tech AI Models · Summarized Science · 1 view
15:15 · How to make LLMs fast: KV Caching, Speculative Decoding, and Multi-Query Attention | Cursor Team · Lex Clips · 13.8K views
17:20 · Structured Output from LLMs: Grammars, Regex, and State Machines · Efficient NLP · 9.4K views
21:01 · From 46% to 90%: Fine-Tuning Tiny LLMs for On-Device Agents — Cormac Brick, Google · AI Engineer · 45.3K views