11:53 · Greedy? Min-p? Beam Search? How LLMs Actually Pick Words – Decoding Strategies Explained · AI Coffee Break with Letitia · 6.9K views
10:46 · GenAI: LLM Decoding Strategies Explained | Greedy, Beam, Top-k, Top-p, Temperature, Contrastive · Baba's World · 1.7K views
9:39 · Faster LLMs: Accelerate Inference with Speculative Decoding · IBM Technology · 26.1K views
24:10 · Decoding Strategies in LLMs (Explained Simply) | How LLMs Choose the Next Token · Mrinal Rawat · 75 views
17:20 · Structured Output from LLMs: Grammars, Regex, and State Machines · Efficient NLP · 9.4K views
15:15 · How to make LLMs fast: KV Caching, Speculative Decoding, and Multi-Query Attention | Cursor Team · Lex Clips · 13.8K views
27:14 · Transformers, the tech behind LLMs | Deep Learning Chapter 5 · 3Blue1Brown · 10.3M views
7:40 · Speculative Decoding: 3× Faster LLM Inference with Zero Quality Loss · Tales Of Tensors · 1.6K views
17:52 · AI Optimization Lecture 01 - Prefill vs Decode - Mastering LLM Techniques from NVIDIA · Faradawn Yang · 14.3K views
1:47:10 · Stanford CME295 Transformers & LLMs | Autumn 2025 | Lecture 6 - LLM Reasoning · Stanford Online · 55.2K views
5:14 · LLM Tokenizers Explained: BPE Encoding, WordPiece and SentencePiece · DataMListic · 55.0K views