18:48 · NEW: Unlimited Token Length for LLMs by Microsoft (LongNet explained) · Discover AI · 3.5K views
6:52 · Tokens vs Embeddings – what are they + how are they different? · Annie Sexton · 35.7K views
11:34 · Generate 10 Tokens At Once - Faster LLM INFERENCE - AdaSPEC - Speculative Decoding Improvement · Vuk Rosić · 520 views
1:59 · The Local LLM Lie Nobody Talks About: Why "Tokens Per Second" is a Scam for AI Agents · Veselin Vasilev · 989 views
18:32 · AI Cost Optimization | Episode_05 | Reduce Output Tokens- Max_Tokens · Human Mimics AI · 26 views
11:53 · Greedy? Min-p? Beam Search? How LLMs Actually Pick Words – Decoding Strategies Explained · AI Coffee Break with Letitia · 7.0K views
17:39 · AI Cost Optimization | Episode_04 | Reduce Output Tokens Using System Prompts · Human Mimics AI · 31 views