1:49:25  Stanford CME295 Transformers & LLMs | Autumn 2025 | Lecture 8 - LLM Evaluation (Stanford Online, 63.1K views)
55:02  How to Systematically Setup LLM Evals (Metrics, Unit Tests, LLM-as-a-Judge) (Dave Ebbelaar, 56.4K views)
5:50  7 Popular LLM Benchmarks Explained [OpenLLM Leaderboard & Chatbot Arena] (bycloud, 28.3K views)
9:19  LLM Benchmarking | How one LLM is tested against another? | LLM Evaluation Benchmarks | Simplilearn (Simplilearn, 2.7K views)
30:56  What Do LLM Benchmarks Actually Tell Us? (+ How to Run Your Own) (Adam Lucek, 9.3K views)
10:13  LLM-as-Judge: Evaluating writing quality without ground truth (Efficient NLP, 2.3K views)
6:57  How to Choose Large Language Models: A Developer’s Guide to LLMs (IBM Technology, 105.3K views)
45:03  The Science of LLM Benchmarks: Methods, Metrics, and Meanings | LLMOps (LLMOps Space, 3.9K views)
31:48  Finding the Right Datasets and Metrics for Evaluating LLM Performance (WhyLabs, 483 views)
15:30  Don’t trust LLM benchmarks - Testing OpenAI GPT 5.2 in 🤖 Agent Zero (Agent Zero, 7.7K views)
31:45  A Practical Guide to LLM Evaluation - Michelle Yi (Open Data Science and AI Conference, 359 views)
26:37  Intel Arc Pro B70 (32GB) for Local LLMs: llama.cpp (SYCL/Vulkan), vLLM (Intel LLM Scaler) Benchmarks (Donato Capitella, 22.0K views)