22:58 · Efficient Large-Scale Language Model Training on GPU Clusters · Databricks · 7.7K views
24:04 · Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM | Jared Casper · @Scale · 8.3K views
8:17 · Efficient Large Scale Language Model Training on GPU Clusters Using Megatron LM · 作業用Podcast · 18 views
37:36 · RAS: Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM - G. Perrotta · LSC - UNICAMP · 367 views
6:38 · Harnessing Kubernetes for efficient Large Language Model (LLM) training | Abdel Sghiouar · All Things Open · 328 views
56:00 · Training LLMs at Scale - Deepak Narayanan | Stanford MLSys #83 · Stanford MLSys Seminars · 16.4K views
1:22:58 · Ultimate Guide To Scaling ML Models - Megatron-LM | ZeRO | DeepSpeed | Mixed Precision · Aleksa Gordić - The AI Epiphany · 36.1K views
1:12:53 · Stanford CS231N | Spring 2025 | Lecture 11: Large Scale Distributed Training · Stanford Online · 46.1K views
14:18 · USENIX ATC '21 - ZeRO-Offload: Democratizing Billion-Scale Model Training · USENIX · 2.6K views
16:09 · NSDI '25 - Holmes: Localizing Irregularities in LLM Training with Mega-scale GPU Clusters · USENIX · 154 views
8:05 · How to Design a GPU Cluster for AI Training - The Deep Learning System Design Interview · Peetha Academy · 842 views
7:50 · Calculate ATTENTION Faster On GPU Cluster - Core Attention Disaggregation · Vuk Rosić · 143 views
4:20 · How are LLMs Trained? Distributed Training in AI (at NVIDIA) · What's AI by Louis-François Bouchard · 5.9K views
58:32 · Exploiting Parallelism in Large Scale DL Model Training: From Chips to Systems to Algorithms · NCSAatIllinois · 299 views