8:12  Optimizing GPU Parallelization for Model Inference on Databricks (VectorLab, 242 views)
33:39  Mastering LLM Inference Optimization From Theory to Cost Effective Deployment: Mark Moyou (AI Engineer, 45.7K views)
17:24  Improving LLM Throughput via Data Center-Scale Inference Optimizations (NVIDIA Developer, 1.6K views)
23:32  Scaling LLM Workloads with Serverless Batch Inference on Databricks (VectorLab, 524 views)
3:36  LLM2 Module 3 - Deployment and Hardware | 3.6 Current Best Practices (Databricks, 857 views)
22:58  Efficient Large-Scale Language Model Training on GPU Clusters (Databricks, 7.7K views)
12:33  Parallel table ingestion with a Spark Notebook (PySpark + Threading) (Dustin Vannoy, 17.3K views)
36:10  Scaling Generative AI: Batch Inference Strategies for Foundation Models (Databricks, 450 views)
10:57  Parallel Track Transformers Explained (vLLM) – Reducing GPU Sync in LLM Inference (Machine Learning with PyTorch, 91 views)
30:59  Training Distributed Deep Recurrent Neural Networks with Mixed Precision on GPU Clusters (Databricks, 643 views)
1:30:18  Apache Spark Core—Deep Dive—Proper Optimization Daniel Tomes Databricks (Databricks, 208.4K views)