18:01 | [PLDI24] A Verified Compiler for a Functional Tensor Language | ACM SIGPLAN | 210 views
3:21 | How DDP works || Distributed Data Parallel || Quick explained | Developers Hutt | 6.1K views
5:21 | NVIDIA B300 reference design natively faces steep degradation in Model Flops Utilization (MFU). | Tensor 29 views
1:26:22 | GPU Accelerated AI and Physics-Based Structure and Ligand Based Screening of Ultra Large Libraries | MolSoft Molecules in Silico | 166 views
20:18 | LLM Inference Optimization #2: Tensor, Data & Expert Parallelism (TP, DP, EP, MoE) | Faradawn Yang | 4.4K views
18:11 | [PLDI24] Compilation of Modular and General Sparse Workspaces | ACM SIGPLAN | 333 views
15:02 | OSDI '24 - Enabling Tensor Language Model to Assist in Generating High-Performance Tensor... | USENIX | 308 views
35:06 | Using reduced numerical precision on Pascal, Volta and Turing GPUs | Sharcnet HPC | 204 views