12:53 | ISCA'25 - Session 5B - MeshSlice: Efficient 2D Tensor Parallelism for Distributed DNN Training | ACM SIGARCH | 4 views
30:05 | Scale ANY Model: PyTorch DDP, ZeRO, Pipeline & Tensor Parallelism Made Simple (2025 Guide) | Zachary Mueller | 1.5K views
13:11 | ISCA'25 - Session 5B - Ecco: Improving Memory Bandwidth and Capacity for LLMs via Entropy-Aware Cach | ACM SIGARCH | 19 views
17:03 | ISCA'25 - Session 5C - ATiM: Autotuning Tensor Programs for Processing-in-DRAM | ACM SIGARCH | 4 views
15:35 | ISCA'25 - Session 6A - Transitive Array: An Efficient GEMM Accelerator with Result Reuse | ACM SIGARCH | 1 view
19:48 | ISCA'25 - Session 7A - DiTile-DGNN: An Efficient Accelerator for Distributed Dynamic Graph Neural Ne | ACM SIGARCH | 4 views
1:24:42 | Stanford CS336 Language Modeling from Scratch | Spring 2025 | Lecture 7: Parallelism 1 | Stanford Online | 43.8K views
14:09 | ISCA'25 - Session 6A - Phi: Leveraging Pattern-based Hierarchical Sparsity for High-Efficiency Spiki | ACM SIGARCH | 2 views