0:36 | Efficient GPU-based Decompression of BTF Data Compressed using Multi-Level Vector Quantization | CESCG | 329 views
9:58 | CS680: Data Transfer With CPU Compression and GPU Decompression | Robert Xing | 135 views
19:46 | Quantization vs Pruning vs Distillation: Optimizing NNs for Inference | Efficient NLP | 65.4K views
6:49 | TurboQuant Explained: Online Vector Quantization with Near-Optimal Distortion for LLMs | mathtartic | 453 views
13:53 | Residual Vector Quantization for Audio and Speech Embeddings | Efficient NLP | 11.4K views
21:04 | LLM Context & Memory Compression: How to Achieve Lossless Speed. | Byte Goose AI. | 549 views
17:24 | Improving LLM Throughput via Data Center-Scale Inference Optimizations | NVIDIA Developer | 1.6K views
0:40 | A GPU-based Approach for Massive Model Rendering with Frame-to-Frame Coherence | Yong Cao | 366 views
6:12 | BTF Prediction Model using Unsupervised Learning | Computer Science & IT Conference Proceedings | 200 views