| Main Authors: | Holmes, Connor, Tanaka, Masahiro, Wyatt, Michael, Awan, Ammar Ahmad, Rasley, Jeff, Rajbhandari, Samyam, Aminabadi, Reza Yazdani, Qin, Heyang, Bakhtiari, Arash, Kurilenko, Lev, He, Yuxiong |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2401.08671 |
Similar Items
DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency via Efficient Data Sampling and Routing
by: Li, Conglong, et al.
Published: (2022)
Scaling Vision Transformers: Evaluating DeepSpeed for Image-Centric Workloads
by: Trinh, Huy, et al.
Published: (2026)
Shift Parallelism: Low-Latency, High-Throughput LLM Inference for Dynamic Workloads
by: Hidayetoglu, Mert, et al.
Published: (2025)
Arctic Long Sequence Training: Scalable And Efficient Training For Multi-Million Token Sequences
by: Bekman, Stas, et al.
Published: (2025)
Arctic Inference with Shift Parallelism: Fast and Efficient Open Source Inference System for Enterprise AI
by: Rajbhandari, Samyam, et al.
Published: (2025)
SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation
by: Qiao, Aurick, et al.
Published: (2024)
MoE-Prefill: Zero Redundancy Overheads in MoE Prefill Serving
by: Su, Zhaoyuan, et al.
Published: (2026)
Fast and Accurate Causal Parallel Decoding using Jacobi Forcing
by: Hu, Lanxiang, et al.
Published: (2025)
A Semi Centralized Training Decentralized Execution Architecture for Multi Agent Deep Reinforcement Learning in Traffic Signal Control
by: Rezaali, Arash, et al.
Published: (2025)
An Empirical Investigation of Speed Patterns on S‐Curves Using Naturalistic Driving Data and Mixed Logit Model
by: Cailin Lei, et al.
Published: (2025)
OWL: Overcoming Window Length-Dependence in Speculative Decoding for Long-Context Inputs
by: Lee, Jaeseong, et al.
Published: (2025)
Flow Matching for Medical Image Synthesis: Bridging the Gap Between Speed and Quality
by: Yazdani, Milad, et al.
Published: (2025)
Universal Checkpointing: A Flexible and Efficient Distributed Checkpointing System for Large-Scale DNN Training with Reconfigurable Parallelis
by: Lian, Xinyu, et al.
Published: (2024)
ENTREVISTA COM RUBENS FIGUEIREDO
by: Alexey Kurilenko
Published: (2022)
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design
by: Xia, Haojun, et al.
Published: (2024)
c: Not the Speed of Light. The Speed of Reality.
by: Holub, Pavel
Published: (2026)
Speeding Up Optimization-based Motion Planning through Deep Learning
by: Tenhumberg, Johannes, et al.
Published: (2023)
FastKernels: Benchmarking GPU Kernel Generation in Production
by: Oliaro, Gabriele, et al.
Published: (2026)
AssetGen: Deployable 3D Asset Generation at Interactive Speed
by: Wang, Dilin, et al.
Published: (2026)
BrainVoxGen: Deep learning framework for synthesis of Ultrasound to MRI
by: Singh, Shubham, et al.
Published: (2023)
Tibial Strains are Sensitive to Speed, but not Grade, Perturbations During Running
by: Baggaley, Michael, et al.
Published: (2023)
Infinitely growing configurations in Emil Post's tag system problem
by: Kurilenko, Nikita V.
Published: (2021)
Prompt-MII: Meta-Learning Instruction Induction for LLMs
by: Xiao, Emily, et al.
Published: (2025)
FastPersist: Accelerating Model Checkpointing in Deep Learning
by: Wang, Guanhua, et al.
Published: (2024)
Tube Loss based Deep Networks For Improving the Probabilistic Forecasting of Wind Speed
by: Anand, Pritam, et al.
Published: (2025)
Achieving Stable High-Speed Locomotion for Humanoid Robots with Deep Reinforcement Learning
by: Zhang, Xinming, et al.
Published: (2024)
Labor Market Effects of the Venezuelan Refugee Crisis in Brazil
by: Sant'Anna, Hugo, et al.
Published: (2023)
Speed is Confidence
by: Dillon, Joshua V.
Published: (2026)
Deep Learning for High Speed Optical Coherence Elastography with a Fiber Scanning Endoscope
by: Neidhardt, Maximilian, et al.
Published: (2025)
Deep Learning Methods for Adjusting Global MFD Speed Estimations to Local Link Configurations
by: Jin, Zhixiong, et al.
Published: (2024)
TAG‐SPARK: Empowering High‐Speed Volumetric Imaging With Deep Learning and Spatial Redundancy
by: Yin‐Tzu Hsieh, et al.
Published: (2024)
Federated Timeline Synthesis: Scalable and Private Methodology For Model Training and Deployment
by: Renc, Pawel, et al.
Published: (2025)
On the ill-posed Cauchy problem for the polyharmonic heat equation
by: Kurilenko, Ilya, et al.
Published: (2022)
A Dual-Motor Actuator for Ceiling Robots with High Force and High Speed Capabilities
by: Lalonde, Ian, et al.
Published: (2024)
Modeling Driver Behavior in Speed Advisory Systems: Koopman-based Approach with Online Update
by: Ozkan, Mehmet Fatih, et al.
Published: (2025)
Wind as Driver of Bird and Bat Abundance, Flight Direction, Altitude, and Speed on the North Atlantic Shelf
by: Snortland, Abigale, et al.
Published: (2025)
Improvement of Speed Limits: Quantum Effect on the Speed in Open Quantum Systems
by: Sekiguchi, Kotaro, et al.
Published: (2024)
Detecting Car Speed using Object Detection and Depth Estimation: A Deep Learning Framework
by: Dasgupta, Subhasis, et al.
Published: (2024)
Analysis of Centrifugal Clutches in Two-Speed Automatic Transmissions with Deep Learning-Based Engagement Prediction
by: Lin, Bo-Yi, et al.
Published: (2024)
Short‐Term Wind Speed Prediction Model Based on Hybrid Decomposition Method and Deep Learning
by: Xueqiong Yuan, et al.
Published: (2025)