| Main Authors: | Hussain, Syed Rafiul; McDaniel, Patrick; Gandhi, Anshul; Ghose, Kanad; Gopalan, Kartik; Lee, Dongyoon; Liu, Yu David; Liu, Zhenhua; Mu, Shuai; Zadok, Erez |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2307.11993 |
Similar Items
NCCLbpf: Verified, Composable Policy Execution for GPU Collective Communication
by: Zheng, Yusheng
Published: (2026)
LEFT-RS: A Lock-Free Fault-Tolerant Resource Sharing Protocol for Multicore Real-Time Systems
by: Chen, Nan, et al.
Published: (2025)
RAGDoll: Efficient Offloading-based Online RAG System on a Single GPU
by: Yu, Weiping, et al.
Published: (2025)
TIDAL: Recovering Temporal Phase for Cloud Block Storage Placement from LLM-Derived Semantics
by: Tan, Difan, et al.
Published: (2026)
HybridTier: an Adaptive and Lightweight CXL-Memory Tiering System
by: Song, Kevin, et al.
Published: (2023)
BLITZSCALE: Fast and Live Large Model Autoscaling with O(1) Host Caching
by: Zhang, Dingyan, et al.
Published: (2024)
TrEnv-X: Transparently Share Serverless Execution Environments Across Different Functions and Nodes
by: Huang, Jialiang, et al.
Published: (2025)
Equinox: Decentralized Scheduling for Hardware-Aware Orbital Intelligence
by: Erol, Ansel Kaplan, et al.
Published: (2026)
Mitigating context switching in densely packed Linux clusters with Latency-Aware Group Scheduling
by: Isstaif, Al Amjad Tawfiq, et al.
Published: (2025)
DPC: A Distributed Page Cache over CXL
by: Bergman, Shai, et al.
Published: (2026)
EdgeFlow: Fast Cold Starts for LLMs on Mobile Devices
by: Yan, Yongsheng, et al.
Published: (2026)
Unlocking True Elasticity for the Cloud-Native Era with Dandelion
by: Kuchler, Tom, et al.
Published: (2025)
A Periodic Space of Distributed Computing: Vision & Framework
by: Salehi, Mohsen Amini, et al.
Published: (2026)
THEMIS: Time, Heterogeneity, and Energy Minded Scheduling for Fair Multi-Tenant Use in FPGAs
by: Karabulut, Emre, et al.
Published: (2024)
Formal Definitions and Performance Comparison of Consistency Models for Parallel File Systems
by: Wang, Chen, et al.
Published: (2024)
Mewz: Lightweight Execution Environment for WebAssembly with High Isolation and Portability using Unikernels
by: Ueda, Soichiro, et al.
Published: (2024)
Taming Serverless Cold Starts Through OS Co-Design
by: Holmes, Ben, et al.
Published: (2025)
Fix: externalizing network I/O in serverless computing
by: Deng, Yuhan, et al.
Published: (2025)
Optimizing Task Scheduling in Heterogeneous Computing Environments: A Comparative Analysis of CPU, GPU, and ASIC Platforms Using E2C Simulator
by: Mohammadjafari, Ali, et al.
Published: (2024)
Performance Isolation and Semantic Determinism in Efficient GPU Spatial Sharing
by: Yang, Zhenyuan, et al.
Published: (2026)
Telepathic Datacenters: Fast RPCs using Shared CXL Memory
by: Mahar, Suyash, et al.
Published: (2024)
Equilibria: Fair Multi-Tenant CXL Memory Tiering At Scale
by: Zhao, Kaiyang, et al.
Published: (2026)
CvxCluster: Solving Large, Complex, Granular Resource Allocation Problems 100-1000x Faster
by: Nnorom Jr, Obi, et al.
Published: (2026)
Towards Efficient and Practical GPU Multitasking in the Era of LLM
by: Xing, Jiarong, et al.
Published: (2025)
Agent Centric Operating System -- a Comprehensive Review and Outlook for Operating System
by: Jia, Shian, et al.
Published: (2024)
GPUOS: A GPU Operating System Primitive for Transparent Operation Fusion
by: Yang, Yiwei, et al.
Published: (2026)
"Range as a Key" is the Key! Fast and Compact Cloud Block Store Index with RASK
by: Zhao, Haoru, et al.
Published: (2026)
Funky: Cloud-Native FPGA Virtualization and Orchestration
by: Koshiba, Atsushi, et al.
Published: (2025)
Ensuring Data Freshness in Multi-Rate Task Chains Scheduling
by: Hoffmann, José Luis Conradi, et al.
Published: (2026)
Rethinking Inter-Process Communication with Memory Operation Offloading
by: Park, Misun, et al.
Published: (2026)
Why iCloud Fails: The Category Mistake of Cloud Synchronization
by: Borrill, Paul
Published: (2026)
Rethinking Thread Scheduling under Oversubscription: A User-Space Framework for Coordinating Multi-runtime and Multi-process Workloads
by: Roca, Aleix, et al.
Published: (2026)
Performance Isolation for Inference Processes in Edge GPU Systems
by: Martín, Juan José, et al.
Published: (2026)
Characterizing Metastable Faults and Failures
by: Farahbakhsh, Ali, et al.
Published: (2026)
Dirigent: Lightweight Serverless Orchestration
by: Cvetković, Lazar, et al.
Published: (2024)
Toward Systems Foundations for Agentic Exploration
by: Xu, Jiakai, et al.
Published: (2025)
PhoenixOS: Concurrent OS-level GPU Checkpoint and Restore with Validated Speculation
by: Wei, Xingda, et al.
Published: (2024)
Nixie: Efficient, Transparent Temporal Multiplexing for Consumer GPUs
by: Xu, Yechen, et al.
Published: (2026)
ContiguousKV: Accelerating LLM Prefill with Granularity-Aligned KV Cache Management
by: Zou, Jing, et al.
Published: (2026)
Idiosyncrasies of Programmable Caching Engines
by: Peixoto, José, et al.
Published: (2026)