| Main Authors: | Park, Misun; Dubey, Richi; Yuan, Yifan; Kim, Nam Sung; Gavrilovska, Ada |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.06331 |
Similar Items
Nexus: Transparent I/O Offloading for High-Density Serverless Computing
by: Park, JooYoung, et al.
Published: (2026)
RAGDoll: Efficient Offloading-based Online RAG System on a Single GPU
by: Yu, Weiping, et al.
Published: (2025)
GPUOS: A GPU Operating System Primitive for Transparent Operation Fusion
by: Yang, Yiwei, et al.
Published: (2026)
Mercury: QoS-Aware Tiered Memory System
by: Lu, Jiaheng, et al.
Published: (2024)
Agent Centric Operating System -- a Comprehensive Review and Outlook for Operating System
by: Jia, Shian, et al.
Published: (2024)
Rethinking Thread Scheduling under Oversubscription: A User-Space Framework for Coordinating Multi-runtime and Multi-process Workloads
by: Roca, Aleix, et al.
Published: (2026)
Equilibria: Fair Multi-Tenant CXL Memory Tiering At Scale
by: Zhao, Kaiyang, et al.
Published: (2026)
Telepathic Datacenters: Fast RPCs using Shared CXL Memory
by: Mahar, Suyash, et al.
Published: (2024)
HybridTier: an Adaptive and Lightweight CXL-Memory Tiering System
by: Song, Kevin, et al.
Published: (2023)
Peformance Isolation for Inference Processes in Edge GPU Systems
by: Martín, Juan José, et al.
Published: (2026)
NCCLbpf: Verified, Composable Policy Execution for GPU Collective Communication
by: Zheng, Yusheng
Published: (2026)
Towards Efficient and Practical GPU Multitasking in the Era of LLM
by: Xing, Jiarong, et al.
Published: (2025)
GPUVM: GPU-driven Unified Virtual Memory
by: Nazaraliyev, Nurlan, et al.
Published: (2024)
ipc_shared_ptr: A Publish/Subscribe-Aware Smart Pointer for Cross-Process Object Lifetime Management
by: Ishikawa-Aso, Takahiro, et al.
Published: (2026)
DPC: A Distributed Page Cache over CXL
by: Bergman, Shai, et al.
Published: (2026)
EdgeFlow: Fast Cold Starts for LLMs on Mobile Devices
by: Yan, Yongsheng, et al.
Published: (2026)
A Periodic Space of Distributed Computing: Vision & Framework
by: Salehi, Mohsen Amini, et al.
Published: (2026)
Performance Isolation and Semantic Determinism in Efficient GPU Spatial Sharing
by: Yang, Zhenyuan, et al.
Published: (2026)
CvxCluster: Solving Large, Complex, Granular Resource Allocation Problems 100-1000x Faster
by: Nnorom Jr, Obi, et al.
Published: (2026)
"Range as a Key" is the Key! Fast and Compact Cloud Block Store Index with RASK
by: Zhao, Haoru, et al.
Published: (2026)
Ensuring Data Freshness in Multi-Rate Task Chains Scheduling
by: Hoffmann, José Luis Conradi, et al.
Published: (2026)
Why iCloud Fails: The Category Mistake of Cloud Synchronization
by: Borrill, Paul
Published: (2026)
Characterizing Metastable Faults and Failures
by: Farahbakhsh, Ali, et al.
Published: (2026)
Nixie: Efficient, Transparent Temporal Multiplexing for Consumer GPUs
by: Xu, Yechen, et al.
Published: (2026)
ContiguousKV: Accelerating LLM Prefill with Granularity-Aligned KV Cache Management
by: Zou, Jing, et al.
Published: (2026)
Idiosyncrasies of Programmable Caching Engines
by: Peixoto, José, et al.
Published: (2026)
Fork, Explore, Commit: OS Primitives for Agentic Exploration
by: Wang, Cong, et al.
Published: (2026)
TIDAL: Recovering Temporal Phase for Cloud Block Storage Placement from LLM-Derived Semantics
by: Tan, Difan, et al.
Published: (2026)
EdgeWeaver: Accelerating IoT Application Development Across Edge-Cloud Continuum
by: Lertpongrujikorn, Pawissanutt, et al.
Published: (2026)
LMetric: Simple is Better - Multiplication May Be All You Need for LLM Request Scheduling
by: Zhang, Dingyan, et al.
Published: (2026)
Mitigating context switching in densely packed Linux clusters with Latency-Aware Group Scheduling
by: Isstaif, Al Amjad Tawfiq, et al.
Published: (2025)
Unlocking True Elasticity for the Cloud-Native Era with Dandelion
by: Kuchler, Tom, et al.
Published: (2025)
THEMIS: Time, Heterogeneity, and Energy Minded Scheduling for Fair Multi-Tenant Use in FPGAs
by: Karabulut, Emre, et al.
Published: (2024)
Formal Definitions and Performance Comparison of Consistency Models for Parallel File Systems
by: Wang, Chen, et al.
Published: (2024)
Mewz: Lightweight Execution Environment for WebAssembly with High Isolation and Portability using Unikernels
by: Ueda, Soichiro, et al.
Published: (2024)
Taming Serverless Cold Starts Through OS Co-Design
by: Holmes, Ben, et al.
Published: (2025)
Fix: externalizing network I/O in serverless computing
by: Deng, Yuhan, et al.
Published: (2025)
Optimizing Task Scheduling in Heterogeneous Computing Environments: A Comparative Analysis of CPU, GPU, and ASIC Platforms Using E2C Simulator
by: Mohammadjafari, Ali, et al.
Published: (2024)
LEFT-RS: A Lock-Free Fault-Tolerant Resource Sharing Protocol for Multicore Real-Time Systems
by: Chen, Nan, et al.
Published: (2025)
Funky: Cloud-Native FPGA Virtualization and Orchestration
by: Koshiba, Atsushi, et al.
Published: (2025)