:: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Zongpu; Dash, Pranab; Hu, Y. Charlie; Xu, Qiang; Li, Jian; Guan, Haibing
Format:	Preprint
Published:	2025
Subjects:	Operating Systems; Computation and Language
Online Access:	https://arxiv.org/abs/2507.02135

Similar Items

RTP-LLM: High-Performance Alibaba LLM Inference Engine
by: Tan, Boyu, et al.
Published: (2026)

Energy-Efficient Computation with DVFS using Deep Reinforcement Learning for Multi-Task Systems in Edge Computing
by: Li, Xinyi, et al.
Published: (2024)

Dissecting CXL Memory Performance at Scale: Analysis, Modeling, and Optimization
by: Liu, Jinshu, et al.
Published: (2024)

AIOS: LLM Agent Operating System
by: Mei, Kai, et al.
Published: (2024)

LLM as a System Service on Mobile Devices
by: Yin, Wangsong, et al.
Published: (2024)

VeriLocc: End-to-End Cross-Architecture Register Allocation via LLM
by: Jin, Lesheng, et al.
Published: (2025)

FlexInfer: Breaking Memory Constraint via Flexible and Efficient Offloading for On-Device LLM Inference
by: Du, Hongchao, et al.
Published: (2025)

SSV: Sparse Speculative Verification for Efficient LLM Inference
by: Wang, Zhibin, et al.
Published: (2026)

GoCkpt: Gradient-Assisted Multi-Step overlapped Checkpointing for Efficient LLM Training
by: Zhang, Keyao, et al.
Published: (2025)

Potential of WebAssembly for Embedded Systems
by: Wallentowitz, Stefan, et al.
Published: (2024)

OBASE: Object-Based Address-Space Engineering to Improve Memory Tiering
by: Banakar, Vinay, et al.
Published: (2026)

Getting a Handle on Unmanaged Memory
by: Wanninger, Nick, et al.
Published: (2024)

Scaling Inter-procedural Dataflow Analysis on the Cloud
by: Sun, Zewen, et al.
Published: (2024)

Tutti: Making SSD-Backed KV Cache Practical for Long-Context LLM Serving
by: Qiu, Shi, et al.
Published: (2026)

Quine: Realizing LLM Agents as Native POSIX Processes
by: Ke, Hao
Published: (2026)

Horizon-LM: A RAM-Centric Architecture for LLM Training
by: Yuan, Zhengqing, et al.
Published: (2026)

Cerebrum (AIOS SDK): A Platform for Agent Development, Deployment, Distribution, and Discovery
by: Rama, Balaji, et al.
Published: (2025)

Tidying Up the Address Space
by: Banakar, Vinay, et al.
Published: (2025)

Flare: Anomaly Diagnostics for Divergent LLM Training in GPU Clusters of Thousand-Plus Scale
by: Cui, Weihao, et al.
Published: (2025)

MNN-AECS: Energy Optimization for LLM Decoding on Mobile Devices via Adaptive Core Selection
by: Huang, Zhengxiang, et al.
Published: (2025)

Scalable and Accurate Application-Level Crash-Consistency Testing via Representative Testing
by: Gu, Yile, et al.
Published: (2025)

Assessing FIFO and Round Robin Scheduling: Effects on Data Pipeline Performance and Energy Usage
by: Choudhury, Malobika Roy, et al.
Published: (2024)

Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores
by: Wu, Chenpeng, et al.
Published: (2025)

Principled Performance Tunability in Operating System Kernels
by: Chen, Zhongjie, et al.
Published: (2025)

WebAssembly on Resource-Constrained IoT Devices: Performance, Efficiency, and Portability
by: Has, Mislav, et al.
Published: (2025)

ThunderAgent: A Simple, Fast and Program-Aware Agentic Inference System
by: Kang, Hao, et al.
Published: (2026)

Semantic Scheduling for LLM Inference
by: Hua, Wenyue, et al.
Published: (2025)

Valve: Production Online-Offline Inference Colocation with Jointly-Bounded Preemption Latency and Rate
by: Liu, Fangyue, et al.
Published: (2026)

ASC-Hook: fast and transparent system call hook for Arm
by: Shen, Yang, et al.
Published: (2024)

Compiling Away the Overhead of Race Detection
by: Paznikov, Alexey, et al.
Published: (2025)

Sockeye: a language for analyzing hardware documentation
by: Fiedler, Ben, et al.
Published: (2025)

Futureproof Static Memory Planning
by: Lamprakos, Christos, et al.
Published: (2025)

vNV-Heap: An Ownership-Based Virtually Non-Volatile Heap for Embedded Systems
by: Gerber, Markus Elias, et al.
Published: (2025)

Safe and usable kernel extensions with Rex
by: Jia, Jinghao, et al.
Published: (2025)

Towards Agentic OS: An LLM Agent Framework for Linux Schedulers
by: Zheng, Yusheng, et al.
Published: (2025)

Clove: Object-Level CXL Memory Management in Managed Runtimes
by: Son, Sam, et al.
Published: (2026)

Decoupling Vector Data and Index Storage for Space Efficiency
by: Ren, Yuanming, et al.
Published: (2026)

Towards High-Goodput LLM Serving with Prefill-decode Multiplexing
by: Chen, Yukang, et al.
Published: (2025)

Revitalising the Single Batch Environment: A 'Quest' to Achieve Fairness and Efficiency
by: Manna, Supriya, et al.
Published: (2023)

Vmem: A Lightweight Hot-Upgradable Memory Management for In-production Cloud Environment
by: Zheng, Hao, et al.
Published: (2025)