:: Library Catalog

Bibliographic Details
Main Authors:	Wang, Yuan; Li, Mingyu; Chen, Haibo
Format:	Preprint
Published:	2025
Subjects:	Operating Systems; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2510.04607

Similar Items

OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents
by: Abhyankar, Reyna, et al.
Published: (2025)

Semantic Scheduling for LLM Inference
by: Hua, Wenyue, et al.
Published: (2025)

Towards Agentic OS: An LLM Agent Framework for Linux Schedulers
by: Zheng, Yusheng, et al.
Published: (2025)

EVICPRESS: Joint KV-Cache Compression and Eviction for Efficient LLM Serving
by: Feng, Shaoting, et al.
Published: (2025)

Neuralink: Fast LLM Inference on Smartphones with Neuron Co-Activation Linking
by: Wang, Tuowei, et al.
Published: (2024)

AdaptCache: KV Cache Native Storage Hierarchy for Low-Delay and High-Quality Language Model Serving
by: Feng, Shaoting, et al.
Published: (2025)

Preparation Meets Opportunity: Enhancing Data Preprocessing for ML Training With Seneca
by: Desai, Omkar, et al.
Published: (2025)

An Integrated Artificial Intelligence Operating System for Advanced Low-Altitude Aviation Applications
by: Tan, Minzhe, et al.
Published: (2024)

AgentCgroup: Understanding and Controlling OS Resources of AI Agents
by: Zheng, Yusheng, et al.
Published: (2026)

Sawtooth Wavefront Reordering: Enhanced CuTile FlashAttention on NVIDIA GB10
by: Zhu, Yifan, et al.
Published: (2026)

TClone: Low-Latency Forking of Live GUI Environments for Computer-Use Agents
by: Huang, Yutong, et al.
Published: (2026)

AIOS: LLM Agent Operating System
by: Mei, Kai, et al.
Published: (2024)

Enhancing Battery Storage Energy Arbitrage with Deep Reinforcement Learning and Time-Series Forecasting
by: Sage, Manuel, et al.
Published: (2024)

Secure and Efficient Access Control for Computer-Use Agents via Context Space
by: Gong, Haochen, et al.
Published: (2025)

DeltaBox: Scaling Stateful AI Agents with Millisecond-Level Sandbox Checkpoint/Rollback
by: Dong, Yunpeng, et al.
Published: (2026)

LiteCUA: Computer as MCP Server for Computer-Use Agent on AIOS
by: Mei, Kai, et al.
Published: (2025)

UFO2: The Desktop AgentOS
by: Zhang, Chaoyun, et al.
Published: (2025)

Hardware-Assisted Virtualization of Neural Processing Units for Cloud Platforms
by: Xue, Yuqi, et al.
Published: (2024)

PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
by: Song, Yixin, et al.
Published: (2023)

Composable OS Kernel Architectures for Autonomous Intelligence
by: Singh, Rajpreet, et al.
Published: (2025)

Leveraging Machine Learning for Accurate IoT Device Identification in Dynamic Wireless Contexts
by: Tushir, Bhagyashri, et al.
Published: (2024)

Energy-Efficient Computation with DVFS using Deep Reinforcement Learning for Multi-Task Systems in Edge Computing
by: Li, Xinyi, et al.
Published: (2024)

VUDA: Breaking CUDA-Vulkan Isolation for Spatial Sharing of Compute and Graphics on the Same GPU
by: Xu, Bin, et al.
Published: (2026)

Herding LLaMaS: Using LLMs as an OS Module
by: Kamath, Aditya K, et al.
Published: (2024)

Crash-Consistent Checkpointing for AI Training on macOS/APFS
by: Jeon, Juha
Published: (2025)

LithOS: An Operating System for Efficient Machine Learning on GPUs
by: Coppock, Patrick H., et al.
Published: (2025)

Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores
by: Wu, Chenpeng, et al.
Published: (2025)

ConsumerBench: Benchmarking Generative AI Applications on End-User Devices
by: Gu, Yile, et al.
Published: (2025)

Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models
by: Kamahori, Keisuke, et al.
Published: (2024)

Crab: A Semantics-Aware Checkpoint/Restore Runtime for Agent Sandboxes
by: Wu, Tianyuan, et al.
Published: (2026)

When eBPF Meets Machine Learning: On-the-fly OS Kernel Compartmentalization
by: Wang, Zicheng, et al.
Published: (2024)

MaLV-OS: Rethinking the Operating System Architecture for Machine Learning in Virtualized Clouds
by: Bitchebe, Stella, et al.
Published: (2025)

SemaTune: Semantic-Aware Online OS Tuning with Large Language Models
by: Liargkovas, Georgios, et al.
Published: (2026)

Skim: Speculative Execution for Fast and Efficient Web Agents
by: Wong, Mike, et al.
Published: (2026)

Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live
by: Li, Hanchen, et al.
Published: (2025)

NaSh: Guardrails for an LLM-Powered Natural Language Shell
by: Gyawali, Bimal Raj, et al.
Published: (2025)

FlexInfer: Breaking Memory Constraint via Flexible and Efficient Offloading for On-Device LLM Inference
by: Du, Hongchao, et al.
Published: (2025)

TrustAgent: Towards Safe and Trustworthy LLM-based Agents
by: Hua, Wenyue, et al.
Published: (2024)

Quine: Realizing LLM Agents as Native POSIX Processes
by: Ke, Hao
Published: (2026)

Cache-Craft: Managing Chunk-Caches for Efficient Retrieval-Augmented Generation
by: Agarwal, Shubham, et al.
Published: (2025)