| Main Authors: | Wang, Yuan; Li, Mingyu; Chen, Haibo |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.04607 |
Similar Items
OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents
by: Abhyankar, Reyna, et al.
Published: (2025)
Semantic Scheduling for LLM Inference
by: Hua, Wenyue, et al.
Published: (2025)
Towards Agentic OS: An LLM Agent Framework for Linux Schedulers
by: Zheng, Yusheng, et al.
Published: (2025)
EVICPRESS: Joint KV-Cache Compression and Eviction for Efficient LLM Serving
by: Feng, Shaoting, et al.
Published: (2025)
Neuralink: Fast LLM Inference on Smartphones with Neuron Co-Activation Linking
by: Wang, Tuowei, et al.
Published: (2024)
AdaptCache: KV Cache Native Storage Hierarchy for Low-Delay and High-Quality Language Model Serving
by: Feng, Shaoting, et al.
Published: (2025)
Preparation Meets Opportunity: Enhancing Data Preprocessing for ML Training With Seneca
by: Desai, Omkar, et al.
Published: (2025)
An Integrated Artificial Intelligence Operating System for Advanced Low-Altitude Aviation Applications
by: Tan, Minzhe, et al.
Published: (2024)
AgentCgroup: Understanding and Controlling OS Resources of AI Agents
by: Zheng, Yusheng, et al.
Published: (2026)
Sawtooth Wavefront Reordering: Enhanced CuTile FlashAttention on NVIDIA GB10
by: Zhu, Yifan, et al.
Published: (2026)
TClone: Low-Latency Forking of Live GUI Environments for Computer-Use Agents
by: Huang, Yutong, et al.
Published: (2026)
AIOS: LLM Agent Operating System
by: Mei, Kai, et al.
Published: (2024)
Enhancing Battery Storage Energy Arbitrage with Deep Reinforcement Learning and Time-Series Forecasting
by: Sage, Manuel, et al.
Published: (2024)
Secure and Efficient Access Control for Computer-Use Agents via Context Space
by: Gong, Haochen, et al.
Published: (2025)
DeltaBox: Scaling Stateful AI Agents with Millisecond-Level Sandbox Checkpoint/Rollback
by: Dong, Yunpeng, et al.
Published: (2026)
LiteCUA: Computer as MCP Server for Computer-Use Agent on AIOS
by: Mei, Kai, et al.
Published: (2025)
UFO2: The Desktop AgentOS
by: Zhang, Chaoyun, et al.
Published: (2025)
Hardware-Assisted Virtualization of Neural Processing Units for Cloud Platforms
by: Xue, Yuqi, et al.
Published: (2024)
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
by: Song, Yixin, et al.
Published: (2023)
Composable OS Kernel Architectures for Autonomous Intelligence
by: Singh, Rajpreet, et al.
Published: (2025)
Leveraging Machine Learning for Accurate IoT Device Identification in Dynamic Wireless Contexts
by: Tushir, Bhagyashri, et al.
Published: (2024)
Energy-Efficient Computation with DVFS using Deep Reinforcement Learning for Multi-Task Systems in Edge Computing
by: Li, Xinyi, et al.
Published: (2024)
VUDA: Breaking CUDA-Vulkan Isolation for Spatial Sharing of Compute and Graphics on the Same GPU
by: Xu, Bin, et al.
Published: (2026)
Herding LLaMaS: Using LLMs as an OS Module
by: Kamath, Aditya K, et al.
Published: (2024)
Crash-Consistent Checkpointing for AI Training on macOS/APFS
by: Jeon, Juha
Published: (2025)
LithOS: An Operating System for Efficient Machine Learning on GPUs
by: Coppock, Patrick H., et al.
Published: (2025)
Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores
by: Wu, Chenpeng, et al.
Published: (2025)
ConsumerBench: Benchmarking Generative AI Applications on End-User Devices
by: Gu, Yile, et al.
Published: (2025)
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models
by: Kamahori, Keisuke, et al.
Published: (2024)
Crab: A Semantics-Aware Checkpoint/Restore Runtime for Agent Sandboxes
by: Wu, Tianyuan, et al.
Published: (2026)
When eBPF Meets Machine Learning: On-the-fly OS Kernel Compartmentalization
by: Wang, Zicheng, et al.
Published: (2024)
MaLV-OS: Rethinking the Operating System Architecture for Machine Learning in Virtualized Clouds
by: Bitchebe, Stella, et al.
Published: (2025)
SemaTune: Semantic-Aware Online OS Tuning with Large Language Models
by: Liargkovas, Georgios, et al.
Published: (2026)
Skim: Speculative Execution for Fast and Efficient Web Agents
by: Wong, Mike, et al.
Published: (2026)
Continuum: Efficient and Robust Multi-Turn LLM Agent Scheduling with KV Cache Time-to-Live
by: Li, Hanchen, et al.
Published: (2025)
NaSh: Guardrails for an LLM-Powered Natural Language Shell
by: Gyawali, Bimal Raj, et al.
Published: (2025)
FlexInfer: Breaking Memory Constraint via Flexible and Efficient Offloading for On-Device LLM Inference
by: Du, Hongchao, et al.
Published: (2025)
TrustAgent: Towards Safe and Trustworthy LLM-based Agents
by: Hua, Wenyue, et al.
Published: (2024)
Quine: Realizing LLM Agents as Native POSIX Processes
by: Ke, Hao
Published: (2026)
Cache-Craft: Managing Chunk-Caches for Efficient Retrieval-Augmented Generation
by: Agarwal, Shubham, et al.
Published: (2025)