Staff View: Rethinking Inter-Process Communication with Memory Operation Offloading :: Library Catalog

Bibliographic Details
Main Authors:	Park, Misun; Dubey, Richi; Yuan, Yifan; Kim, Nam Sung; Gavrilovska, Ada
Format:	Preprint
Published:	2026
Subjects:	Operating Systems; Distributed, Parallel, and Cluster Computing
Online Access:	https://arxiv.org/abs/2601.06331

_version_	1866915719836336128
author	Park, Misun; Dubey, Richi; Yuan, Yifan; Kim, Nam Sung; Gavrilovska, Ada
author_facet	Park, Misun; Dubey, Richi; Yuan, Yifan; Kim, Nam Sung; Gavrilovska, Ada
contents	As multimodal and AI-driven services exchange hundreds of megabytes per request, existing IPC runtimes spend a growing share of CPU cycles on memory copies. Although both hardware and software mechanisms are exploring memory offloading, current IPC stacks lack a unified runtime model to coordinate them effectively. This paper presents a unified IPC runtime suite that integrates both hardware- and software-based memory offloading into shared-memory communication. The system characterizes the interaction between offload strategies and IPC execution, including synchronization, cache visibility, and concurrency, and introduces multiple IPC modes that balance throughput, latency, and CPU efficiency. Through asynchronous pipelining, selective cache injection, and hybrid coordination, the system turns offloading from a device-specific feature into a general system capability. Evaluations on real-world workloads show instruction count reductions of up to 22%, throughput improvements of up to 2.1x, and latency reductions of up to 72%, demonstrating that coordinated IPC offloading can deliver tangible end-to-end efficiency gains in modern data-intensive systems.
format	Preprint
id	arxiv_https___arxiv_org_abs_2601_06331
institution	arXiv
publishDate	2026
record_format	arxiv
title	Rethinking Inter-Process Communication with Memory Operation Offloading
topic	Operating Systems; Distributed, Parallel, and Cluster Computing
url	https://arxiv.org/abs/2601.06331

Similar Items