:: Library Catalog

Bibliographic Details
Main Author:	Goyal, Harshit
Format:	Preprint
Published:	2025
Subjects:	Distributed, Parallel, and Cluster Computing; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2510.05127

Similar Items

Automated Planning for Optimal Data Pipeline Instantiation
by: Amado, Leonardo Rosa, et al.
Published: (2025)

The AI_INFN Platform: Artificial Intelligence Development in the Cloud
by: Anderlini, Lucio, et al.
Published: (2025)

Envisioning National Resources for Artificial Intelligence Research: NSF Workshop Report
by: Jha, Shantenu, et al.
Published: (2024)

A Blockchain and Artificial Intelligence based System for Halal Food Traceability
by: Alourani, Abdulla, et al.
Published: (2024)

What Artificial Intelligence can do for High-Performance Computing systems?
by: Pochelu, Pierrick, et al.
Published: (2026)

Token-Budget-Aware Pool Routing for Cost-Efficient LLM Inference
by: Chen, Huamin, et al.
Published: (2026)

Isambard-AI: a leadership class supercomputer optimised specifically for Artificial Intelligence
by: McIntosh-Smith, Simon, et al.
Published: (2024)

AdaPtis: Reducing Pipeline Bubbles with Adaptive Pipeline Parallelism on Heterogeneous Models
by: Guo, Jihu, et al.
Published: (2025)

AI4EOSC: a Federated Cloud Platform for Artificial Intelligence in Scientific Research
by: Heredia, Ignacio, et al.
Published: (2025)

Intelligent Autonomous Orchestration for Distributed Cloud Resources using Complex-Stability Analysis
by: Shyam, Gopal Krishna, et al.
Published: (2026)

Blockchain and Artificial Intelligence: Synergies and Conflicts
by: Witt, Leon, et al.
Published: (2024)

ModServe: Modality- and Stage-Aware Resource Disaggregation for Scalable Multimodal Model Serving
by: Qiu, Haoran, et al.
Published: (2025)

FreeRide: Harvesting Bubbles in Pipeline Parallelism
by: Zhang, Jiashu, et al.
Published: (2024)

Electricity Cost Minimization for Multi-Workflow Allocation in Geo-Distributed Data Centers
by: Wang, Shuang, et al.
Published: (2025)

B-PASTE: Beam-Aware Pattern-Guided Speculative Execution for Resource-Constrained LLM Agents
by: Song, Yanfei
Published: (2026)

Scalable Cloud-Native Architectures for Intelligent PMU Data Processing
by: Chockalingam, Nachiappan, et al.
Published: (2025)

Scepsy: Serving Agentic Workflows Using Aggregate LLM Pipelines
by: Wagenländer, Marcel, et al.
Published: (2026)

TimelyFreeze: Adaptive Parameter Freezing Mechanism for Pipeline Parallelism
by: Cho, Seonghye, et al.
Published: (2026)

DIP: Efficient Large Multimodal Model Training with Dynamic Interleaved Pipeline
by: Xue, Zhenliang, et al.
Published: (2025)

Understand and Accelerate Memory Processing Pipeline for Large Language Model Inference
by: He, Zifan, et al.
Published: (2026)

CurvFed: Curvature-Aligned Federated Learning for Fairness without Demographics
by: Sharma, Harshit, et al.
Published: (2024)

Towards Carbon-Aware Container Orchestration: Predicting Workload Energy Consumption with Federated Learning
by: Saad, Zainab, et al.
Published: (2025)

FlowSpec: Continuous Pipelined Speculative Decoding for Efficient Distributed LLM Inference
by: Liu, Xing, et al.
Published: (2025)

WORKSWORLD: A Domain for Integrated Numeric Planning and Scheduling of Distributed Pipelined Workflows
by: Paul, Taylor, et al.
Published: (2026)

Building a Correct-by-Design Lakehouse. Data Contracts, Versioning, and Transactional Pipelines for Humans and Agents
by: Sheng, Weiming, et al.
Published: (2026)

InfiniPipe: Elastic Pipeline Parallelism for Efficient Variable-Length Long-Context LLM Training
by: Wang, Shiju, et al.
Published: (2025)

Research on the Application of Spark Streaming Real-Time Data Analysis System and large language model Intelligent Agents
by: Wang, Jialin, et al.
Published: (2024)

Scalable Artificial Intelligence for Science: Perspectives, Methods and Exemplars
by: Brewer, Wesley, et al.
Published: (2024)

A Review on Building Blocks of Decentralized Artificial Intelligence
by: Kersic, Vid, et al.
Published: (2024)

Joint Resource Optimization, Computation Offloading and Resource Slicing for Multi-Edge Traffic-Cognitive Networks
by: Xiaoyang, Ting, et al.
Published: (2024)

Piper: Efficient Large-Scale MoE Training via Resource Modeling and Pipelined Hybrid Parallelism
by: Dash, Sajal, et al.
Published: (2026)

ENOVA: Autoscaling towards Cost-effective and Stable Serverless LLM Serving
by: Huang, Tao, et al.
Published: (2024)

Towards Resource-Efficient Compound AI Systems
by: Chaudhry, Gohar Irfan, et al.
Published: (2025)

Joint Model Assignment and Resource Allocation for Cost-Effective Mobile Generative Services
by: Gao, Shuangwei, et al.
Published: (2024)

Remoe: Towards Efficient and Low-Cost MoE Inference in Serverless Computing
by: Liu, Wentao, et al.
Published: (2025)

AIBrix: Towards Scalable, Cost-Effective Large Language Model Inference Infrastructure
by: The AIBrix Team, et al.
Published: (2025)

TawPipe: Topology-Aware Weight Pipeline Parallelism for Accelerating Long-Context Large Models Training
by: Wu, Houming, et al.
Published: (2025)

Analytically-Driven Resource Management for Cloud-Native Microservices
by: Zhang, Yanqi, et al.
Published: (2024)

Dynamic Scheduling Strategies for Resource Optimization in Computing Environments
by: Wang, Xiaoye
Published: (2024)

Quantifying Energy and Cost Benefits of Hybrid Edge Cloud: Analysis of Traditional and Agentic Workloads
by: Alamouti, Siavash
Published: (2025)