Bibliographic Details
Main Authors: Gao, Shuangwei; Yang, Peng; Kong, Yuxin; Lyu, Feng; Zhang, Ning
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2409.09072
author Gao, Shuangwei
Yang, Peng
Kong, Yuxin
Lyu, Feng
Zhang, Ning
contents Artificial Intelligence Generated Content (AIGC) services can efficiently satisfy user-specified content creation demands, but their high computational requirements make it challenging to support mobile users at scale. In this paper, we present the design of an edge-enabled AIGC service provisioning system that assigns the computing tasks of generative models to edge servers, thereby improving overall user experience and reducing content generation latency. Specifically, once the edge server receives user-requested task prompts, it dynamically assigns appropriate models and allocates computing resources based on the features of each category of prompts. The generated content is then delivered to users. The key to this system is a proposed probabilistic model assignment approach, which estimates the quality score of the generated content for each prompt based on category labels. Next, we introduce a heuristic algorithm that adaptively configures both the generation steps and the resource allocation according to the task requests received by each generative model on the edge. Simulation results demonstrate that the designed system can enhance the quality of generated content by up to 4.7% while reducing response delay by up to 39.1% compared to benchmarks.
format Preprint
id arxiv_https___arxiv_org_abs_2409_09072
institution arXiv
publishDate 2024
record_format arxiv
title Joint Model Assignment and Resource Allocation for Cost-Effective Mobile Generative Services
topic Distributed, Parallel, and Cluster Computing
Artificial Intelligence
Machine Learning
url https://arxiv.org/abs/2409.09072