Staff View: Mobile Edge Intelligence for Large Language Models: A Contemporary Survey :: Library Catalog

Bibliographic Details
Main Authors:	Qu, Guanqiao; Chen, Qiyuan; Wei, Wei; Lin, Zheng; Chen, Xianhao; Huang, Kaibin
Format:	Preprint
Published:	2024
Subjects:	Networking and Internet Architecture; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2407.18921

_version_	1866913746405818368
author	Qu, Guanqiao; Chen, Qiyuan; Wei, Wei; Lin, Zheng; Chen, Xianhao; Huang, Kaibin
author_facet	Qu, Guanqiao; Chen, Qiyuan; Wei, Wei; Lin, Zheng; Chen, Xianhao; Huang, Kaibin
contents	On-device large language models (LLMs), referring to running LLMs on edge devices, have raised considerable interest since they are more cost-effective, latency-efficient, and privacy-preserving compared with the cloud paradigm. Nonetheless, the performance of on-device LLMs is intrinsically constrained by resource limitations on edge devices. Sitting between cloud and on-device AI, mobile edge intelligence (MEI) presents a viable solution by provisioning AI capabilities at the edge of mobile networks, enabling end users to offload heavy AI computation to capable edge servers nearby. This article provides a contemporary survey on harnessing MEI for LLMs. We begin by illustrating several killer applications to demonstrate the urgent need for deploying LLMs at the network edge. Next, we present the preliminaries of LLMs and MEI, followed by resource-efficient LLM techniques. We then present an architectural overview of MEI for LLMs (MEI4LLM), outlining its core components and how it supports the deployment of LLMs. Subsequently, we delve into various aspects of MEI4LLM, extensively covering edge LLM caching and delivery, edge LLM training, and edge LLM inference. Finally, we identify future research opportunities. We hope this article inspires researchers in the field to leverage mobile edge computing to facilitate LLM deployment, thereby unleashing the potential of LLMs across various privacy- and delay-sensitive applications.
format	Preprint
id	arxiv_https___arxiv_org_abs_2407_18921
institution	arXiv
publishDate	2024
record_format	arxiv
title	Mobile Edge Intelligence for Large Language Models: A Contemporary Survey
topic	Networking and Internet Architecture; Artificial Intelligence; Machine Learning
url	https://arxiv.org/abs/2407.18921
