| Main Authors: | Jia, Fucheng, Jiang, Shiqi, Cao, Ting, Cui, Wei, Xia, Tianrui, Cao, Xu, Li, Yuanchun, Zhang, Deyu, Ren, Ju, Liu, Yunxin, Qiu, Lili, Yang, Mao |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2309.08978 |
Similar Items
Scaling Up On-Device LLMs via Active-Weight Swapping Between DRAM and Flash
by: Jia, Fucheng, et al.
Published: (2025)
Anatomizing Deep Learning Inference in Web Browsers
by: Wang, Qipeng, et al.
Published: (2024)
A First Look At Efficient And Secure On-Device LLM Inference Against KV Leakage
by: Yang, Huan, et al.
Published: (2024)
Vec-LUT: Vector Table Lookup for Parallel Ultra-Low-Bit LLM Inference on Edge Devices
by: Li, Xiangyu, et al.
Published: (2025)
ProRe: A Proactive Reward System for GUI Agents via Reasoner-Actor Collaboration
by: Dai, Gaole, et al.
Published: (2025)
Advancing Mobile GUI Agents: A Verifier-Driven Approach to Practical Deployment
by: Dai, Gaole, et al.
Published: (2025)
AgentProg: Empowering Long-Horizon GUI Agents with Program-Guided Context Management
by: Tian, Shizuo, et al.
Published: (2025)
Sculptor: Empowering LLMs with Cognitive Agency via Active Context Management
by: Li, Mo, et al.
Published: (2025)
AdaNav: Adaptive Reasoning with Uncertainty for Vision-Language Navigation
by: Ding, Xin, et al.
Published: (2025)
Bitnet.cpp: Efficient Edge Inference for Ternary LLMs
by: Wang, Jinheng, et al.
Published: (2025)
Matryoshka: Optimization of Dynamic Diverse Quantum Chemistry Systems via Elastic Parallelism Transformation
by: Wang, Tuowei, et al.
Published: (2024)
EdgeFlex-Transformer: Transformer Inference for Edge Devices
by: Mohammad, Shoaib, et al.
Published: (2025)
EdgeShard: Efficient LLM Inference via Collaborative Edge Computing
by: Zhang, Mingjin, et al.
Published: (2024)
Babel: A Scalable Pre-trained Model for Multi-Modal Sensing via Expandable Modality Alignment
by: Dai, Shenghong, et al.
Published: (2024)
Threshold Neuron: A Brain-inspired Artificial Neuron for Efficient On-device Inference
by: Zheng, Zihao, et al.
Published: (2024)
KVShare: An LLM Service System with Efficient and Effective Multi-Tenant KV Cache Reuse
by: Yang, Huan, et al.
Published: (2025)
Region-based Content Enhancement for Efficient Video Analytics at the Edge
by: Wang, Weijun, et al.
Published: (2024)
OxyGen: Unified KV Cache Management for VLA Inference under Multi-Task Parallelism
by: Li, Xiangyu, et al.
Published: (2026)
SpinFlow: A Physics-Informed Spin Field Framework for Traffic Phase Inference and Transition Detection
by: Deng, Haopeng, et al.
Published: (2026)
EdgeOAR: Real-time Online Action Recognition On Edge Devices
by: Luo, Wei, et al.
Published: (2024)
Browser Fingerprint Detection and Anti-Tracking
by: Lin, Kaitong, et al.
Published: (2025)
Image is All You Need to Empower Large-scale Diffusion Models for In-Domain Generation
by: Cao, Pu, et al.
Published: (2023)
AVA: Towards Agentic Video Analytics with Vision Language Models
by: Yan, Yuxuan, et al.
Published: (2025)
ResPanDiff: Diffusion Model for Pansharpening by Inferring Residual Inference
by: Cao, Shiqi, et al.
Published: (2025)
SwapNet: Efficient Swapping for DNN Inference on Edge AI Devices Beyond the Memory Budget
by: Wang, Kun, et al.
Published: (2024)
Chain-of-Restoration: Multi-Task Image Restoration Models are Zero-Shot Step-by-Step Universal Image Restorers
by: Cao, Jin, et al.
Published: (2024)
Edge Graph Intelligence: Reciprocally Empowering Edge Networks with Graph Intelligence
by: Zeng, Liekang, et al.
Published: (2024)
Vision Enhancing LLMs: Empowering Multimodal Knowledge Storage and Sharing in LLMs
by: Li, Yunxin, et al.
Published: (2023)
Privacy-Aware Multi-Device Cooperative Edge Inference with Distributed Resource Bidding
by: Zhuang, Wenhao, et al.
Published: (2024)
Enhancing Ultra-Low-Bit Quantization of Large Language Models Through Saliency-Aware Partial Retraining
by: Cao, Deyu, et al.
Published: (2025)
GenAI at the Edge: Comprehensive Survey on Empowering Edge Devices
by: Navardi, Mozhgan, et al.
Published: (2025)
Empowering Edge Intelligence: A Comprehensive Survey on On-Device AI Models
by: Wang, Xubin, et al.
Published: (2025)
GRIP-VLM: Group-Relative Importance Pruning for Efficient Vision-Language Models
by: Huang, Mingzhe, et al.
Published: (2026)
Infer-EDGE: Dynamic DNN Inference Optimization in 'Just-in-time' Edge-AI Implementations
by: Mounesan, Motahare, et al.
Published: (2025)
Recall: Empowering Multimodal Embedding for Edge Devices
by: Cai, Dongqi, et al.
Published: (2024)
BudgetThinker: Empowering Budget-aware LLM Reasoning with Control Tokens
by: Wen, Hao, et al.
Published: (2025)
PracMHBench: Re-evaluating Model-Heterogeneous Federated Learning Based on Practical Edge Device Constraints
by: Guo, Yuanchun, et al.
Published: (2025)
WebLLM: A High-Performance In-Browser LLM Inference Engine
by: Ruan, Charlie F., et al.
Published: (2024)
Em-Garde: A Propose-Match Framework for Proactive Streaming Video Understanding
by: Zheng, Yikai, et al.
Published: (2026)
Making Every Frame Matter: Continuous Activity Recognition in Streaming Video via Adaptive Video Context Modeling
by: Wu, Hao, et al.
Published: (2024)