:: Library Catalog

Bibliographic Details
Main Authors:	Guerrero, Pablo Robin; Pan, Yueyang; Kashyap, Sanidhya
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2507.08505

Similar Items

Efficient On-Device Agents via Adaptive Context Management
by: Vijayvargiya, Sanidhya, et al.
Published: (2025)

MNN-LLM: A Generic Inference Engine for Fast Large Language Model Deployment on Mobile Devices
by: Wang, Zhaode, et al.
Published: (2025)

Efficient Deployment of Large Language Models on Resource-constrained Devices
by: Yao, Zhiwei, et al.
Published: (2025)

Training Machine Learning Models on Human Spatio-temporal Mobility Data: An Experimental Study [Experiment Paper]
by: Liu, Yueyang, et al.
Published: (2025)

AutoTailor: Automatic and Efficient Adaptive Model Deployment for Diverse Edge Devices
by: Liu, Mengyang, et al.
Published: (2025)

A Study on Inference Latency for Vision Transformers on Mobile Devices
by: Li, Zhuojin, et al.
Published: (2025)

Resource-Efficient Generative AI Model Deployment in Mobile Edge Networks
by: Liang, Yuxin, et al.
Published: (2024)

Understanding Large Language Models in Your Pockets: Performance Study on COTS Mobile Devices
by: Xiao, Jie, et al.
Published: (2024)

MobileLLM-Flash: Latency-Guided On-Device LLM Design for Industry Scale Deployment
by: Huang, Hanxian, et al.
Published: (2026)

MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
by: Liu, Zechun, et al.
Published: (2024)

TinyFormer: Efficient Transformer Design and Deployment on Tiny Devices
by: Yang, Jianlei, et al.
Published: (2023)

Interpretable Discovery of One-parameter Subgroups: A Modular Framework for Elliptical, Hyperbolic, and Parabolic Symmetries
by: Karjol, Pavan, et al.
Published: (2025)

Energy-Efficient Vision Transformer Inference for Edge-AI Deployment
by: Amanzhol, Nursultan, et al.
Published: (2025)

On-Device Vision Training, Deployment, and Inference on a Thumb-Sized Microcontroller
by: Ellis, Jeremy
Published: (2026)

Methodology to Deploy CNN-Based Computer Vision Models on Immersive Wearable Devices
by: Malek, Kaveh, et al.
Published: (2024)

Designing and Deploying AI Models for Sustainable Logistics Optimization: A Case Study on Eco-Efficient Supply Chains in the USA
by: Shawon, Reza E Rabbi, et al.
Published: (2025)

On-Demand Multi-Task Sparsity for Efficient Large-Model Deployment on Edge Devices
by: Huang, Lianming, et al.
Published: (2025)

MobileKernelBench: Can LLMs Write Efficient Kernels for Mobile Devices?
by: Zou, Xingze, et al.
Published: (2026)

FLoRA: Enhancing Vision-Language Models with Parameter-Efficient Federated Learning
by: Nguyen, Duy Phuong, et al.
Published: (2024)

Deploying Large AI Models on Resource-Limited Devices with Split Federated Learning
by: Qiang, Xianke, et al.
Published: (2025)

Bridging Embodiment Gaps: Deploying Vision-Language-Action Models on Soft Robots
by: Su, Haochen, et al.
Published: (2025)

TAP-ViTs: Task-Adaptive Pruning for On-Device Deployment of Vision Transformers
by: Wang, Zhibo, et al.
Published: (2026)

LQ-LoRA: Low-rank Plus Quantized Matrix Decomposition for Efficient Language Model Finetuning
by: Guo, Han, et al.
Published: (2023)

Efficient Exact Resistance Distance Computation on Small-Treewidth Graphs: a Labelling Approach
by: Liao, Meihao, et al.
Published: (2025)

Memory-Efficient Backpropagation for Fine-Tuning LLMs on Resource-Constrained Mobile Devices
by: Song, Congzheng, et al.
Published: (2025)

Deeploy: Enabling Energy-Efficient Deployment of Small Language Models On Heterogeneous Microcontrollers
by: Scherer, Moritz, et al.
Published: (2024)

PSK@EEUCA 2026: Fine-Tuning Large Language Models with Synthetic Data Augmentation for Multi-Class Toxicity Detection in Gaming Chat
by: Pulipaka, Srikar Kashyap
Published: (2026)

MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases
by: Murthy, Rithesh, et al.
Published: (2024)

Recurrent Memory-Augmented Transformers with Chunked Attention for Long-Context Language Modeling
by: Kashyap, Ankit
Published: (2025)

Breaking SafetyCore: Exploring the Risks of On-Device AI Deployment
by: Guyomard, Victor, et al.
Published: (2025)

SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment
by: Song, Yixin, et al.
Published: (2025)

Promote, Suppress, Iterate: How Language Models Answer One-to-Many Factual Queries
by: Yan, Tianyi Lorena, et al.
Published: (2025)

A Cost-Benefit Analysis of On-Premise Large Language Model Deployment: Breaking Even with Commercial LLM Services
by: Pan, Guanzhong, et al.
Published: (2025)

CASCADE: Case-Based Continual Adaptation for Large Language Models During Deployment
by: Guo, Siyuan, et al.
Published: (2026)

DTMM: Deploying TinyML Models on Extremely Weak IoT Devices with Pruning
by: Han, Lixiang, et al.
Published: (2024)

Fed MobiLLM: Efficient Federated LLM Fine-Tuning over Heterogeneous Mobile Devices via Server Assisted Side-Tuning
by: Yang, Xingke, et al.
Published: (2025)

Scaling Up Efficient Small Language Models Serving and Deployment for Semantic Job Search
by: Behdin, Kayhan, et al.
Published: (2025)

AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment
by: Fu, Yonggan, et al.
Published: (2024)

EdgeMoE: Empowering Sparse Large Language Models on Mobile Devices
by: Yi, Rongjie, et al.
Published: (2023)

Mixture of Cache-Conditional Experts for Efficient Mobile Device Inference
by: Skliar, Andrii, et al.
Published: (2024)