:: Library Catalog

Bibliographic Details
Main Authors:	Xu, Mingbin; Jin, Alex; Wang, Sicheng; Su, Mu; Ng, Tim; Mason, Henry; Han, Shiyi; Lei, Zhihong; Deng, Yaqiao; Huang, Zhen; Krishnamoorthy, Mahesh
Format:	Preprint
Published:	2023
Subjects:	Machine Learning; Performance
Online Access:	https://arxiv.org/abs/2312.10359

Similar Items

Spatiotemporal Analysis of Parallelized Computing at the Extreme Edge
by: Nabil, Yasser, et al.
Published: (2025)

Automating Energy-Efficient GPU Kernel Generation: A Fast Search-Based Compilation Approach
by: Zhang, Yijia, et al.
Published: (2024)

Performance Characterization of Containers in Edge Computing
by: Gupta, Ragini, et al.
Published: (2025)

Spatiotemporal Non-Uniformity-Aware Online Task Scheduling in Collaborative Edge Computing for Industrial Internet of Things
by: Li, Yang, et al.
Published: (2025)

Modeling Tradeoffs between mobility, cost, and performance in Edge Computing
by: Waseem, Muhammad Danish, et al.
Published: (2026)

Latency and Privacy-Aware Resource Allocation in Vehicular Edge Computing
by: Ahmadvand, Hossein, et al.
Published: (2025)

FlexQuant: Elastic Quantization Framework for Locally Hosted LLM on Edge Devices
by: Chai, Yuji, et al.
Published: (2025)

CarbonCP: Carbon-Aware DNN Partitioning with Conformal Prediction for Sustainable Edge Intelligence
by: Ke, Hongyu, et al.
Published: (2024)

Enhancing CTC-based speech recognition with diverse modeling units
by: Han, Shiyi, et al.
Published: (2024)

Resource-Efficient RGB-Only Action Recognition for Edge Deployment
by: Yoon, Dongsik, et al.
Published: (2026)

A Structure-Aware Framework for Learning Device Placements on Computation Graphs
by: Duan, Shukai, et al.
Published: (2024)

Characterizing and Optimizing Realistic Workloads on a Commercial Compute-in-SRAM Device
by: Zhang, Niansong, et al.
Published: (2025)

HPC Application Parameter Autotuning on Edge Devices: A Bandit Learning Approach
by: Hossain, Abrar, et al.
Published: (2025)

CarbonCall: Sustainability-Aware Function Calling for Large Language Models on Edge Devices
by: Paramanayakam, Varatheepan, et al.
Published: (2025)

RooflineBench: A Benchmarking Framework for On-Device LLMs via Roofline Analysis
by: Bi, Zhen, et al.
Published: (2026)

Memory Analysis on the Training Course of DeepSeek Models
by: Zhang, Ping, et al.
Published: (2025)

ISO: Overlap of Computation and Communication within Seqenence For LLM Inference
by: Xiao, Bin, et al.
Published: (2024)

Application Research On Real-Time Perception Of Device Performance Status
by: Wang, Zhe, et al.
Published: (2024)

CoFormer: Collaborating with Heterogeneous Edge Devices for Scalable Transformer Inference
by: Xu, Guanyu, et al.
Published: (2025)

Less is More: Optimizing Function Calling for LLM Execution on Edge Devices
by: Paramanayakam, Varatheepan, et al.
Published: (2024)

Collaborative Processing for Multi-Tenant Inference on Memory-Constrained Edge TPUs
by: Ng, Nathan, et al.
Published: (2026)

Modeling Interfering Sources in Shared Queues for Timely Computations in Edge Computing Systems
by: Akar, Nail, et al.
Published: (2024)

Cloud Computing Energy Consumption Prediction Based on Kernel Extreme Learning Machine Algorithm Improved by Vector Weighted Average Algorithm
by: Wang, Yuqing, et al.
Published: (2025)

DEER: Deep Runahead for Instruction Prefetching on Modern Mobile Workloads
by: Vahdatniya, Parmida, et al.
Published: (2025)

Multi-Dimensional Autoscaling of Stream Processing Services on Edge Devices
by: Sedlak, Boris, et al.
Published: (2025)

MoE-Infinity: Efficient MoE Inference on Personal Machines with Sparsity-Aware Expert Cache
by: Xue, Leyang, et al.
Published: (2024)

Characterize LSM-tree Compaction Performance via On-Device LLM Inference
by: Ding, Jiabiao, et al.
Published: (2026)

AFarePart: Accuracy-aware Fault-resilient Partitioner for DNN Edge Accelerators
by: Debnath, Mukta, et al.
Published: (2025)

Resource Management Schemes for Cloud-Native Platforms with Computing Containers of Docker and Kubernetes
by: Mao, Ying, et al.
Published: (2020)

Computational Complexity-Constrained Spectral Efficiency Analysis for 6G Waveforms
by: Queiroz, Saulo, et al.
Published: (2024)

CXL-Interference: Analysis and Characterization in Modern Computer Systems
by: Mao, Shunyu, et al.
Published: (2024)

XRFlux: Virtual Reality Benchmark for Edge Caching Systems
by: Alfares, Nader, et al.
Published: (2024)

Ecoscape: Fault Tolerance Benchmark for Adaptive Remediation Strategies in Real-Time Edge ML
by: Reiter, Hendrik, et al.
Published: (2025)

Achieving Consistent and Comparable CPU Evaluation
by: Wang, Chenxi, et al.
Published: (2024)

Attributing the System's Overall Effect to its Components
by: Wang, Chenxi, et al.
Published: (2026)

Beyond Accuracy: Unveiling Inefficiency Patterns in Tool-Integrated Reasoning
by: Su, Qisheng, et al.
Published: (2026)

Unikernels vs. Containers: A Runtime-Level Performance Comparison for Resource-Constrained Edge Workloads
by: Dinh-Tuan, Hai
Published: (2025)

Two-Timescale Dynamic Service Deployment and Task Scheduling with Spatiotemporal Collaboration in Mobile Edge Networks
by: Li, Yang, et al.
Published: (2025)

Redundant Array Computation Elimination
by: Wang, Zixuan, et al.
Published: (2025)

Performance of Confidential Computing GPUs
by: Ibarra, Antonio Martínez, et al.
Published: (2025)