:: Library Catalog

Bibliographic Details
Main Authors:	Zheng, Yusheng; Mao, Wenan; Cheng, Shuyi; Feng, Fuqiu; Li, Guangshui; Liao, Zhaoyan; Huang, Yongzhuo; Xiao, Zhenwei; Li, Yuqing; Quinn, Andi; Ma, Tao
Format:	Preprint
Published:	2026
Subjects:	Performance
Online Access:	https://arxiv.org/abs/2603.29235

Similar Items

CXLMemSim: A pure software simulated CXL.mem for performance characterization
by: Yang, Yiwei, et al.
Published: (2023)

SysLLMatic: Large Language Models are Software System Optimizers
by: Peng, Huiyun, et al.
Published: (2025)

Gem5-AcceSys: Enabling System-Level Exploration of Standard Interconnects for Novel Accelerators
by: Liu, Qunyou, et al.
Published: (2025)

AI Load Dynamics--A Power Electronics Perspective
by: Li, Yuzhuo, et al.
Published: (2025)

Green AI: Exploring Carbon Footprints, Mitigation Strategies, and Trade Offs in Large Language Model Training
by: Liu, Vivian, et al.
Published: (2024)

Cloud Computing Energy Consumption Prediction Based on Kernel Extreme Learning Machine Algorithm Improved by Vector Weighted Average Algorithm
by: Wang, Yuqing, et al.
Published: (2025)

Mosaic: Cross-Modal Clustering for Efficient Video Understanding
by: Wang, Tuowei, et al.
Published: (2026)

The Price of Interoperability: Exploring Cross-Chain Bridges and Their Economic Consequences
by: Cao, Yiyue, et al.
Published: (2026)

AI Application Benchmarking: Power-Aware Performance Analysis for Vision and Language Models
by: Mayr, Martin, et al.
Published: (2026)

Impact of Generative AI (Large Language Models) on the PRA model construction and maintenance, observations
by: Rychkov, Valentin, et al.
Published: (2024)

A Latency-Constrained, Gated Recurrent Unit (GRU) Implementation in the Versal AI Engine
by: Sapkas, M., et al.
Published: (2025)

Modeling the Impact of Fiber Latency on Compute-Communication Overlap in Geo-Distributed Multi-Datacenter AI Training
by: Papavasileiou, Ioannis, et al.
Published: (2026)

Hardware optimization on Android for inference of AI models
by: Gherasim, Iulius, et al.
Published: (2025)

The Unseen AI Disruptions for Power Grids: LLM-Induced Transients
by: Li, Yuzhuo, et al.
Published: (2024)

AI Work Quantization Model: Closed-System AI Computational Effort Metric
by: Sharma, Aasish Kumar, et al.
Published: (2025)

Impact of AI-Triage on Radiologist Report Turnaround Time: Real-World Time-Savings and Insights from Model Predictions
by: Thompson, Yee Lam Elim, et al.
Published: (2025)

PixLift: Accelerating Web Browsing via AI Upscaling
by: Atinafu, Yonas, et al.
Published: (2025)

XTC, A Research Platform for Optimizing AI Workload Operators
by: Hugo, Pompougnac, et al.
Published: (2025)

On the Sustainability of AI Inferences in the Edge
by: Sobhani, Ghazal, et al.
Published: (2025)

Optimas: An Intelligent Analytics-Informed Generative AI Framework for Performance Optimization
by: Zaeed, Mohammad, et al.
Published: (2026)

Understanding and Benchmarking Artificial Intelligence: OpenAI's o3 Is Not AGI
by: Pfister, Rolf, et al.
Published: (2025)

EARL: Energy-Aware Optimization of Liquid State Machines for Pervasive AI
by: Iqbal, Zain, et al.
Published: (2026)

SPEC CPU2026: Characterization, Representativeness, and Cross-Suite Comparison
by: Li, Ruihao, et al.
Published: (2026)

DeepContext: A Context-aware, Cross-platform, and Cross-framework Tool for Performance Profiling and Analysis of Deep Learning Workloads
by: Zhao, Qidong, et al.
Published: (2024)

Photonic Fabric Platform for AI Accelerators
by: Ding, Jing, et al.
Published: (2025)

Learning, Potential, and Retention: An Approach for Evaluating Adaptive AI-Enabled Medical Devices
by: Burgon, Alexis, et al.
Published: (2026)

Hardware-efficient tractable probabilistic inference for TinyML Neurosymbolic AI applications
by: Leslin, Jelin, et al.
Published: (2025)

AgentSight: System-Level Observability for AI Agents Using eBPF
by: Zheng, Yusheng, et al.
Published: (2025)

Profiling Apple Silicon Performance for ML Training
by: Feng, Dahua, et al.
Published: (2025)

Scaler: Efficient and Effective Cross Flow Analysis
by: Steven, et al.
Published: (2024)

Leveraging AI for Productive and Trustworthy HPC Software: Challenges and Research Directions
by: Teranishi, Keita, et al.
Published: (2025)

GROOT: General-Purpose Automatic Parameter Tuning Across Layers, Domains, and Use Cases
by: Krahn, Robert, et al.
Published: (2025)

On Cross-Layer Interactions of QUIC, Encrypted DNS and HTTP/3: Design, Evaluation and Dataset
by: Sengupta, Jayasree, et al.
Published: (2023)

Accuracy and Consumption analysis from a compressed model by CompactifAI from Multiverse Computing
by: Fovet, Damien, et al.
Published: (2025)

Personalized Model-Based Design of Human Centric AI enabled CPS for Long term usage
by: Ngabonziza, Bernard, et al.
Published: (2026)

Looking Forward: Challenges and Opportunities in Agentic AI Reliability
by: Xing, Liudong, et al.
Published: (2025)

Information Retrieval in the Age of Generative AI: The RGB Model
by: Garetto, Michele, et al.
Published: (2025)

A Continuous Benchmarking Infrastructure for High-Performance Computing Applications
by: Alt, Christoph, et al.
Published: (2024)

Iterative Layer Pruning for Efficient Translation Inference
by: Moslem, Yasmin, et al.
Published: (2025)

SparseInfer: Training-free Prediction of Activation Sparsity for Fast LLM Inference
by: Shin, Jiho, et al.
Published: (2024)