Library Catalog

Bibliographic Details
Main Authors:	Ngabonziza, Bernard; Banerjee, Ayan; Gupta, Sandeep K. S.
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence Performance
Online Access:	https://arxiv.org/abs/2601.04545

Similar Items

Detection of Deployment Operational Deviations for Safety and Security of AI-Enabled Human-Centric Cyber Physical Systems
by: Ngabonziza, Bernard, et al.
Published: (2026)

CPS-LLM: Large Language Model based Safe Usage Plan Generator for Human-in-the-Loop Human-in-the-Plant Cyber-Physical System
by: Banerjee, Ayan, et al.
Published: (2024)

Model Recovery at the Edge under Resource Constraints for Physical AI
by: Xu, Bin, et al.
Published: (2025)

Recovering implicit physics model under real-world constraints
by: Banerjee, Ayan, et al.
Published: (2024)

FALCON: Feedback-driven Adaptive Long/short-term memory reinforced Coding Optimization system
by: Li, Zeyuan, et al.
Published: (2024)

Learning, Potential, and Retention: An Approach for Evaluating Adaptive AI-Enabled Medical Devices
by: Burgon, Alexis, et al.
Published: (2026)

Enabling Physical AI at the Edge: Hardware-Accelerated Recovery of System Dynamics
by: Xu, Bin, et al.
Published: (2025)

ShadowNPU: System and Algorithm Co-design for NPU-Centric On-Device LLM Inference
by: Yin, Wangsong, et al.
Published: (2025)

NEURO-GUARD: Neuro-Symbolic Generalization and Unbiased Adaptive Routing for Diagnostics -- Explainable Medical AI
by: Urooj, Midhat, et al.
Published: (2025)

PixLift: Accelerating Web Browsing via AI Upscaling
by: Atinafu, Yonas, et al.
Published: (2025)

XTC, A Research Platform for Optimizing AI Workload Operators
by: Pompougnac, Hugo, et al.
Published: (2025)

Understanding and Benchmarking Artificial Intelligence: OpenAI's o3 Is Not AGI
by: Pfister, Rolf, et al.
Published: (2025)

Information Retrieval in the Age of Generative AI: The RGB Model
by: Garetto, Michele, et al.
Published: (2025)

GPT-OSS-20B: A Comprehensive Deployment-Centric Analysis of OpenAI's Open-Weight Mixture of Experts Model
by: Kumar, Deepak, et al.
Published: (2025)

Impact of Data-Oriented and Object-Oriented Design on Performance and Cache Utilization with Artificial Intelligence Algorithms in Multi-Threaded CPUs
by: Arantes, Gabriel M., et al.
Published: (2025)

Improving LLM Performance Through Black-Box Online Tuning: A Case for Adding System Specs to Factsheets for Trusted AI
by: Atinafu, Yonas, et al.
Published: (2026)

When AI Bends Metal: AI-Assisted Optimization of Design Parameters in Sheet Metal Forming
by: Tarraf, Ahmad, et al.
Published: (2025)

AdaGradSelect: An adaptive gradient-guided layer selection method for efficient fine-tuning of SLMs
by: Kumar, Anshul, et al.
Published: (2025)

Reliability by design: quantifying and eliminating fabrication risk in LLMs. From generative to consultative AI: a comparative analysis in the legal domain and lessons for high-stakes knowledge bases
by: Dantart, Alex
Published: (2026)

Knowledge Grafting: A Mechanism for Optimizing AI Model Deployment in Resource-Constrained Environments
by: Almurshed, Osama, et al.
Published: (2025)

Accelerated Digital Twin Learning for Edge AI: A Comparison of FPGA and Mobile GPU
by: Xu, Bin, et al.
Published: (2025)

ALISE: Accelerating Large Language Model Serving with Speculative Scheduling
by: Zhao, Youpeng, et al.
Published: (2024)

SweetSpot: An Analytical Model for Predicting Energy Efficiency of LLM Inference
by: Cavagna, Hiari Pizzini, et al.
Published: (2026)

On the Sustainability of AI Inferences in the Edge
by: Sobhani, Ghazal, et al.
Published: (2025)

Do AI Models Dream of Faster Code? An Empirical Study on LLM-Proposed Performance Improvements in Real-World Software
by: Yi, Lirong, et al.
Published: (2025)

Edge Deployment of Small Language Models, a comprehensive comparison of CPU, GPU and NPU backends
by: Prieto, Pablo, et al.
Published: (2025)

An Efficient Hybrid Sparse Attention with CPU-GPU Parallelism for Long-Context Inference
by: Yao, Feiyu, et al.
Published: (2026)

This Is Taking Too Long -- Investigating Time as a Proxy for Energy Consumption of LLMs
by: Krupp, Lars, et al.
Published: (2026)

Photonic Fabric Platform for AI Accelerators
by: Ding, Jing, et al.
Published: (2025)

Should AI Optimize Your Code? A Comparative Study of Classical Optimizing Compilers Versus Current Large Language Models
by: Rosas, Miguel Romero, et al.
Published: (2024)

LLMs for Analog Circuit Design Continuum (ACDC)
by: Esfandiari, Yasaman, et al.
Published: (2025)

Knowledge Distillation for Reservoir-based Classifier: Human Activity Recognition
by: Kagiyama, Masaharu, et al.
Published: (2025)

Characterizing VLA Models: Identifying the Action Generation Bottleneck for Edge AI Architectures
by: Vishwanathan, Manoj, et al.
Published: (2026)

Time-Efficient Hybrid Hyperparameter Tuning Approach for Cardiovascular Disease Classification
by: Pathak, Abhay Kumar, et al.
Published: (2024)

XAI-MeD: Explainable Knowledge Guided Neuro-Symbolic Framework for Domain Generalization and Rare Class Detection in Medical Imaging
by: Urooj, Midhat, et al.
Published: (2026)

Looking Forward: Challenges and Opportunities in Agentic AI Reliability
by: Xing, Liudong, et al.
Published: (2025)

LLM-Driven Design Space Exploration of FPGA-based Accelerators
by: Sharma, Vinamra, et al.
Published: (2026)

Revolutionizing System Reliability: The Role of AI in Predictive Maintenance Strategies
by: Bidollahkhani, Michael, et al.
Published: (2024)

The Race to Efficiency: A New Perspective on AI Scaling Laws
by: Lu, Chien-Ping
Published: (2025)

Pushing the Envelope of LLM Inference on AI-PC and Intel GPUs
by: Georganas, Evangelos, et al.
Published: (2025)