:: Library Catalog

Bibliographic Details
Main Authors:	Fovet, Damien; Chamoli, Shashank; Oury, Sarah; Singhal, Srishti
Format:	Preprint
Published:	2025
Subjects:	Machine Learning Performance
Online Access:	https://arxiv.org/abs/2507.08836

Similar Items

Cloud Computing Energy Consumption Prediction Based on Kernel Extreme Learning Machine Algorithm Improved by Vector Weighted Average Algorithm
by: Wang, Yuqing, et al.
Published: (2025)

CompactifAI: Extreme Compression of Large Language Models using Quantum-Inspired Tensor Networks
by: Tomut, Andrei, et al.
Published: (2024)

Hardware optimization on Android for inference of AI models
by: Gherasim, Iulius, et al.
Published: (2025)

Reducing Compute Waste in LLMs through Kernel-Level DVFS
by: Spaan, Jeffrey, et al.
Published: (2026)

Conformer-Based Speech Recognition On Extreme Edge-Computing Devices
by: Xu, Mingbin, et al.
Published: (2023)

A Structure-Aware Framework for Learning Device Placements on Computation Graphs
by: Duan, Shukai, et al.
Published: (2024)

Towards Computational Performance Engineering for Unsupervised Concept Drift Detection -- Complexities, Benchmarking, Performance Analysis
by: Werner, Elias, et al.
Published: (2023)

EARL: Energy-Aware Optimization of Liquid State Machines for Pervasive AI
by: Iqbal, Zain, et al.
Published: (2026)

Hardware-efficient tractable probabilistic inference for TinyML Neurosymbolic AI applications
by: Leslin, Jelin, et al.
Published: (2025)

DistZO2: High-Throughput and Memory-Efficient Zeroth-Order Fine-tuning LLMs with Distributed Parallel Computing
by: Wang, Liangyu, et al.
Published: (2025)

Ensuring Reliability of Curated EHR-Derived Data: The Validation of Accuracy for LLM/ML-Extracted Information and Data (VALID) Framework
by: Estevez, Melissa, et al.
Published: (2025)

Throughput Optimization as a Strategic Lever in Large-Scale AI Systems: Evidence from Dataloader and Memory Profiling Innovations
by: Jha, Mayank
Published: (2026)

MicroHD: An Accuracy-Driven Optimization of Hyperdimensional Computing Algorithms for TinyML systems
by: Ponzina, Flavio, et al.
Published: (2024)

Who Wins the Race? (R Vs Python) - An Exploratory Study on Energy Consumption of Machine Learning Algorithms
by: Chattaraj, Rajrupa, et al.
Published: (2025)

IPA: Inference Pipeline Adaptation to Achieve High Accuracy and Cost-Efficiency
by: Ghafouri, Saeid, et al.
Published: (2023)

Towards A Flexible Accuracy-Oriented Deep Learning Module Inference Latency Prediction Framework for Adaptive Optimization Algorithms
by: Shen, Jingran, et al.
Published: (2023)

On the Sustainability of AI Inferences in the Edge
by: Sobhani, Ghazal, et al.
Published: (2025)

PixelBrax: Learning Continuous Control from Pixels End-to-End on the GPU
by: McInroe, Trevor, et al.
Published: (2025)

Efficient Graph Knowledge Distillation from GNNs to Kolmogorov--Arnold Networks via Self-Attention Dynamic Sampling
by: Cui, Can, et al.
Published: (2025)

Optimizing Methane Detection On Board Satellites: Speed, Accuracy, and Low-Power Solutions for Resource-Constrained Hardware
by: Herec, Jonáš, et al.
Published: (2025)

The Race to Efficiency: A New Perspective on AI Scaling Laws
by: Lu, Chien-Ping
Published: (2025)

Pushing the Envelope of LLM Inference on AI-PC and Intel GPUs
by: Georganas, Evangelos, et al.
Published: (2025)

DaCe AD: Unifying High-Performance Automatic Differentiation for Machine Learning and Scientific Computing
by: Boudaoud, Afif, et al.
Published: (2025)

Parallel Implementations Assessment of a Spatial-Spectral Classifier for Hyperspectral Clinical Applications
by: Lazcano, Raquel, et al.
Published: (2024)

A High-Throughput Compute-Efficient POMDP Hide-And-Seek-Engine (HASE) for Multi-Agent Operations
by: Flavin, Timothy, et al.
Published: (2026)

MoEITS: A Green AI approach for simplifying MoE-LLMs
by: Balderas, Luis, et al.
Published: (2026)

Knowledge Grafting: A Mechanism for Optimizing AI Model Deployment in Resource-Constrained Environments
by: Almurshed, Osama, et al.
Published: (2025)

Energy per Successful Goal: Goal-Level Energy Accounting for Agentic AI Systems
by: Panigrahy, Deepak, et al.
Published: (2026)

Leveraging Speculative Sampling and KV-Cache Optimizations Together for Generative AI using OpenVINO
by: Barad, Haim, et al.
Published: (2023)

Design Space Exploration of Approximate Computing Techniques with a Reinforcement Learning Approach
by: Saeedi, Sepide, et al.
Published: (2023)

CPINN-ABPI: Physics-Informed Neural Networks for Accurate Power Estimation in MPSoCs
by: Elshamy, Mohamed R., et al.
Published: (2025)

PARD: Accelerating LLM Inference with Low-Cost PARallel Draft Model Adaptation
by: An, Zihao, et al.
Published: (2025)

Flashlight: PyTorch Compiler Extensions to Accelerate Attention Variants
by: You, Bozhi, et al.
Published: (2025)

Enhancing Tropical Cyclone Path Forecasting with an Improved Transformer Network
by: Van Thanh, Nguyen, et al.
Published: (2025)

lm-Meter: Unveiling Runtime Inference Latency for On-Device Language Models
by: Wang, Haoxin, et al.
Published: (2025)

WCDT: Systematic WCET Optimization for Decision Tree Implementations
by: Hölscher, Nils, et al.
Published: (2025)

MoE-Inference-Bench: Performance Evaluation of Mixture of Expert Large Language and Vision Models
by: Chitty-Venkata, Krishna Teja, et al.
Published: (2025)

MLPerf Automotive
by: Shojaei, Radoyeh, et al.
Published: (2025)

Light Differentiable Logic Gate Networks
by: Rüttgers, Lukas, et al.
Published: (2025)

V-Seek: Accelerating LLM Reasoning on Open-hardware Server-class RISC-V Platforms
by: Rodrigo, Javier J. Poveda, et al.
Published: (2025)