:: Library Catalog

Bibliographic Details
Main Authors:	Marinoni, Andrea; Cambria, Erik; Lin, Weisi; Mura, Mauro Dalla; Chanussot, Jocelyn; Ragusa, Edoardo; Tso, Chi Yan; Zhu, Yihao; Horton, Benjamin
Format:	Preprint
Published:	2026
Subjects:	Computers and Society; Artificial Intelligence; Hardware Architecture
Online Access:	https://arxiv.org/abs/2603.20897

Similar Items

Improving AI Efficiency in Data Centres by Power Dynamic Response
by: Marinoni, Andrea, et al.
Published: (2025)

Unlimited Vector Processing for Wireless Baseband Based on RISC-V Extension
by: Jiang, Limin, et al.
Published: (2025)

Scalable data concentrator with baseline interconnection network for triggerless data acquisition systems
by: Zabołotny, Wojciech M.
Published: (2023)

A High-Throughput FPGA Accelerator for Lightweight CNNs With Balanced Dataflow
by: Zhao, Zhiyuan, et al.
Published: (2024)

Scalable and Efficient Intra- and Inter-node Interconnection Networks for Post-Exascale Supercomputers and Data centers
by: Tarraga-Moreno, Joaquin, et al.
Published: (2025)

Predictive Software Scheduling as an Early-Warning Hint Layer for Optical Engine Thermal Drift in Heterogeneous SoIC Packaging
by: Chung, Chi Fei
Published: (2026)

RoboGPU: Accelerating GPU Collision Detection for Robotics
by: Liu, Lufei, et al.
Published: (2026)

A modular architecture for IMU-based data gloves
by: Carfì, Alessandro, et al.
Published: (2024)

Systolic Sparse Tensor Slices: FPGA Building Blocks for Sparse and Dense AI Acceleration
by: Taka, Endri, et al.
Published: (2025)

Coliseum project: Correlating climate change data with the behavior of heritage materials
by: Cormier, A, et al.
Published: (2025)

Characterizing the impact of last-level cache replacement policies on big-data workloads
by: Jamet, Alexandre Valentin, et al.
Published: (2023)

FlatAttention: Dataflow and Fabric Collectives Co-Optimization for Large Attention-Based Model Inference on Tile-Based Accelerators
by: Zhang, Chi, et al.
Published: (2026)

Circuits and Systems for Embodied AI: Exploring uJ Multi-Modal Perception for Nano-UAVs on the Kraken Shield
by: Potocnik, Viviane, et al.
Published: (2024)

UpANNS: Enhancing Billion-Scale ANNS Efficiency with Real-World PIM Architecture
by: Chen, Sitian, et al.
Published: (2024)

A comprehensive study on ILP acceleration accounting for sparsity, area, energy, data movement using near-memory architecture
by: Raman, Siddhartha Raman Sundara, et al.
Published: (2026)

Co-Designing Graph-based Approximate Nearest Neighbor Search at Billion Scale for Processing-in-Memory
by: Chen, Sitian, et al.
Published: (2026)

Balancing FP8 Computation Accuracy and Efficiency on Digital CIM via Shift-Aware On-the-fly Aligned-Mantissa Bitwidth Prediction
by: Zhao, Liang, et al.
Published: (2026)

Scaling up Reversible Logic with HKI Superconducting Inductors
by: DeBenedictis, Erik P.
Published: (2025)

CODO: An Automated Compiler for Comprehensive Dataflow Optimization
by: Zhang, Weichuang, et al.
Published: (2026)

FlatAttention: Dataflow and Fabric Collectives Co-Optimization for Efficient Multi-Head Attention on Tile-Based Many-PE Accelerators
by: Zhang, Chi, et al.
Published: (2025)

Zoozve: A Strip-Mining-Free RISC-V Vector Extension with Arbitrary Register Grouping Compilation Support (WIP)
by: Xu, Siyi, et al.
Published: (2025)

A Memory-Efficient Retrieval Architecture for RAG-Enabled Wearable Medical LLMs-Agents
by: Liao, Zhipeng, et al.
Published: (2025)

Siracusa: A 16 nm Heterogenous RISC-V SoC for Extended Reality with At-MRAM Neural Engine
by: Prasad, Arpan Suravi, et al.
Published: (2023)

Using GUI Agent for Electronic Design Automation
by: Li, Chunyi, et al.
Published: (2025)

HomeLabGym: A real-world testbed for home energy management systems
by: Van Puyvelde, Toon, et al.
Published: (2024)

DIRC-RAG: Accelerating Edge RAG with Robust High-Density and High-Loading-Bandwidth Digital In-ReRAM Computation
by: Shao, Kunming, et al.
Published: (2025)

A Hierarchical Dataflow-Driven Heterogeneous Architecture for Wireless Baseband Processing
by: Jiang, Limin, et al.
Published: (2024)

SynDCIM: A Performance-Aware Digital Computing-in-Memory Compiler with Multi-Spec-Oriented Subcircuit Synthesis
by: Shao, Kunming, et al.
Published: (2024)

CAMformer: Associative Memory is All You Need
by: Molom-Ochir, Tergel, et al.
Published: (2025)

31.1 A 14.08-to-135.69Token/s ReRAM-on-Logic Stacked Outlier-Free Large-Language-Model Accelerator with Block-Clustered Weight-Compression and Adaptive Parallel-Speculative-Decoding
by: Dong, Pingcheng, et al.
Published: (2026)

Efficient Implementation of LinearUCB through Algorithmic Improvements and Vector Computing Acceleration for Embedded Learning Systems
by: Angioli, Marco, et al.
Published: (2025)

Improving Injection-Throttling Mechanisms for Congestion Control for Data-center and Supercomputer Interconnects
by: Olmedilla, Cristina, et al.
Published: (2025)

Low-latency D-MIMO Localization using Distributed Scalable Message-Passing Algorithm
by: Iancu, Dumitra, et al.
Published: (2025)

Effective and Memory-Efficient Alternatives to ECC for Reliable Large-Scale DNNs
by: Ahmadilivani, Mohammad Hasan, et al.
Published: (2026)

PDA-LSTM: Knowledge-driven page data arrangement based on LSTM for LCM supression in QLC 3D NAND flash memories
by: Li, Qianhui, et al.
Published: (2025)

A Flexible Precision Scaling Deep Neural Network Accelerator with Efficient Weight Combination
by: Zhao, Liang, et al.
Published: (2025)

NL-DPE: An Analog In-memory Non-Linear Dot Product Engine for Efficient CNN and LLM Inference
by: Zhao, Lei, et al.
Published: (2025)

Flexible In-NAND Cryptographic Processing for Secure Flash Storage
by: Noh, Seock-Hwan, et al.
Published: (2025)

RAS: A Bit-Exact rANS Accelerator For High-Performance Neural Lossless Compression
by: Qin, Yuchao, et al.
Published: (2025)

A Multicast-Capable AXI Crossbar for Many-core Machine Learning Accelerators
by: Colagrande, Luca, et al.
Published: (2025)