| Main Authors: | Marinoni, Andrea, Cambria, Erik, Lin, Weisi, Mura, Mauro Dalla, Chanussot, Jocelyn, Ragusa, Edoardo, Tso, Chi Yan, Zhu, Yihao, Horton, Benjamin |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.20897 |
Similar Items
Improving AI Efficiency in Data Centres by Power Dynamic Response
by: Marinoni, Andrea, et al.
Published: (2025)
Unlimited Vector Processing for Wireless Baseband Based on RISC-V Extension
by: Jiang, Limin, et al.
Published: (2025)
Scalable data concentrator with baseline interconnection network for triggerless data acquisition systems
by: Zabołotny, Wojciech M.
Published: (2023)
A High-Throughput FPGA Accelerator for Lightweight CNNs With Balanced Dataflow
by: Zhao, Zhiyuan, et al.
Published: (2024)
Scalable and Efficient Intra- and Inter-node Interconnection Networks for Post-Exascale Supercomputers and Data centers
by: Tarraga-Moreno, Joaquin, et al.
Published: (2025)
Predictive Software Scheduling as an Early-Warning Hint Layer for Optical Engine Thermal Drift in Heterogeneous SoIC Packaging
by: Chung, Chi Fei
Published: (2026)
RoboGPU: Accelerating GPU Collision Detection for Robotics
by: Liu, Lufei, et al.
Published: (2026)
A modular architecture for IMU-based data gloves
by: Carfì, Alessandro, et al.
Published: (2024)
Systolic Sparse Tensor Slices: FPGA Building Blocks for Sparse and Dense AI Acceleration
by: Taka, Endri, et al.
Published: (2025)
Coliseum project: Correlating climate change data with the behavior of heritage materials
by: Cormier, A, et al.
Published: (2025)
Characterizing the impact of last-level cache replacement policies on big-data workloads
by: Jamet, Alexandre Valentin, et al.
Published: (2023)
FlatAttention: Dataflow and Fabric Collectives Co-Optimization for Large Attention-Based Model Inference on Tile-Based Accelerators
by: Zhang, Chi, et al.
Published: (2026)
Circuits and Systems for Embodied AI: Exploring uJ Multi-Modal Perception for Nano-UAVs on the Kraken Shield
by: Potocnik, Viviane, et al.
Published: (2024)
UpANNS: Enhancing Billion-Scale ANNS Efficiency with Real-World PIM Architecture
by: Chen, Sitian, et al.
Published: (2024)
A comprehensive study on ILP acceleration accounting for sparsity, area, energy, data movement using near-memory architecture
by: Raman, Siddhartha Raman Sundara, et al.
Published: (2026)
Co-Designing Graph-based Approximate Nearest Neighbor Search at Billion Scale for Processing-in-Memory
by: Chen, Sitian, et al.
Published: (2026)
Balancing FP8 Computation Accuracy and Efficiency on Digital CIM via Shift-Aware On-the-fly Aligned-Mantissa Bitwidth Prediction
by: Zhao, Liang, et al.
Published: (2026)
Scaling up Reversible Logic with HKI Superconducting Inductors
by: DeBenedictis, Erik P.
Published: (2025)
CODO: An Automated Compiler for Comprehensive Dataflow Optimization
by: Zhang, Weichuang, et al.
Published: (2026)
FlatAttention: Dataflow and Fabric Collectives Co-Optimization for Efficient Multi-Head Attention on Tile-Based Many-PE Accelerators
by: Zhang, Chi, et al.
Published: (2025)
Zoozve: A Strip-Mining-Free RISC-V Vector Extension with Arbitrary Register Grouping Compilation Support (WIP)
by: Xu, Siyi, et al.
Published: (2025)
A Memory-Efficient Retrieval Architecture for RAG-Enabled Wearable Medical LLMs-Agents
by: Liao, Zhipeng, et al.
Published: (2025)
Siracusa: A 16 nm Heterogenous RISC-V SoC for Extended Reality with At-MRAM Neural Engine
by: Prasad, Arpan Suravi, et al.
Published: (2023)
Using GUI Agent for Electronic Design Automation
by: Li, Chunyi, et al.
Published: (2025)
HomeLabGym: A real-world testbed for home energy management systems
by: Van Puyvelde, Toon, et al.
Published: (2024)
DIRC-RAG: Accelerating Edge RAG with Robust High-Density and High-Loading-Bandwidth Digital In-ReRAM Computation
by: Shao, Kunming, et al.
Published: (2025)
A Hierarchical Dataflow-Driven Heterogeneous Architecture for Wireless Baseband Processing
by: Jiang, Limin, et al.
Published: (2024)
SynDCIM: A Performance-Aware Digital Computing-in-Memory Compiler with Multi-Spec-Oriented Subcircuit Synthesis
by: Shao, Kunming, et al.
Published: (2024)
CAMformer: Associative Memory is All You Need
by: Molom-Ochir, Tergel, et al.
Published: (2025)
31.1 A 14.08-to-135.69Token/s ReRAM-on-Logic Stacked Outlier-Free Large-Language-Model Accelerator with Block-Clustered Weight-Compression and Adaptive Parallel-Speculative-Decoding
by: Dong, Pingcheng, et al.
Published: (2026)
Efficient Implementation of LinearUCB through Algorithmic Improvements and Vector Computing Acceleration for Embedded Learning Systems
by: Angioli, Marco, et al.
Published: (2025)
Improving Injection-Throttling Mechanisms for Congestion Control for Data-center and Supercomputer Interconnects
by: Olmedilla, Cristina, et al.
Published: (2025)
Low-latency D-MIMO Localization using Distributed Scalable Message-Passing Algorithm
by: Iancu, Dumitra, et al.
Published: (2025)
Effective and Memory-Efficient Alternatives to ECC for Reliable Large-Scale DNNs
by: Ahmadilivani, Mohammad Hasan, et al.
Published: (2026)
PDA-LSTM: Knowledge-driven page data arrangement based on LSTM for LCM supression in QLC 3D NAND flash memories
by: Li, Qianhui, et al.
Published: (2025)
A Flexible Precision Scaling Deep Neural Network Accelerator with Efficient Weight Combination
by: Zhao, Liang, et al.
Published: (2025)
NL-DPE: An Analog In-memory Non-Linear Dot Product Engine for Efficient CNN and LLM Inference
by: Zhao, Lei, et al.
Published: (2025)
Flexible In-NAND Cryptographic Processing for Secure Flash Storage
by: Noh, Seock-Hwan, et al.
Published: (2025)
RAS: A Bit-Exact rANS Accelerator For High-Performance Neural Lossless Compression
by: Qin, Yuchao, et al.
Published: (2025)
A Multicast-Capable AXI Crossbar for Many-core Machine Learning Accelerators
by: Colagrande, Luca, et al.
Published: (2025)