| Main Authors: | S, Karthik Somayaji N., Li, Peng |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.02764 |
Similar Items
LLM-VeriPPA: Power, Performance, and Area Optimization aware Verilog Code Generation with Large Language Models
by: Thorat, Kiran, et al.
Published: (2025)
LLM-based AI Agent for Sizing of Analog and Mixed Signal Circuit
by: Liu, Chang, et al.
Published: (2025)
GFormer: Accelerating Large Language Models with Optimized Transformers on Gaudi Processors
by: Zhang, Chengming, et al.
Published: (2024)
LLM4DV: Using Large Language Models for Hardware Test Stimuli Generation
by: Zhang, Zixi, et al.
Published: (2023)
EEsizer: LLM-Based AI Agent for Sizing of Analog and Mixed Signal Circuit
by: Liu, Chang, et al.
Published: (2025)
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
by: Li, Jinhao, et al.
Published: (2024)
PASCAL: A Phase-Aware Scheduling Algorithm for Serving Reasoning-based Large Language Models
by: Cho, Eunyeong, et al.
Published: (2026)
Intelligent4DSE: Optimizing High-Level Synthesis Design Space Exploration with Graph Neural Networks and Large Language Models
by: Xu, Lei, et al.
Published: (2025)
GPT4AIGChip: Towards Next-Generation AI Accelerator Design Automation via Large Language Models
by: Fu, Yonggan, et al.
Published: (2023)
Learning-driven Physically-aware Large-scale Circuit Gate Sizing
by: Ye, Yuyang, et al.
Published: (2024)
AI Accelerators for Large Language Model Inference: Architecture Analysis and Scaling Strategies
by: Sharma, Amit
Published: (2025)
MX+: Pushing the Limits of Microscaling Formats for Efficient Large Language Model Serving
by: Lee, Jungi, et al.
Published: (2025)
SNIP: An Adaptive Mixed Precision Framework for Subbyte Large Language Model Training
by: Pan, Yunjie, et al.
Published: (2026)
Accelerating Neural Networks for Large Language Models and Graph Processing with Silicon Photonics
by: Afifi, Salma, et al.
Published: (2024)
Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization
by: Lee, Jungi, et al.
Published: (2024)
CHIME: Chiplet-based Heterogeneous Near-Memory Acceleration for Edge Multimodal LLM Inference
by: Chen, Yanru, et al.
Published: (2025)
Memory Access Characterization of Large Language Models in CPU Environment and its Potential Impacts
by: Banasik, Spencer
Published: (2025)
Pimba: A Processing-in-Memory Acceleration for Post-Transformer Large Language Model Serving
by: Kim, Wonung, et al.
Published: (2025)
A Hardware-Aware, Per-Layer Methodology for Post-Training Quantization of Large Language Models
by: Killian, Earl
Published: (2026)
AnaFlow: Agentic LLM-based Workflow for Reasoning-Driven Explainable and Sample-Efficient Analog Circuit Sizing
by: Ahmadzadeh, Mohsen, et al.
Published: (2025)
Memory Is All You Need: An Overview of Compute-in-Memory Architectures for Accelerating Large Language Model Inference
by: Wolters, Christopher, et al.
Published: (2024)
Duplex: A Device for Large Language Models with Mixture of Experts, Grouped Query Attention, and Continuous Batching
by: Yun, Sungmin, et al.
Published: (2024)
Hybrid JIT-CUDA Graph Optimization for Low-Latency Large Language Model Inference
by: Yadav, Divakar Kumar, et al.
Published: (2026)
Energy Efficient Software Hardware CoDesign for Machine Learning: From TinyML to Large Language Models
by: Vahdatpour, Mohammad Saleh, et al.
Published: (2026)
Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System
by: Jang, Hongsun, et al.
Published: (2024)
Comprehensive Verilog Design Problems: A Next-Generation Benchmark Dataset for Evaluating Large Language Models and Agents on RTL Design and Verification
by: Pinckney, Nathaniel, et al.
Published: (2025)
An FPGA-Based Accelerator Enabling Efficient Support for CNNs with Arbitrary Kernel Sizes
by: Wang, Miaoxin, et al.
Published: (2024)
Zero-Space Cost Fault Tolerance for Transformer-based Language Models on ReRAM
by: Li, Bingbing, et al.
Published: (2024)
AutoPPA: Automated Circuit PPA Optimization via Contrastive Code-based Rule Library Learning
by: Li, Chongxiao, et al.
Published: (2026)
Hardware Software Optimizations for Fast Model Recovery on Reconfigurable Architectures
by: Xu, Bin, et al.
Published: (2025)
On-Device Qwen2.5: Efficient LLM Inference with Model Compression and Hardware Acceleration
by: Xiang, Maoyang, et al.
Published: (2025)
KLLM: Fast LLM Inference with K-Means Quantization
by: Wu, Xueying, et al.
Published: (2025)
Huff-LLM: End-to-End Lossless Compression for Efficient LLM Inference
by: Yubeaton, Patrick, et al.
Published: (2025)
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design
by: Xia, Haojun, et al.
Published: (2024)
CATransformers: Carbon Aware Transformers Through Joint Model-Hardware Optimization
by: Wang, Irene, et al.
Published: (2025)
When Forgetting Builds Reliability: LLM Unlearning for Reliable Hardware Code Generation
by: Liang, Yiwen, et al.
Published: (2025)
MPM-LLM4DSE: Reaching the Pareto Frontier in HLS with Multimodal Learning and LLM-Driven Exploration
by: Xu, Lei, et al.
Published: (2026)
EVA: Accelerating LLM Decoding via an Efficient Vector Quantization Architecture
by: Duan, Bowen, et al.
Published: (2026)
P3-LLM: An Integrated NPU-PIM Accelerator for Edge LLM Inference Using Hybrid Numerical Formats
by: Chen, Yuzong, et al.
Published: (2025)
Designing Efficient LLM Accelerators for Edge Devices
by: Haris, Jude, et al.
Published: (2024)