| Main Authors: | Ji, Yuhao, Fang, Chao, Ma, Shaobo, Shao, Haikuo, Wang, Zhongfeng |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2407.12070 |
Similar Items
Efficient Arbitrary Precision Acceleration for Large Language Models on GPU Tensor Cores
by: Ma, Shaobo, et al.
Published: (2024)
APT-LLM: Exploiting Arbitrary-Precision Tensor Core Computing for LLM Acceleration
by: Ma, Shaobo, et al.
Published: (2025)
BETA: Binarized Energy-Efficient Transformer Accelerator at the Edge
by: Ji, Yuhao, et al.
Published: (2024)
AccLLM: Accelerating Long-Context LLM Inference Via Algorithm-Hardware Co-Design
by: Liang, Yanbiao, et al.
Published: (2025)
FastMamba: A High-Speed and Efficient Mamba Accelerator on FPGA with Accurate Quantization
by: Wang, Aotao, et al.
Published: (2025)
An FPGA-Based Reconfigurable Accelerator for Convolution-Transformer Hybrid EfficientViT
by: Shao, Haikuo, et al.
Published: (2024)
Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer
by: Shi, Huihong, et al.
Published: (2024)
FinWorld: An All-in-One Open-Source Platform for End-to-End Financial AI Research and Deployment
by: Zhang, Wentao, et al.
Published: (2025)
NASH: Neural Architecture and Accelerator Search for Multiplication-Reduced Hybrid Models
by: Xu, Yang, et al.
Published: (2024)
Attack End-to-End Autonomous Driving through Module-Wise Noise
by: Wang, Lu, et al.
Published: (2024)
AnyBipe: An End-to-End Framework for Training and Deploying Bipedal Robots Guided by Large Language Models
by: Yao, Yifei, et al.
Published: (2024)
PlatformX: An End-to-End Transferable Platform for Energy-Efficient Neural Architecture Search
by: Tu, Xiaolong, et al.
Published: (2025)
An End-to-End Learning Approach for Solving Capacitated Location-Routing Problems
by: Miao, Changhao, et al.
Published: (2025)
GinAR: An End-To-End Multivariate Time Series Forecasting Model Suitable for Variable Missing
by: Yu, Chengqing, et al.
Published: (2024)
ExACT: An End-to-End Autonomous Excavator System Using Action Chunking With Transformers
by: Chen, Liangliang, et al.
Published: (2024)
L-MoE: End-to-End Training of a Lightweight Mixture of Low-Rank Adaptation Experts
by: Ji, Shihao, et al.
Published: (2025)
Graph Neural Networks Automated Design and Deployment on Device-Edge Co-Inference Systems
by: Zhou, Ao, et al.
Published: (2024)
From Algorithm to Hardware: A Survey on Efficient and Safe Deployment of Deep Neural Networks
by: Geng, Xue, et al.
Published: (2024)
Towards Resource-Efficient LLMs: End-to-End Energy Accounting of Distillation Pipelines
by: Lambert, Katherine, et al.
Published: (2026)
Anda: Unlocking Efficient LLM Inference with a Variable-Length Grouped Activation Data Format
by: Fang, Chao, et al.
Published: (2024)
An End-to-End Deep Reinforcement Learning Approach for Solving the Traveling Salesman Problem with Drones
by: Zeng, Taihelong, et al.
Published: (2025)
EvaDrive: Evolutionary Adversarial Policy Optimization for End-to-End Autonomous Driving
by: Jiao, Siwen, et al.
Published: (2025)
BPQP: A Differentiable Convex Optimization Framework for Efficient End-to-End Learning
by: Pan, Jianming, et al.
Published: (2024)
MC$^2$A: Enabling Algorithm-Hardware Co-Design for Efficient Markov Chain Monte Carlo Acceleration
by: Zhao, Shirui, et al.
Published: (2025)
FT-Transformer: Resilient and Reliable Transformer with End-to-End Fault Tolerant Attention
by: Dai, Huangliang, et al.
Published: (2025)
Hamming Attention Distillation: Binarizing Keys and Queries for Efficient Long-Context Transformers
by: Horton, Mark, et al.
Published: (2025)
Bridging the Divide: End-to-End Sequence-Graph Learning
by: Chen, Yuen, et al.
Published: (2025)
Unleashing the Potential of Diffusion Models for End-to-End Autonomous Driving
by: Zheng, Yinan, et al.
Published: (2026)
End-to-End Probabilistic Framework for Learning with Hard Constraints
by: Utkarsh, Utkarsh, et al.
Published: (2025)
ExChanGeAI: An End-to-End Platform and Efficient Foundation Model for Electrocardiogram Analysis and Fine-tuning
by: Bickmann, Lucas, et al.
Published: (2025)
ElasticAI: Creating and Deploying Energy-Efficient Deep Learning Accelerator for Pervasive Computing
by: Qian, Chao, et al.
Published: (2024)
Stratos: An End-to-End Distillation Pipeline for Customized LLMs under Distributed Cloud Environments
by: Dai, Ziming, et al.
Published: (2025)
DISCO: An End-to-End Bandit Framework for Personalised Discount Allocation
by: Zhang, Jason Shuo, et al.
Published: (2024)
End-To-End Learning of Gaussian Mixture Priors for Diffusion Sampler
by: Blessing, Denis, et al.
Published: (2025)
Hardware/Software Co-Design of RISC-V Extensions for Accelerating Sparse DNNs on FPGAs
by: Sabih, Muhammad, et al.
Published: (2025)
Towards Efficient Deployment of Hybrid SNNs on Neuromorphic and Edge AI Hardware
by: Seekings, James, et al.
Published: (2024)
Relax: Composable Abstractions for End-to-End Dynamic Machine Learning
by: Lai, Ruihang, et al.
Published: (2023)
Self-Evolving Recommendation System: End-To-End Autonomous Model Optimization With LLM Agents
by: Wang, Haochen, et al.
Published: (2026)
Identifying Functionally Important Features with End-to-End Sparse Dictionary Learning
by: Braun, Dan, et al.
Published: (2024)
Learning to be Smooth: An End-to-End Differentiable Particle Smoother
by: Younis, Ali, et al.
Published: (2025)