| Main Authors: | Lu, Guanxi; Chen, Hao Mark; Que, Zhiqiang; Luk, Wayne; Fan, Hongxiang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2511.22483 |
Similar Items
FastTTS: Accelerating Test-Time Scaling for Edge LLM Reasoning
by: Chen, Hao Mark, et al.
Published: (2025)
MetaML-Pro: Cross-Stage Design Flow Automation for Efficient Deep Learning Acceleration
by: Que, Zhiqiang, et al.
Published: (2025)
Hardware-Aware Neural Dropout Search for Reliable Uncertainty Prediction on FPGA
by: Zhang, Zehuan, et al.
Published: (2024)
AdaBlock-dLLM: Semantic-Aware Diffusion LLM Inference via Adaptive Block Size
by: Lu, Guanxi, et al.
Published: (2025)
Rethinking Optimal Verification Granularity for Compute-Efficient Test-Time Scaling
by: Chen, Hao Mark, et al.
Published: (2025)
LL-GNN: Low Latency Graph Neural Networks on FPGAs for High Energy Physics
by: Que, Zhiqiang, et al.
Published: (2022)
Enhancing Dropout-based Bayesian Neural Networks with Multi-Exit on FPGA
by: Chen, Hao Mark, et al.
Published: (2024)
FW-Merging: Scaling Model Merging with Frank-Wolfe Optimization
by: Chen, Hao Mark, et al.
Published: (2025)
da4ml: Distributed Arithmetic for Real-time Neural Networks on FPGAs
by: Sun, Chang, et al.
Published: (2025)
Progressive Mixed-Precision Decoding for Efficient LLM Inference
by: Chen, Hao Mark, et al.
Published: (2024)
Hardware-Aware Parallel Prompt Decoding for Memory-Efficient Acceleration of LLM Inference
by: Chen, Hao Mark, et al.
Published: (2024)
HGQ: High Granularity Quantization for Real-time Neural Networks on FPGAs
by: Sun, Chang, et al.
Published: (2024)
Dynamic Expert Sharing: Decoupling Memory from Parallelism in Mixture-of-Experts Diffusion LLMs
by: Chen, Hao Mark, et al.
Published: (2026)
HGQ-LUT: Fast LUT-Aware Training and Efficient Architectures for DNN Inference
by: Sun, Chang, et al.
Published: (2026)
Enhancing LLM-based Quantum Code Generation with Multi-Agent Optimization and Quantum Error Correction
by: Campbell, Charlie, et al.
Published: (2025)
Algorithm and Hardware Co-Design for Efficient Complex-Valued Uncertainty Estimation
by: Zhang, Zehuan, et al.
Published: (2026)
Sub-microsecond Transformers for Jet Tagging on FPGAs
by: Laatu, Lauri, et al.
Published: (2025)
JetFormer: A Scalable and Efficient Transformer for Jet Tagging from Offline Analysis to FPGA Triggers
by: Zheng, Ruoqing, et al.
Published: (2026)
JEDI-linear: Fast and Efficient Graph Neural Networks for Jet Tagging on FPGAs
by: Que, Zhiqiang, et al.
Published: (2025)
Low-Precision Training of Large Language Models: Methods, Challenges, and Opportunities
by: Hao, Zhiwei, et al.
Published: (2025)
Exploring Code Language Models for Automated HLS-based Hardware Generation: Benchmark, Infrastructure and Analysis
by: Gai, Jiahao, et al.
Published: (2025)
Scalable Time-Series Causal Discovery with Approximate Causal Ordering
by: Jiao, Ziyang, et al.
Published: (2024)
VCDF: A Validated Consensus-Driven Framework for Time Series Causal Discovery
by: Yu, Gene, et al.
Published: (2026)
Context Memorization for Efficient Long Context Generation
by: Okoshi, Yasuyuki, et al.
Published: (2026)
Accelerating 3D Gaussian Splatting with Neural Sorting and Axis-Oriented Rasterization
by: Wang, Zhican, et al.
Published: (2025)
Robust Time Series Causal Discovery for Agent-Based Model Validation
by: Yu, Gene, et al.
Published: (2024)
GNN-Transformer Cooperative Architecture for Trustworthy Graph Contrastive Learning
by: Liang, Jianqing, et al.
Published: (2024)
Exploring FPGA designs for MX and beyond
by: Samson, Ebby, et al.
Published: (2024)
ASPO: Constraint-Aware Bayesian Optimization for FPGA-based Soft Processors
by: Wu, Haoran, et al.
Published: (2025)
Graph Foundation Models: Concepts, Opportunities and Challenges
by: Liu, Jiawei, et al.
Published: (2023)
Advancing AI-assisted Hardware Design with Hierarchical Decentralized Training and Personalized Inference-Time Optimization
by: Chen, Hao Mark, et al.
Published: (2025)
Mixed-Precision Conjugate Gradient Solvers with RL-Driven Precision Tuning
by: Chen, Xinye
Published: (2025)
MixKVQ: Query-Aware Mixed-Precision KV Cache Quantization for Long-Context Reasoning
by: Zhang, Tao, et al.
Published: (2025)
Blockchain-enabled Trustworthy Federated Unlearning
by: Lin, Yijing, et al.
Published: (2024)
SAE: Single Architecture Ensemble Neural Networks
by: Ferianc, Martin, et al.
Published: (2024)
MergeMix: Optimizing Mid-Training Data Mixtures via Learnable Model Merging
by: Wang, Jiapeng, et al.
Published: (2026)
Identifying Trustworthiness Challenges in Deep Learning Models for Continental-Scale Water Quality Prediction
by: Xia, Xiaobo, et al.
Published: (2025)
Open Challenges and Opportunities in Federated Foundation Models Towards Biomedical Healthcare
by: Li, Xingyu, et al.
Published: (2024)
Trustworthy Artificial Intelligence in the Context of Metrology
by: Adel, Tameem, et al.
Published: (2024)
APreQEL: Adaptive Mixed Precision Quantization For Edge LLMs
by: Bouzouad, Meriem, et al.
Published: (2026)