Library Catalog

Bibliographic Details
Main Authors:	Wang, Shengnan; Bai, Youhui; Zhang, Lin; Zhou, Pingyi; Zhao, Shixiong; Zhang, Gong; Wang, Sen; Chen, Renhai; Xu, Hua; Sun, Hongwei
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2405.17755

Similar Items

Efficient Long-Context LLM Inference via KV Cache Clustering
by: Hu, Jie, et al.
Published: (2025)

BigMac: A Communication-Efficient Mixture-of-Experts Model Structure for Fast Training and Inference
by: Jin, Zewen, et al.
Published: (2025)

HATA: Trainable and Hardware-Efficient Hash-Aware Top-k Attention for Scalable Large Model Inference
by: Gong, Ping, et al.
Published: (2025)

LiteCache: A Query Similarity-Driven, GPU-Centric KVCache Subsystem for Efficient LLM Inference
by: Yi, Jiawei, et al.
Published: (2025)

HyLRA: Hybrid Layer Reuse Attention for Efficient Long-Context Inference
by: Ai, Xuan, et al.
Published: (2026)

AdaCluster: Adaptive Query-Key Clustering for Sparse Attention in Video Generation
by: Tan, Haoyue, et al.
Published: (2026)

Accelerating Long-Tail Generation in Synchronous RLHF Training via Adaptive Tensor Parallelism
by: Zhao, Long, et al.
Published: (2026)

Lagom: Unleashing the Power of Communication and Computation Overlapping for Distributed LLM Training
by: Xu, Guanbin, et al.
Published: (2026)

E^2-LLM: Efficient and Extreme Length Extension of Large Language Models
by: Liu, Jiaheng, et al.
Published: (2024)

Train Short, Inference Long: Training-free Horizon Extension for Autoregressive Video Generation
by: Li, Jia, et al.
Published: (2026)

Making MoE-based LLM Inference Resilient with Tarragon
by: Zhang, Songyu, et al.
Published: (2026)

iSeg: An Iterative Refinement-based Framework for Training-free Segmentation
by: Sun, Lin, et al.
Published: (2024)

WindowKV: Task-Adaptive Group-Wise KV Cache Window Selection for Efficient LLM Inference
by: Zuo, Youhui, et al.
Published: (2025)

MDP3: A Training-free Approach for List-wise Frame Selection in Video-LLMs
by: Sun, Hui, et al.
Published: (2025)

A Mathematical Theory of Top-$k$ Sparse Attention via Total Variation Distance
by: Tzachristas, Georgios, et al.
Published: (2025)

Is Less More? Exploring Token Condensation as Training-free Test-time Adaptation
by: Wang, Zixin, et al.
Published: (2024)

Uncertainty-Aware Bayes' Rule and Its Applications
by: Wang, Shixiong
Published: (2023)

SARE: Sample-wise Adaptive Reasoning for Training-free Fine-grained Visual Recognition
by: Yang, Jingxiao, et al.
Published: (2026)

Scene-wise Adaptive Network for Dynamic Cold-start Scenes Optimization in CTR Prediction
by: Li, Wenhao, et al.
Published: (2024)

Bioinspired Directional Hydrogel‐Based High‐Performance Flexible Sensor for Multiple Jumping Pattern Detection in Athletic Training
by: Hanqi Wang, et al.
Published: (2025)

Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization
by: Hua, Ermo, et al.
Published: (2024)

Correlation-Aware Select and Merge Attention for Efficient Fine-Tuning and Context Length Extension
by: Wang, Ning, et al.
Published: (2024)

PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training
by: Zhu, Dawei, et al.
Published: (2023)

Segmentation-guided Layer-wise Image Vectorization with Gradient Fills
by: Zhou, Hengyu, et al.
Published: (2024)

Using Cu‐Based Metal–Organic Framework as a Comprehensive and Powerful Antioxidant Nanozyme for Efficient Osteoarthritis Treatment
by: Bo Yu, et al.
Published: (2024)

Training-free Geometric Image Editing on Diffusion Models
by: Zhu, Hanshen, et al.
Published: (2025)

Towards Feedback-to-Plan Decisions for Self-Evolving LLM Agents in CUDA Kernel Generation
by: Chong, Yee Hin, et al.
Published: (2026)

SSMRadNet: A Sample-wise State-Space Framework for Efficient and Ultra-Light Radar Segmentation and Object Detection
by: Sen, Anuab, et al.
Published: (2025)

Distributional Robustness Bounds Generalization Errors
by: Wang, Shixiong, et al.
Published: (2022)

NEFT: A Unified Transformer Framework for Efficient Near-Field CSI Feedback in XL-MIMO Systems
by: Mao, Tianqi, et al.
Published: (2025)

HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading
by: Luo, Cheng, et al.
Published: (2025)

Tensor-Structured Bayesian Channel Prediction for Upper Mid-Band XL-MIMO Systems
by: Hou, Hongwei, et al.
Published: (2025)

Predicting Miscibility in Binary Compounds: A Machine Learning and Genetic Algorithm Study
by: Feng, Chiwen, et al.
Published: (2024)

Beam-Delay Domain Channel Estimation for mmWave XL-MIMO Systems
by: Hou, Hongwei, et al.
Published: (2023)

SlimPack: Fine-Grained Asymmetric Packing for Balanced and Efficient Variable-Length LLM Training
by: Liu, Yuliang, et al.
Published: (2025)

VecAttention: Vector-wise Sparse Attention for Accelerating Long Context Inference
by: Liu, Anmin, et al.
Published: (2026)

Near-Field Multiuser Beam Training for XL-MIMO: An End-to-End Interference-Aware Approach with Pilot Limitations
by: Li, Xinyang, et al.
Published: (2026)

Using a Functional Wool Keratin Photoresist to Build Iridescent and Fluorescent 3D Micro‐Pattern for Dual‐Mode Optical Anti‐Counterfeiting
by: Shuang Xia, et al.
Published: (2025)

Multimodal Contrastive Learning for 3D Object Classification and Part‐Segmentation by Leveraging V‐LLM and CNNs
by: Jiaxin Jiang, et al.
Published: (2025)

Dissecting Conditional Branch Predictors of Apple Firestorm and Qualcomm Oryon for Software Optimization and Architectural Analysis
by: Chen, Jiajie, et al.
Published: (2024)