:: Library Catalog

Bibliographic Details
Main Authors:	Liu, Zeyu; Li, Yan; Zhang, Yunquan; Zhang, Boyang; Jiang, Guoyong; Zhang, Xin; Xiao, Limin; Zhang, Weifeng; Cheng, Daning
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2506.12037

Similar Items

MoE-DisCo:Low Economy Cost Training Mixture-of-Experts Models
by: Ye, Xin, et al.
Published: (2026)

MoQE: Improve Quantization Model performance via Mixture of Quantization Experts
by: Zhang, Jinhao, et al.
Published: (2025)

Lossless Model Compression via Joint Low-Rank Factorization Optimization
by: Zhang, Boyang, et al.
Published: (2024)

A Qualitative Test-Risk Mechanism for Scaling Behavior in Normalized Residual Networks
by: Cheng, Daning, et al.
Published: (2026)

Compression for Better: A General and Stable Lossless Compression Framework
by: Zhang, Boyang, et al.
Published: (2024)

FP=xINT:Representing Neural Networks via Low-Bit Series Basis Functions
by: Zhang, Boyang, et al.
Published: (2024)

HeRo-Q: A General Framework for Stable Low Bit Quantization via Hessian Conditioning
by: Zhang, Jinhao Zhang Yunquan, et al.
Published: (2026)

A General Error-Theoretical Analysis Framework for Constructing Compression Strategies
by: Zhang, Boyang, et al.
Published: (2025)

CALM: A CKA-Guided Adaptive Layer-Wise Modularization Framework for LLM Quantization
by: Zhang, Jinhao, et al.
Published: (2025)

Rethinking Parameter Sharing as Graph Coloring for Structured Compression
by: Zhang, Boyang, et al.
Published: (2025)

Can the capability of Large Language Models be described by human ability? A Meta Study
by: Zan, Mingrui, et al.
Published: (2025)

FedBCD:Communication-Efficient Accelerated Block Coordinate Gradient Descent for Federated Learning
by: Liu, Junkang, et al.
Published: (2026)

Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model
by: Seawead, Team, et al.
Published: (2025)

Breaking Model Lock-in: Cost-Efficient Zero-Shot LLM Routing via a Universal Latent Space
by: Yan, Cheng, et al.
Published: (2026)

A Block-Coordinate Descent EMO Algorithm: Theoretical and Empirical Analysis
by: Doerr, Benjamin, et al.
Published: (2024)

Silo-Bench: A Scalable Environment for Evaluating Distributed Coordination in Multi-Agent LLM Systems
by: Zhang, Yuzhe, et al.
Published: (2026)

DepCap: Adaptive Block-Wise Parallel Decoding for Efficient Diffusion LM Inference
by: Xia, Xiang, et al.
Published: (2026)

Beyond Text-Visual Attention: Exploiting Visual Cues for Effective Token Pruning in VLMs
by: Zhang, Qizhe, et al.
Published: (2024)

Selection, Reflection and Self-Refinement: Revisit Reasoning Tasks via a Causal Lens
by: Deng, Yunlong, et al.
Published: (2025)

Asynch-SGBDT: Asynchronous Parallel Stochastic Gradient Boosting Decision Tree based on Parameters Server
by: Daning, Cheng, et al.
Published: (2018)

SDQ-LLM: Sigma-Delta Quantization for 1-bit LLMs of any size
by: Xia, Junhao, et al.
Published: (2025)

Memory-Efficient LLM Training with Online Subspace Descent
by: Liang, Kaizhao, et al.
Published: (2024)

Cost-Awareness in Tree-Search LLM Planning: A Systematic Study
by: Zhang, Zihao, et al.
Published: (2025)

CDQuant: Greedy Coordinate Descent for Accurate LLM Quantization
by: Nair, Pranav Ajit, et al.
Published: (2024)

Efficient Training of Large-Scale AI Models Through Federated Mixture-of-Experts: A System-Level Approach
by: Chen, Xiaobing, et al.
Published: (2025)

Focusing on Language: Revealing and Exploiting Language Attention Heads in Multilingual Large Language Models
by: Liu, Xin, et al.
Published: (2025)

CaRoBio: 3D Cable Routing with a Bio-inspired Gripper Fingernail
by: Zuo, Jiahui, et al.
Published: (2025)

Efficient Agents: Building Effective Agents While Reducing Cost
by: Wang, Ningning, et al.
Published: (2025)

Task-Aware KV Compression For Cost-Effective Long Video Understanding
by: Qin, Minghao, et al.
Published: (2025)

Towards Adaptive, Scalable, and Robust Coordination of LLM Agents: A Dynamic Ad-Hoc Networking Perspective
by: Li, Rui, et al.
Published: (2026)

Cequel: Cost-Effective Querying of Large Language Models for Text Clustering
by: Wang, Hongtao, et al.
Published: (2025)

A Universal Banach--Bregman Framework for Stochastic Iterations: Unifying Stochastic Mirror Descent, Learning and LLM Training
by: Zhang, Johnny R., et al.
Published: (2025)

CompressionAttack: Exploiting Prompt Compression as a New Attack Surface in LLM-Powered Agents
by: Liu, Zesen, et al.
Published: (2025)

M3-Net: A Cost-Effective Graph-Free MLP-Based Model for Traffic Prediction
by: Jin, Guangyin, et al.
Published: (2025)

Trapezoidal Gradient Descent for Effective Reinforcement Learning in Spiking Networks
by: Pan, Yuhao, et al.
Published: (2024)

Quant-dLLM: Post-Training Extreme Low-Bit Quantization for Diffusion Large Language Models
by: Zhang, Tianao, et al.
Published: (2025)

AviationLLM: An LLM-based Knowledge System for Aviation Training
by: Wan, Jia'ang, et al.
Published: (2025)

Fault-Tolerant Sandboxing for AI Coding Agents: A Transactional Approach to Safe Autonomous Execution
by: Yan, Boyang
Published: (2025)

Open-Medical-R1: How to Choose Data for RLVR Training at Medicine Domain
by: Qiu, Zhongxi, et al.
Published: (2025)

Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
by: Cheng, Wenhua, et al.
Published: (2023)