| Main Authors: | Liu, Zeyu; Li, Yan; Zhang, Yunquan; Zhang, Boyang; Jiang, Guoyong; Zhang, Xin; Xiao, Limin; Zhang, Weifeng; Cheng, Daning |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.12037 |
Similar Items
MoE-DisCo: Low Economy Cost Training Mixture-of-Experts Models
by: Ye, Xin, et al.
Published: (2026)
MoQE: Improve Quantization Model performance via Mixture of Quantization Experts
by: Zhang, Jinhao, et al.
Published: (2025)
Lossless Model Compression via Joint Low-Rank Factorization Optimization
by: Zhang, Boyang, et al.
Published: (2024)
A Qualitative Test-Risk Mechanism for Scaling Behavior in Normalized Residual Networks
by: Cheng, Daning, et al.
Published: (2026)
Compression for Better: A General and Stable Lossless Compression Framework
by: Zhang, Boyang, et al.
Published: (2024)
FP=xINT: Representing Neural Networks via Low-Bit Series Basis Functions
by: Zhang, Boyang, et al.
Published: (2024)
HeRo-Q: A General Framework for Stable Low Bit Quantization via Hessian Conditioning
by: Zhang, Jinhao Zhang Yunquan, et al.
Published: (2026)
A General Error-Theoretical Analysis Framework for Constructing Compression Strategies
by: Zhang, Boyang, et al.
Published: (2025)
CALM: A CKA-Guided Adaptive Layer-Wise Modularization Framework for LLM Quantization
by: Zhang, Jinhao, et al.
Published: (2025)
Rethinking Parameter Sharing as Graph Coloring for Structured Compression
by: Zhang, Boyang, et al.
Published: (2025)
Can the capability of Large Language Models be described by human ability? A Meta Study
by: Zan, Mingrui, et al.
Published: (2025)
FedBCD: Communication-Efficient Accelerated Block Coordinate Gradient Descent for Federated Learning
by: Liu, Junkang, et al.
Published: (2026)
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model
by: Seawead, Team, et al.
Published: (2025)
Breaking Model Lock-in: Cost-Efficient Zero-Shot LLM Routing via a Universal Latent Space
by: Yan, Cheng, et al.
Published: (2026)
A Block-Coordinate Descent EMO Algorithm: Theoretical and Empirical Analysis
by: Doerr, Benjamin, et al.
Published: (2024)
Silo-Bench: A Scalable Environment for Evaluating Distributed Coordination in Multi-Agent LLM Systems
by: Zhang, Yuzhe, et al.
Published: (2026)
DepCap: Adaptive Block-Wise Parallel Decoding for Efficient Diffusion LM Inference
by: Xia, Xiang, et al.
Published: (2026)
Beyond Text-Visual Attention: Exploiting Visual Cues for Effective Token Pruning in VLMs
by: Zhang, Qizhe, et al.
Published: (2024)
Selection, Reflection and Self-Refinement: Revisit Reasoning Tasks via a Causal Lens
by: Deng, Yunlong, et al.
Published: (2025)
Asynch-SGBDT: Asynchronous Parallel Stochastic Gradient Boosting Decision Tree based on Parameters Server
by: Daning, Cheng, et al.
Published: (2018)
SDQ-LLM: Sigma-Delta Quantization for 1-bit LLMs of any size
by: Xia, Junhao, et al.
Published: (2025)
Memory-Efficient LLM Training with Online Subspace Descent
by: Liang, Kaizhao, et al.
Published: (2024)
Cost-Awareness in Tree-Search LLM Planning: A Systematic Study
by: Zhang, Zihao, et al.
Published: (2025)
CDQuant: Greedy Coordinate Descent for Accurate LLM Quantization
by: Nair, Pranav Ajit, et al.
Published: (2024)
Efficient Training of Large-Scale AI Models Through Federated Mixture-of-Experts: A System-Level Approach
by: Chen, Xiaobing, et al.
Published: (2025)
Focusing on Language: Revealing and Exploiting Language Attention Heads in Multilingual Large Language Models
by: Liu, Xin, et al.
Published: (2025)
CaRoBio: 3D Cable Routing with a Bio-inspired Gripper Fingernail
by: Zuo, Jiahui, et al.
Published: (2025)
Efficient Agents: Building Effective Agents While Reducing Cost
by: Wang, Ningning, et al.
Published: (2025)
Task-Aware KV Compression For Cost-Effective Long Video Understanding
by: Qin, Minghao, et al.
Published: (2025)
Towards Adaptive, Scalable, and Robust Coordination of LLM Agents: A Dynamic Ad-Hoc Networking Perspective
by: Li, Rui, et al.
Published: (2026)
Cequel: Cost-Effective Querying of Large Language Models for Text Clustering
by: Wang, Hongtao, et al.
Published: (2025)
A Universal Banach-Bregman Framework for Stochastic Iterations: Unifying Stochastic Mirror Descent, Learning and LLM Training
by: Zhang, Johnny R., et al.
Published: (2025)
CompressionAttack: Exploiting Prompt Compression as a New Attack Surface in LLM-Powered Agents
by: Liu, Zesen, et al.
Published: (2025)
M3-Net: A Cost-Effective Graph-Free MLP-Based Model for Traffic Prediction
by: Jin, Guangyin, et al.
Published: (2025)
Trapezoidal Gradient Descent for Effective Reinforcement Learning in Spiking Networks
by: Pan, Yuhao, et al.
Published: (2024)
Quant-dLLM: Post-Training Extreme Low-Bit Quantization for Diffusion Large Language Models
by: Zhang, Tianao, et al.
Published: (2025)
AviationLLM: An LLM-based Knowledge System for Aviation Training
by: Wan, Jia'ang, et al.
Published: (2025)
Fault-Tolerant Sandboxing for AI Coding Agents: A Transactional Approach to Safe Autonomous Execution
by: Yan, Boyang
Published: (2025)
Open-Medical-R1: How to Choose Data for RLVR Training at Medicine Domain
by: Qiu, Zhongxi, et al.
Published: (2025)
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
by: Cheng, Wenhua, et al.
Published: (2023)