| Main Authors: | Huang, Junlin; Fang, Wenyi; Tang, Zhenheng; Wang, Yuxin; Kang, Xueze; Zheng, Yang; Li, Bo; Chu, Xiaowen |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.00969 |
Similar Items
Bandwidth-Aware and Overlap-Weighted Compression for Communication-Efficient Federated Learning
by: Tang, Zichen, et al.
Published: (2024)
Can Compressed LLMs Truly Act? An Empirical Evaluation of Agentic Capabilities in LLM Compression
by: Dong, Peijie, et al.
Published: (2025)
FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression
by: Tang, Zhenheng, et al.
Published: (2024)
Rethinking Deep Research from the Perspective of Web Content Distribution Matching
by: Yu, Zixuan, et al.
Published: (2026)
FSMoE: A Flexible and Scalable Training System for Sparse Mixture-of-Experts Models
by: Pan, Xinglin, et al.
Published: (2025)
Reasoning Language Model Inference Serving Unveiled: An Empirical Study
by: Li, Qi, et al.
Published: (2025)
The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve?
by: Tang, Zhenheng, et al.
Published: (2025)
FedImpro: Measuring and Improving Client Update in Federated Learning
by: Tang, Zhenheng, et al.
Published: (2024)
PatternKV: Flattening KV Representation Expands Quantization Headroom
by: Zhang, Ji, et al.
Published: (2025)
Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for Large Language Models
by: Dong, Peijie, et al.
Published: (2024)
RouteMark: A Fingerprint for Intellectual Property Attribution in Routing-based Model Merging
by: He, Xin, et al.
Published: (2025)
Mediator: Memory-efficient LLM Merging with Less Parameter Conflicts and Uncertainty Based Routing
by: Lai, Kunfeng, et al.
Published: (2025)
FuseFL: One-Shot Federated Learning through the Lens of Causality with Progressive Model Fusion
by: Tang, Zhenheng, et al.
Published: (2024)
Fault-Tolerant Hybrid-Parallel Training at Scale with Reliable and Efficient In-memory Checkpointing
by: Wang, Yuxin, et al.
Published: (2023)
FlattenQuant: Breaking Through the Inference Compute-bound for Large Language Models with Per-tensor Quantization
by: Zhang, Yi, et al.
Published: (2024)
Dissecting Outlier Dynamics in LLM NVFP4 Pretraining
by: Dong, Peijie, et al.
Published: (2026)
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
by: Dong, Peijie, et al.
Published: (2024)
Model Quantization and Hardware Acceleration for Vision Transformers: A Comprehensive Survey
by: Du, Dayou, et al.
Published: (2024)
Spectral Flattening Is All Muon Needs: How Orthogonalization Controls Learning Rate and Convergence
by: Nguyen, Tien-Phat, et al.
Published: (2026)
BurstGPT: A Real-world Workload Dataset to Optimize LLM Serving Systems
by: Wang, Yuxin, et al.
Published: (2024)
DreamDDP: Accelerating Data Parallel Distributed LLM Training with Layer-wise Scheduled Partial Synchronization
by: Tang, Zhenheng, et al.
Published: (2025)
A Mechanism Study of Delayed Loss Spikes in Batch-Normalized Linear Models
by: Gao, Peifeng, et al.
Published: (2026)
Flattening Hierarchies with Policy Bootstrapping
by: Zhou, John L., et al.
Published: (2025)
Flatten Graphs as Sequences: Transformers are Scalable Graph Generators
by: Chen, Dexiong, et al.
Published: (2025)
Data-Driven Graph Filters via Adaptive Spectral Shaping
by: Sandfelder, Dylan, et al.
Published: (2026)
Frequency-Guided Action Diffusion via Sub-Frequency Manifold Traversal
by: Wang, Junlin
Published: (2026)
SEF-MK: Speaker-Embedding-Free Voice Anonymization through Multi-k-means Quantization
by: Tang, Beilong, et al.
Published: (2025)
Spectral Algorithms in Misspecified Regression: Convergence under Covariate Shift
by: Liu, Ren-Rui, et al.
Published: (2025)
DimGrow: Memory-Efficient Field-level Embedding Dimension Search
by: Huang, Yihong, et al.
Published: (2025)
Stochastic Nonlinear Control via Finite-dimensional Spectral Dynamic Embedding
by: Ren, Zhaolin, et al.
Published: (2023)
Scalable and Adaptive Spectral Embedding for Attributed Graph Clustering
by: Liu, Yunhui, et al.
Published: (2024)
Quantizing Text-attributed Graphs for Semantic-Structural Integration
by: Bo, Jianyuan, et al.
Published: (2025)
Theory-optimal Quantization Based on Flatness
by: Huang, Xiusheng, et al.
Published: (2026)
Tree-Regularized Tabular Embeddings
by: Li, Xuan, et al.
Published: (2024)
Activity-aware Human Mobility Prediction with Hierarchical Graph Attention Recurrent Network
by: Tang, Yihong, et al.
Published: (2022)
Overcomplete Tensor Decomposition via Koszul-Young Flattenings
by: Kothari, Pravesh K., et al.
Published: (2024)
Is Your LLM-as-a-Recommender Agent Trustable? LLMs' Recommendation is Easily Hacked by Biases (Preferences)
by: Tang, Zichen, et al.
Published: (2026)
TempoGPT: Enhancing Time Series Reasoning via Quantizing Embedding
by: Zhang, Haochuan, et al.
Published: (2025)
One-shot Federated Learning Methods: A Practical Guide
by: Liu, Xiang, et al.
Published: (2025)
An Empirical Study of Qwen3 Quantization
by: Zheng, Xingyu, et al.
Published: (2025)