| Main Authors: | Lyu, Shuyan, Wu, Zhanzimo, Du, Junliang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.27651 |
Similar Items
Exploring Layer-wise Information Effectiveness for Post-Training Quantization in Small Language Models
by: Xiao, He, et al.
Published: (2025)
VISAT: Benchmarking Adversarial and Distribution Shift Robustness in Traffic Sign Recognition with Visual Attributes
by: Yu, Simon, et al.
Published: (2025)
LaRA: Layer-wise Representation Analysis for Detecting Data Contamination in RL Post-Training
by: Gwak, Minju, et al.
Published: (2026)
Efficient Knowledge Deletion from Trained Models through Layer-wise Partial Machine Unlearning
by: Gogineni, Vinay Chakravarthi, et al.
Published: (2024)
A Layer-wise Analysis of Supervised Fine-Tuning
by: Zhao, Qinghua, et al.
Published: (2026)
Nonlinearity, Feedback and Uniform Consistency in Causal Structural Learning
by: Wang, Shuyan
Published: (2023)
DropEdge not Foolproof: Effective Augmentation Method for Signed Graph Neural Networks
by: Zhang, Zeyu, et al.
Published: (2024)
Stochastic Layer-wise Learning: Scalable and Efficient Alternative to Backpropagation
by: Yin, Bojian, et al.
Published: (2025)
Resource-efficient Layer-wise Federated Self-supervised Learning
by: Tun, Ye Lin, et al.
Published: (2024)
An Efficient Training Algorithm for Models with Block-wise Sparsity
by: Zhu, Ding, et al.
Published: (2025)
Training Greedy Policy for Proposal Batch Selection in Expensive Multi-Objective Combinatorial Optimization
by: Lee, Deokjae, et al.
Published: (2024)
LAVa: Layer-wise KV Cache Eviction with Dynamic Budget Allocation
by: Shen, Yiqun, et al.
Published: (2025)
Augmented Lagrangian Multiplier Network for State-wise Safety in Reinforcement Learning
by: Zhang, Jiaming, et al.
Published: (2026)
DP-LLM: Runtime Model Adaptation with Dynamic Layer-wise Precision Assignment
by: Kwon, Sangwoo, et al.
Published: (2025)
LEVI: Generalizable Fine-tuning via Layer-wise Ensemble of Different Views
by: Roh, Yuji, et al.
Published: (2024)
Greedy Sampling Is Provably Efficient for RLHF
by: Wu, Di, et al.
Published: (2025)
How Instruction and Reasoning Data shape Post-Training: Data Quality through the Lens of Layer-wise Gradients
by: Li, Ming, et al.
Published: (2025)
Training Long-Context LLMs Efficiently via Chunk-wise Optimization
by: Li, Wenhao, et al.
Published: (2025)
Disentangling Recall and Reasoning in Transformer Models through Layer-wise Attention and Activation Analysis
by: Fartale, Harshwardhan, et al.
Published: (2025)
When Greedy Wins: Emergent Exploitation Bias in Meta-Bandit LLM Training
by: Chen, Sanxing, et al.
Published: (2025)
Less Greedy Equivalence Search
by: Ejaz, Adiba, et al.
Published: (2025)
COMO: Closed-Loop Optical Molecule Recognition with Minimum Risk Training
by: Lyu, Zhuoqi, et al.
Published: (2026)
GreedySnake: Accelerating SSD-Offloaded LLM Training with Efficient Scheduling and Optimizer Step Overlapping
by: Yin, Yishu, et al.
Published: (2025)
DiffusionBlocks: Block-wise Neural Network Training via Diffusion Interpretation
by: Shing, Makoto, et al.
Published: (2025)
Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation
by: Weber, Leander, et al.
Published: (2023)
MID-L: Matrix-Interpolated Dropout Layer with Layer-wise Neuron Selection
by: Shaeri, Pouya, et al.
Published: (2025)
SSPO: Self-traced Step-wise Preference Optimization for Process Supervision and Reasoning Compression
by: Xu, Yuyang, et al.
Published: (2025)
A Novel Multimodal RUL Framework for Remaining Useful Life Estimation with Layer-wise Explanations
by: Razzaq, Waleed, et al.
Published: (2025)
Information Theoretic Adversarial Training of Large Language Models
by: Zhang, Yiwei, et al.
Published: (2026)
Spectral Greedy Coresets for Graph Neural Networks
by: Ding, Mucong, et al.
Published: (2024)
Rethinking and Accelerating Graph Condensation: A Training-Free Approach with Class Partition
by: Gao, Xinyi, et al.
Published: (2024)
STAMImputer: Spatio-Temporal Attention MoE for Traffic Data Imputation
by: Wang, Yiming, et al.
Published: (2025)
Information-Theoretic Policy Pre-Training with Empowerment
by: Schneider, Moritz, et al.
Published: (2025)
LEAP: Layer-wise Exit-Aware Pretraining for Efficient Transformer Inference
by: Kapadia, Shashank, et al.
Published: (2026)
FlexRound: Learnable Rounding based on Element-wise Division for Post-Training Quantization
by: Lee, Jung Hyun, et al.
Published: (2023)
GUDA: Counterfactual Group-wise Training Data Attribution for Diffusion Models via Unlearning
by: Murata, Naoki, et al.
Published: (2026)
Reflective Policy Optimization
by: Gan, Yaozhong, et al.
Published: (2024)
Annealed Softmax Greedy in Many-Armed Bayesian Bandits
by: Overman, William, et al.
Published: (2026)
RAST: A Retrieval Augmented Spatio-Temporal Framework for Traffic Prediction
by: Ruan, Weilin, et al.
Published: (2025)
GRASP: group-Shapley feature selection for patients
by: Luo, Yuheng, et al.
Published: (2026)