| Main Author: | Luo, Charles |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2412.16443 |
Similar Items
Breaking the Capability Ceiling of LLM Post-Training by Reintroducing Markov States
by: Yuan, Yurun, et al.
Published: (2026)
Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks
by: Li, Ang, et al.
Published: (2025)
Ratio-Variance Regularized Policy Optimization for Efficient LLM Fine-tuning
by: Luo, Yu, et al.
Published: (2026)
When the Domain Expert Has No Time and the LLM Developer Has No Clinical Expertise: Real-World Lessons from LLM Co-Design in a Safety-Net Hospital
by: Kothari, Avni, et al.
Published: (2025)
Simple Yet Effective: An Information-Theoretic Approach to Multi-LLM Uncertainty Quantification
by: Kruse, Maya, et al.
Published: (2025)
Breaking the Blocks: Continuous Low-Rank Decomposed Scaling for Unified LLM Quantization and Adaptation
by: Tang, Pingzhi, et al.
Published: (2026)
The LLM Has Left The Chat: Evidence of Bail Preferences in Large Language Models
by: Ensign, Danielle, et al.
Published: (2025)
Putting the Value Back in RL: Better Test-Time Scaling by Unifying LLM Reasoners With Verifiers
by: Sareen, Kusha, et al.
Published: (2025)
Graph-Regularized Sparse Autoencoders for LLM Safety Steering
by: Yeon, Jehyeok, et al.
Published: (2025)
Diagnosing Spectral Ceilings in Equivariant Neural Force Fields
by: Kim, Hyunmog
Published: (2026)
Yet Unnoticed in LSTM: Binary Tree Based Input Reordering, Weight Regularization, and Gate Nonlinearization
by: Moattari, Mojtaba
Published: (2025)
MobileLLM-Flash: Latency-Guided On-Device LLM Design for Industry Scale Deployment
by: Huang, Hanxian, et al.
Published: (2026)
A Unified Framework for LLM Watermarks
by: Gloaguen, Thibaud, et al.
Published: (2026)
Atom of Thoughts for Markov LLM Test-Time Scaling
by: Teng, Fengwei, et al.
Published: (2025)
Demystifying Manifold Constraints in LLM Pre-training
by: An, Kang, et al.
Published: (2026)
Dr. Post-Training: A Data Regularization Perspective on LLM Post-Training
by: Hu, Pingbang, et al.
Published: (2026)
Ensembling Tabular Foundation Models - A Diversity Ceiling And A Calibration Trap
by: Tanna, Aditya, et al.
Published: (2026)
UltraSketchLLM: Saliency-Driven Sketching for Ultra-Low Bit LLM Compression
by: Zou, Sunan, et al.
Published: (2025)
Reinforcement Learning with $ω$-Regular Objectives and Constraints
by: Wagner, Dominik, et al.
Published: (2025)
When Your LLM Reaches End-of-Life: A Framework for Confident Model Migration in Production Systems
by: Casey, Emma, et al.
Published: (2026)
Policy Split: Incentivizing Dual-Mode Exploration in LLM Reinforcement with Dual-Mode Entropy Regularization
by: Yao, Jiashu, et al.
Published: (2026)
Adaptive Layerwise Perturbation: Unifying Off-Policy Corrections for LLM RL
by: Ye, Chenlu, et al.
Published: (2026)
Unsolvability Ceiling in Multi-LLM Routing: An Empirical Study of Evaluation Artifacts
by: Garg, Saloni, et al.
Published: (2026)
Scaling-Aware Adapter for Structure-Grounded LLM Reasoning
by: Jing, Zihao, et al.
Published: (2026)
Astro: Activation-guided Structured Regularization for Outlier-Robust LLM Post-Training Quantization
by: Chen, Xi, et al.
Published: (2026)
Dynamic Intelligence Ceilings: Measuring Long-Horizon Limits of Planning and Creativity in Artificial Systems
by: Khanh, Truong Xuan, et al.
Published: (2026)
Online Scheduling for LLM Inference with KV Cache Constraints
by: Jaillet, Patrick, et al.
Published: (2025)
On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning
by: Zhang, Yifan, et al.
Published: (2025)
Breaking the Performance Ceiling in Reinforcement Learning requires Inference Strategies
by: Chalumeau, Felix, et al.
Published: (2025)
Inference-Time Computations for LLM Reasoning and Planning: A Benchmark and Insights
by: Parashar, Shubham, et al.
Published: (2025)
InsightBuild: LLM-Powered Causal Reasoning in Smart Building Systems
by: Neogi, Pinaki Prasad Guha, et al.
Published: (2025)
Towards Mitigating Excessive Forgetting in LLM Unlearning via Entanglement-Guidance with Proxy Constraint
by: Liu, Zhihao, et al.
Published: (2025)
LLM-Based Scientific Equation Discovery via Physics-Informed Token-Regularized Policy Optimization
by: Wang, Boxiao, et al.
Published: (2026)
SparseSwaps: Tractable LLM Pruning Mask Refinement at Scale
by: Zimmer, Max, et al.
Published: (2025)
Test-Time Scaling in Reasoning Models Is Not Effective for Knowledge-Intensive Tasks Yet
by: Zhao, James Xu, et al.
Published: (2025)
Controlled LLM Training on Spectral Sphere
by: Xie, Tian, et al.
Published: (2026)
Adaptive Regularization of Representation Rank as an Implicit Constraint of Bellman Equation
by: He, Qiang, et al.
Published: (2024)
Robust Yet Efficient Conformal Prediction Sets
by: Zargarbashi, Soroush H., et al.
Published: (2024)
The Depth Ceiling: On the Limits of Large Language Models in Discovering Latent Planning
by: Xu, Yi, et al.
Published: (2026)
Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification
by: Liang, Zhenwen, et al.
Published: (2024)