:: Library Catalog

Bibliographic Details
Main Author:	Luo, Charles
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2412.16443

Similar Items

Breaking the Capability Ceiling of LLM Post-Training by Reintroducing Markov States
by: Yuan, Yurun, et al.
Published: (2026)

Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks
by: Li, Ang, et al.
Published: (2025)

Ratio-Variance Regularized Policy Optimization for Efficient LLM Fine-tuning
by: Luo, Yu, et al.
Published: (2026)

When the Domain Expert Has No Time and the LLM Developer Has No Clinical Expertise: Real-World Lessons from LLM Co-Design in a Safety-Net Hospital
by: Kothari, Avni, et al.
Published: (2025)

Simple Yet Effective: An Information-Theoretic Approach to Multi-LLM Uncertainty Quantification
by: Kruse, Maya, et al.
Published: (2025)

Breaking the Blocks: Continuous Low-Rank Decomposed Scaling for Unified LLM Quantization and Adaptation
by: Tang, Pingzhi, et al.
Published: (2026)

The LLM Has Left The Chat: Evidence of Bail Preferences in Large Language Models
by: Ensign, Danielle, et al.
Published: (2025)

Putting the Value Back in RL: Better Test-Time Scaling by Unifying LLM Reasoners With Verifiers
by: Sareen, Kusha, et al.
Published: (2025)

Graph-Regularized Sparse Autoencoders for LLM Safety Steering
by: Yeon, Jehyeok, et al.
Published: (2025)

Diagnosing Spectral Ceilings in Equivariant Neural Force Fields
by: Kim, Hyunmog
Published: (2026)

Yet Unnoticed in LSTM: Binary Tree Based Input Reordering, Weight Regularization, and Gate Nonlinearization
by: Moattari, Mojtaba
Published: (2025)

MobileLLM-Flash: Latency-Guided On-Device LLM Design for Industry Scale Deployment
by: Huang, Hanxian, et al.
Published: (2026)

A Unified Framework for LLM Watermarks
by: Gloaguen, Thibaud, et al.
Published: (2026)

Atom of Thoughts for Markov LLM Test-Time Scaling
by: Teng, Fengwei, et al.
Published: (2025)

Demystifying Manifold Constraints in LLM Pre-training
by: An, Kang, et al.
Published: (2026)

Dr. Post-Training: A Data Regularization Perspective on LLM Post-Training
by: Hu, Pingbang, et al.
Published: (2026)

Ensembling Tabular Foundation Models - A Diversity Ceiling And A Calibration Trap
by: Tanna, Aditya, et al.
Published: (2026)

UltraSketchLLM: Saliency-Driven Sketching for Ultra-Low Bit LLM Compression
by: Zou, Sunan, et al.
Published: (2025)

Reinforcement Learning with $ω$-Regular Objectives and Constraints
by: Wagner, Dominik, et al.
Published: (2025)

When Your LLM Reaches End-of-Life: A Framework for Confident Model Migration in Production Systems
by: Casey, Emma, et al.
Published: (2026)

Policy Split: Incentivizing Dual-Mode Exploration in LLM Reinforcement with Dual-Mode Entropy Regularization
by: Yao, Jiashu, et al.
Published: (2026)

Adaptive Layerwise Perturbation: Unifying Off-Policy Corrections for LLM RL
by: Ye, Chenlu, et al.
Published: (2026)

Unsolvability Ceiling in Multi-LLM Routing: An Empirical Study of Evaluation Artifacts
by: Garg, Saloni, et al.
Published: (2026)

Scaling-Aware Adapter for Structure-Grounded LLM Reasoning
by: Jing, Zihao, et al.
Published: (2026)

Astro: Activation-guided Structured Regularization for Outlier-Robust LLM Post-Training Quantization
by: Chen, Xi, et al.
Published: (2026)

Dynamic Intelligence Ceilings: Measuring Long-Horizon Limits of Planning and Creativity in Artificial Systems
by: Khanh, Truong Xuan, et al.
Published: (2026)

Online Scheduling for LLM Inference with KV Cache Constraints
by: Jaillet, Patrick, et al.
Published: (2025)

On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning
by: Zhang, Yifan, et al.
Published: (2025)

Breaking the Performance Ceiling in Reinforcement Learning requires Inference Strategies
by: Chalumeau, Felix, et al.
Published: (2025)

Inference-Time Computations for LLM Reasoning and Planning: A Benchmark and Insights
by: Parashar, Shubham, et al.
Published: (2025)

InsightBuild: LLM-Powered Causal Reasoning in Smart Building Systems
by: Neogi, Pinaki Prasad Guha, et al.
Published: (2025)

Towards Mitigating Excessive Forgetting in LLM Unlearning via Entanglement-Guidance with Proxy Constraint
by: Liu, Zhihao, et al.
Published: (2025)

LLM-Based Scientific Equation Discovery via Physics-Informed Token-Regularized Policy Optimization
by: Wang, Boxiao, et al.
Published: (2026)

SparseSwaps: Tractable LLM Pruning Mask Refinement at Scale
by: Zimmer, Max, et al.
Published: (2025)

Test-Time Scaling in Reasoning Models Is Not Effective for Knowledge-Intensive Tasks Yet
by: Zhao, James Xu, et al.
Published: (2025)

Controlled LLM Training on Spectral Sphere
by: Xie, Tian, et al.
Published: (2026)

Adaptive Regularization of Representation Rank as an Implicit Constraint of Bellman Equation
by: He, Qiang, et al.
Published: (2024)

Robust Yet Efficient Conformal Prediction Sets
by: Zargarbashi, Soroush H., et al.
Published: (2024)

The Depth Ceiling: On the Limits of Large Language Models in Discovering Latent Planning
by: Xu, Yi, et al.
Published: (2026)

Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification
by: Liang, Zhenwen, et al.
Published: (2024)