| Main Authors: | Matrenok, Simon, Moalla, Skander, Gulcehre, Caglar |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2507.08068 |
Similar Items
No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO
by: Moalla, Skander, et al.
Published: (2024)
Building on Efficient Foundations: Effectively Training LLMs with Structured Feedforward Layers
by: Wei, Xiuying, et al.
Published: (2024)
Investigating Low-Rank Training in Transformer Language Models: Efficiency and Scaling Analysis
by: Wei, Xiuying, et al.
Published: (2024)
Partition Generative Modeling: Masked Modeling Without Masks
by: Deschenaux, Justin, et al.
Published: (2025)
In Search for Architectures and Loss Functions in Multi-Objective Reinforcement Learning
by: Terekhov, Mikhail, et al.
Published: (2024)
Augmenting Attention with Exponentially Decaying Memory Improves Query-Aware KV Sparsity
by: Wei, Xiuying, et al.
Published: (2026)
BlockGen: Flexible Blockwise Sequence Modeling with Hybrid Samplers
by: Deschenaux, Justin, et al.
Published: (2026)
RAT+: Train Dense, Infer Sparse -- Recurrence Augmented Attention for Dilated Inference
by: Wei, Xiuying, et al.
Published: (2026)
Beyond Autoregression: Fast LLMs via Self-Distillation Through Time
by: Deschenaux, Justin, et al.
Published: (2024)
Python Machine Learning Research Template
by: Moalla, Skander
Published: (2025)
Regret-Optimized Portfolio Enhancement through Deep Reinforcement Learning and Future Looking Rewards
by: Karzanov, Daniil, et al.
Published: (2025)
The Role of Deep Learning Regularizations on Actors in Offline RL
by: Tarasov, Denis, et al.
Published: (2024)
The Diffusion Duality, Chapter II: $Ψ$-Samplers
by: Deschenaux, Justin, et al.
Published: (2026)
Quantile Regression for Distributional Reward Models in RLHF
by: Dorka, Nicolai
Published: (2024)
Loopholing Discrete Diffusion: Deterministic Bypass of the Sampling Wall
by: Jo, Mingyu, et al.
Published: (2025)
HiPPO-Prophecy: State-Space Models can Provably Learn Dynamical Systems in Context
by: Joseph, Federico Arangath, et al.
Published: (2024)
Value-Free Policy Optimization via Reward Partitioning
by: Faye, Bilal, et al.
Published: (2025)
Control Tax: The Price of Keeping AI in Check
by: Terekhov, Mikhail, et al.
Published: (2025)
Simple Hierarchical Planning with Diffusion
by: Chen, Chang, et al.
Published: (2024)
Universality of Linear Recurrences Followed by Non-linear Projections: Finite-Width Guarantees and Benefits of Complex Eigenvalues
by: Orvieto, Antonio, et al.
Published: (2023)
Distributional Off-Policy Evaluation with Deep Quantile Process Regression
by: Kuang, Qi, et al.
Published: (2026)
PlanDQ: Hierarchical Plan Orchestration via D-Conductor and Q-Performer
by: Chen, Chang, et al.
Published: (2024)
Boosting CVaR Policy Optimization with Quantile Gradients
by: Luo, Yudong, et al.
Published: (2026)
Vector Quantile Regression on Manifolds
by: Pegoraro, Marco, et al.
Published: (2023)
An Exact Pointwise Characterization for Total Variation Denoising in Quantile Regression
by: Ghoshal, Deep, et al.
Published: (2026)
Self-Recognition in Language Models
by: Davidson, Tim R., et al.
Published: (2024)
Multi-Fidelity Quantile Regression
by: Liu, Yixiang, et al.
Published: (2026)
Quantile Q-Learning: Revisiting Offline Extreme Q-Learning with Quantile Regression
by: Gao, Xinming, et al.
Published: (2025)
An Efficient Multi Quantile Regression Network with Ad Hoc Prevention of Quantile Crossing
by: Decke, Jens, et al.
Published: (2024)
Symbolic Quantile Regression for the Interpretable Prediction of Conditional Quantiles
by: Hoekstra, Cas Oude, et al.
Published: (2025)
Fleet of Agents: Coordinated Problem Solving with Large Language Models
by: Klein, Lars, et al.
Published: (2024)
Horseshoe Prior Bayesian Quantile Regression
by: Kohns, David, et al.
Published: (2020)
On the Pointwise Behavior of Recursive Partitioning and Its Implications for Heterogeneous Causal Effect Estimation
by: Cattaneo, Matias D., et al.
Published: (2022)
On Learning the Tail Quantiles of Driving Behavior Distributions via Quantile Regression and Flows
by: Tee, Jia Yu, et al.
Published: (2023)
Adaptive Attacks on Trusted Monitors Subvert AI Control Protocols
by: Terekhov, Mikhail, et al.
Published: (2025)
Online Distributionally Robust LLM Alignment via Regression to Relative Reward
by: Sahu, Sharan, et al.
Published: (2025)
Interpretable Quantile Regression by Optimal Decision Trees
by: Lemaire, Valentin, et al.
Published: (2026)
ReModels: Quantile Regression Averaging models
by: Zakrzewski, Grzegorz, et al.
Published: (2024)
IRPM: Intergroup Relative Preference Modeling for Pointwise Generative Reward Models
by: Song, Haonan, et al.
Published: (2026)
Integrating Uncertainty Awareness into Conformalized Quantile Regression
by: Rossellini, Raphael, et al.
Published: (2023)