Library Catalog

Bibliographic Details
Main Authors:	An, Zhiyu, Du, Wan
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2602.03003

Similar Items

Representational Homomorphism Predicts and Improves Compositional Generalization In Transformer Language Model
by: An, Zhiyu, et al.
Published: (2026)

Beyond Noisy-TVs: Noise-Robust Exploration Via Learning Progress Monitoring
by: Hou, Zhibo, et al.
Published: (2025)

Go Beyond Black-box Policies: Rethinking the Design of Learning Agent for Interpretable and Verifiable HVAC Control
by: An, Zhiyu, et al.
Published: (2024)

MoralReason: Generalizable Moral Decision Alignment For LLM Agents Using Reasoning-Level Reinforcement Learning
by: An, Zhiyu, et al.
Published: (2025)

The Autonomy-Alignment Problem in Open-Ended Learning Robots: Formalising the Purpose Framework
by: Baldassarre, Gianluca, et al.
Published: (2024)

DIML: Differentiable Inverse Mechanism Learning from Behaviors of Multi-Agent Learning Trajectories
by: An, Zhiyu, et al.
Published: (2026)

The Sign Estimator: LLM Alignment in the Face of Choice Heterogeneity
by: Aouad, Ali, et al.
Published: (2025)

Representative Social Choice: From Learning Theory to AI Alignment
by: Qiu, Tianyi
Published: (2024)

The Alignment Problem from a Deep Learning Perspective
by: Ngo, Richard, et al.
Published: (2022)

Reinforcement Learning with Pairwise Preferences in Long-Term Decision Problems
by: Carr, Jonathan Colaço, et al.
Published: (2026)

Solver-Free Decision-Focused Learning for Linear Optimization Problems
by: Berden, Senne, et al.
Published: (2025)

Beyond Pairwise: Empowering LLM Alignment With Ranked Choice Modeling
by: Tang, Yuxuan, et al.
Published: (2025)

AMARIS: A Memory-Augmented Rubric Improvement System for Rubric-Based Reinforcement Learning
by: Wu, Peilin, et al.
Published: (2026)

Minimizing Surrogate Losses for Decision-Focused Learning using Differentiable Optimization
by: Mandi, Jayanta, et al.
Published: (2025)

FOUNDER: Grounding Foundation Models in World Models for Open-Ended Embodied Decision Making
by: Wang, Yucen, et al.
Published: (2025)

An Unsupervised Learning Framework Combined with Heuristics for the Maximum Minimal Cut Problem
by: Liu, Huaiyuan, et al.
Published: (2024)

Adversarial Preference Learning for Robust LLM Alignment
by: Wang, Yuanfu, et al.
Published: (2025)

Structure in Deep Reinforcement Learning: A Survey and Open Problems
by: Mohan, Aditya, et al.
Published: (2023)

Approximation-Free Differentiable Oblique Decision Trees
by: Panda, Subrat Prasad, et al.
Published: (2026)

Generative Social Choice
by: Fish, Sara, et al.
Published: (2023)

AllMatch: Exploiting All Unlabeled Data for Semi-Supervised Learning
by: Wu, Zhiyu, et al.
Published: (2024)

Can Differentiable Decision Trees Enable Interpretable Reward Learning from Human Feedback?
by: Kalra, Akansha, et al.
Published: (2023)

Demystifying the Physics of Deep Reinforcement Learning-Based Autonomous Vehicle Decision-Making
by: Wan, Hanxi, et al.
Published: (2024)

Almost Sure Convergence of Differential Temporal Difference Learning for Average Reward Markov Decision Processes
by: Blaser, Ethan, et al.
Published: (2026)

Theoretical Tensions in RLHF: Reconciling Empirical Success with Inconsistencies in Social Choice Theory
by: Xiao, Jiancong, et al.
Published: (2025)

Federated Graph Semantic and Structural Learning
by: Huang, Wenke, et al.
Published: (2024)

TRACE: Distilling Where It Matters via Token-Routed Self On-Policy Alignment
by: Wang, Jiaxuan, et al.
Published: (2026)

Fast TRAC: A Parameter-Free Optimizer for Lifelong Reinforcement Learning
by: Muppidi, Aneesh, et al.
Published: (2024)

The Importance of Architecture Choice in Deep Learning for Climate Applications
by: Dräger, Simon, et al.
Published: (2024)

GEAR: A General Evaluation Framework for Abductive Reasoning
by: He, Kaiyu, et al.
Published: (2025)

HiPRAG: Hierarchical Process Rewards for Efficient Agentic Retrieval Augmented Generation
by: Wu, Peilin, et al.
Published: (2025)

The Ungrounded Alignment Problem
by: Pickett, Marc, et al.
Published: (2024)

Pattern Recognition or Medical Knowledge? The Problem with Multiple-Choice Questions in Medicine
by: Griot, Maxime, et al.
Published: (2024)

Generative Social Choice: The Next Generation
by: Boehmer, Niclas, et al.
Published: (2025)

DiffTORI: Differentiable Trajectory Optimization for Deep Reinforcement and Imitation Learning
by: Wan, Weikang, et al.
Published: (2024)

Adaptive Sparse Allocation with Mutual Choice & Feature Choice Sparse Autoencoders
by: Ayonrinde, Kola
Published: (2024)

Feature Distillation is the Better Choice for Model-Heterogeneous Federated Learning
by: Li, Yichen, et al.
Published: (2025)

Manifold Approximation leads to Robust Kernel Alignment
by: Islam, Mohammad Tariqul, et al.
Published: (2025)

SocialJax: An Evaluation Suite for Multi-agent Reinforcement Learning in Sequential Social Dilemmas
by: Guo, Zihao, et al.
Published: (2025)

Cooperative Open-ended Learning Framework for Zero-shot Coordination
by: Li, Yang, et al.
Published: (2023)