Library Catalog

Bibliographic Details
Main Authors:	Seraj, Raihan; Meng, Lili; Sylvain, Tristan
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2502.08759

Similar Items

AutoCast++: Enhancing World Event Prediction with Zero-shot Ranking-based Context Retrieval
by: Yan, Qi, et al.
Published: (2023)

Generalizing Multi-Step Inverse Models for Representation Learning to Finite-Memory POMDPs
by: Wu, Lili, et al.
Published: (2024)

Robust Reinforcement Learning Objectives for Sequential Recommender Systems
by: Mozifian, Melissa, et al.
Published: (2023)

Heterogeneous Decentralized Diffusion Models
by: Jiang, Zhiying, et al.
Published: (2026)

Spectral bandits
by: Kocák, Tomáš, et al.
Published: (2026)

Linear bandits with polylogarithmic minimax regret
by: Lumbreras, Josep, et al.
Published: (2024)

Adversarial bandit optimization for approximately linear functions
by: Cheng, Zhuoyu, et al.
Published: (2025)

PcLast: Discovering Plannable Continuous Latent States
by: Koul, Anurag, et al.
Published: (2023)

Reinforcement learning with combinatorial actions for coupled restless bandits
by: Xu, Lily, et al.
Published: (2025)

Rejecting Hallucinated State Targets during Planning
by: Zhao, Mingde, et al.
Published: (2024)

Towards In-Vehicle Multi-Task Facial Attribute Recognition: Investigating Synthetic Data and Vision Foundation Models
by: Seraj, Esmaeil, et al.
Published: (2024)

Can MLLMs generate human-like feedback in grading multimodal short answers?
by: Sil, Pritam, et al.
Published: (2024)

Functional multi-armed bandit and the best function identification problems
by: Dorn, Yuriy, et al.
Published: (2025)

Maximizing the efficiency of human feedback in AI alignment: a comparative analysis
by: Chouliaras, Andreas, et al.
Published: (2025)

Learning to summarize user information for personalized reinforcement learning from human feedback
by: Nam, Hyunji, et al.
Published: (2025)

Improving Reliable Navigation under Uncertainty via Predictions Informed by Non-Local Information
by: Arnob, Raihan Islam, et al.
Published: (2023)

Softmax gradient policy for variance minimization and risk-averse multi armed bandits
by: Turinici, Gabriel
Published: (2026)

Realtime Dynamic Gaze Target Tracking and Depth-Level Estimation
by: Seraj, Esmaeil, et al.
Published: (2024)

PsyAgent: Constructing Human-like Agents Based on Psychological Modeling and Contextual Interaction
by: Meng, Zibin, et al.
Published: (2026)

False Data Injection Attack Detection in Edge-based Smart Metering Networks with Federated Learning
by: Uddin, Md Raihan, et al.
Published: (2024)

Optimizing Alignment with Less: Leveraging Data Augmentation for Personalized Evaluation
by: Seraj, Javad, et al.
Published: (2024)

MBExplainer: Multilevel bandit-based explanations for downstream models with augmented graph embeddings
by: Golgoon, Ashkan, et al.
Published: (2024)

Use Your INSTINCT: INSTruction optimization for LLMs usIng Neural bandits Coupled with Transformers
by: Lin, Xiaoqiang, et al.
Published: (2023)

Eterna is Solved
by: Cazenave, Tristan
Published: (2025)

Monte Carlo Search Algorithms Discovering Monte Carlo Tree Search Exploration Terms
by: Cazenave, Tristan
Published: (2024)

Generalized Nested Rollout Policy Adaptation with Limited Repetitions
by: Cazenave, Tristan
Published: (2024)

Learning a Prior for Monte Carlo Search by Replaying Solutions to Combinatorial Problems
by: Cazenave, Tristan
Published: (2024)

Silencing the Guardrails: Inference-Time Jailbreaking via Dynamic Contextual Representation Ablation
by: Xing, Wenpeng, et al.
Published: (2026)

Sample-based Dynamic Hierarchical Transformer with Layer and Head Flexibility via Contextual Bandit
by: Meng, Fanfei, et al.
Published: (2023)

Prior-informed optimization of treatment recommendation via bandit algorithms trained on large language model-processed historical records
by: Nessari, Saman, et al.
Published: (2025)

Interpretable experiential learning based on state history and global feedback
by: Kolonin, Anton
Published: (2026)

BeeRNA: tertiary structure-based RNA inverse folding using Artificial Bee Colony
by: Mlaweh, Mehyar, et al.
Published: (2025)

Object Search in Partially-Known Environments via LLM-informed Model-based Planning and Prompt Selection
by: Paudel, Abhishek, et al.
Published: (2026)

AIoT-based Continuous, Contextualized, and Explainable Driving Assessment for Older Adults
by: Liu, Yimeng, et al.
Published: (2026)

Safety through feedback in Constrained RL
by: Chirra, Shashank Reddy, et al.
Published: (2024)

Enhancing Satellite Object Localization with Dilated Convolutions and Attention-aided Spatial Pooling
by: Mostafa, Seraj Al Mahmud, et al.
Published: (2025)

QD-VMR: Query Debiasing with Contextual Understanding Enhancement for Video Moment Retrieval
by: Gao, Chenghua, et al.
Published: (2024)

AI-Enabled grading with near-domain data for scaling feedback with human-level accuracy
by: Agarwal, Shyam, et al.
Published: (2025)

Monte Carlo Permutation Search
by: Cazenave, Tristan
Published: (2025)

Reinforcement learning for question answering in programming domain using public community scoring as a human feedback
by: Gorbatovski, Alexey, et al.
Published: (2024)