| Main Authors: | Seraj, Raihan; Meng, Lili; Sylvain, Tristan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.08759 |
Similar Items
AutoCast++: Enhancing World Event Prediction with Zero-shot Ranking-based Context Retrieval
by: Yan, Qi, et al.
Published: (2023)
Generalizing Multi-Step Inverse Models for Representation Learning to Finite-Memory POMDPs
by: Wu, Lili, et al.
Published: (2024)
Robust Reinforcement Learning Objectives for Sequential Recommender Systems
by: Mozifian, Melissa, et al.
Published: (2023)
Heterogeneous Decentralized Diffusion Models
by: Jiang, Zhiying, et al.
Published: (2026)
Spectral bandits
by: Kocák, Tomáš, et al.
Published: (2026)
Linear bandits with polylogarithmic minimax regret
by: Lumbreras, Josep, et al.
Published: (2024)
Adversarial bandit optimization for approximately linear functions
by: Cheng, Zhuoyu, et al.
Published: (2025)
PcLast: Discovering Plannable Continuous Latent States
by: Koul, Anurag, et al.
Published: (2023)
Reinforcement learning with combinatorial actions for coupled restless bandits
by: Xu, Lily, et al.
Published: (2025)
Rejecting Hallucinated State Targets during Planning
by: Zhao, Mingde, et al.
Published: (2024)
Towards In-Vehicle Multi-Task Facial Attribute Recognition: Investigating Synthetic Data and Vision Foundation Models
by: Seraj, Esmaeil, et al.
Published: (2024)
Can MLLMs generate human-like feedback in grading multimodal short answers?
by: Sil, Pritam, et al.
Published: (2024)
Functional multi-armed bandit and the best function identification problems
by: Dorn, Yuriy, et al.
Published: (2025)
Maximizing the efficiency of human feedback in AI alignment: a comparative analysis
by: Chouliaras, Andreas, et al.
Published: (2025)
Learning to summarize user information for personalized reinforcement learning from human feedback
by: Nam, Hyunji, et al.
Published: (2025)
Improving Reliable Navigation under Uncertainty via Predictions Informed by Non-Local Information
by: Arnob, Raihan Islam, et al.
Published: (2023)
Softmax gradient policy for variance minimization and risk-averse multi armed bandits
by: Turinici, Gabriel
Published: (2026)
Realtime Dynamic Gaze Target Tracking and Depth-Level Estimation
by: Seraj, Esmaeil, et al.
Published: (2024)
PsyAgent: Constructing Human-like Agents Based on Psychological Modeling and Contextual Interaction
by: Meng, Zibin, et al.
Published: (2026)
False Data Injection Attack Detection in Edge-based Smart Metering Networks with Federated Learning
by: Uddin, Md Raihan, et al.
Published: (2024)
Optimizing Alignment with Less: Leveraging Data Augmentation for Personalized Evaluation
by: Seraj, Javad, et al.
Published: (2024)
MBExplainer: Multilevel bandit-based explanations for downstream models with augmented graph embeddings
by: Golgoon, Ashkan, et al.
Published: (2024)
Use Your INSTINCT: INSTruction optimization for LLMs usIng Neural bandits Coupled with Transformers
by: Lin, Xiaoqiang, et al.
Published: (2023)
Eterna is Solved
by: Cazenave, Tristan
Published: (2025)
Monte Carlo Search Algorithms Discovering Monte Carlo Tree Search Exploration Terms
by: Cazenave, Tristan
Published: (2024)
Generalized Nested Rollout Policy Adaptation with Limited Repetitions
by: Cazenave, Tristan
Published: (2024)
Learning a Prior for Monte Carlo Search by Replaying Solutions to Combinatorial Problems
by: Cazenave, Tristan
Published: (2024)
Silencing the Guardrails: Inference-Time Jailbreaking via Dynamic Contextual Representation Ablation
by: Xing, Wenpeng, et al.
Published: (2026)
Sample-based Dynamic Hierarchical Transformer with Layer and Head Flexibility via Contextual Bandit
by: Meng, Fanfei, et al.
Published: (2023)
Prior-informed optimization of treatment recommendation via bandit algorithms trained on large language model-processed historical records
by: Nessari, Saman, et al.
Published: (2025)
Interpretable experiential learning based on state history and global feedback
by: Kolonin, Anton
Published: (2026)
BeeRNA: tertiary structure-based RNA inverse folding using Artificial Bee Colony
by: Mlaweh, Mehyar, et al.
Published: (2025)
Object Search in Partially-Known Environments via LLM-informed Model-based Planning and Prompt Selection
by: Paudel, Abhishek, et al.
Published: (2026)
AIoT-based Continuous, Contextualized, and Explainable Driving Assessment for Older Adults
by: Liu, Yimeng, et al.
Published: (2026)
Safety through feedback in Constrained RL
by: Chirra, Shashank Reddy, et al.
Published: (2024)
Enhancing Satellite Object Localization with Dilated Convolutions and Attention-aided Spatial Pooling
by: Mostafa, Seraj Al Mahmud, et al.
Published: (2025)
QD-VMR: Query Debiasing with Contextual Understanding Enhancement for Video Moment Retrieval
by: Gao, Chenghua, et al.
Published: (2024)
AI-Enabled grading with near-domain data for scaling feedback with human-level accuracy
by: Agarwal, Shyam, et al.
Published: (2025)
Monte Carlo Permutation Search
by: Cazenave, Tristan
Published: (2025)
Reinforcement learning for question answering in programming domain using public community scoring as a human feedback
by: Gorbatovski, Alexey, et al.
Published: (2024)