:: Library Catalog

Bibliographic Details
Main Authors:	Xia, Yu; Kong, Fang; Yu, Tong; Guo, Liya; Rossi, Ryan A.; Kim, Sungchul; Li, Shuai
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2403.07213

Similar Items

Hallucination Diversity-Aware Active Learning for Text Summarization
by: Xia, Yu, et al.
Published: (2024)

Online Multi-LLM Selection via Contextual Bandits under Unstructured Context Evolution
by: Poon, Manhin, et al.
Published: (2025)

Towards Improving Long-Tail Entity Predictions in Temporal Knowledge Graphs through Global Similarity and Weighted Sampling
by: Mirtaheri, Mehrnoosh, et al.
Published: (2025)

Improved Bandits in Many-to-one Matching Markets with Incentive Compatibility
by: Kong, Fang, et al.
Published: (2024)

A Multi-LLM Debiasing Framework
by: Owens, Deonna M., et al.
Published: (2024)

Finite-Time Regret Analysis of Retry-Aware Bandits
by: Tong, Bingkui, et al.
Published: (2026)

MAGNET: Autonomous Expert Model Generation via Decentralized Autoresearch and BitNet Training
by: Kim, Yongwan, et al.
Published: (2026)

Mitigating Visual Knowledge Forgetting in MLLM Instruction-tuning via Modality-decoupled Gradient Descent
by: Wu, Junda, et al.
Published: (2025)

Federated Large Language Models: Current Progress and Future Directions
by: Yao, Yuhang, et al.
Published: (2024)

Self-Debiasing Large Language Models: Zero-Shot Recognition and Reduction of Stereotypes
by: Gallegos, Isabel O., et al.
Published: (2024)

Bias and Fairness in Large Language Models: A Survey
by: Gallegos, Isabel O., et al.
Published: (2023)

PAK-UCB Contextual Bandit: An Online Learning Approach to Prompt-Aware Selection of Generative Models and LLMs
by: Hu, Xiaoyan, et al.
Published: (2024)

Visual Prompting in Multimodal Large Language Models: A Survey
by: Wu, Junda, et al.
Published: (2024)

Online Clustering of Dueling Bandits
by: Wang, Zhiyong, et al.
Published: (2025)

Self-Play Enhancement via Advantage-Weighted Refinement in Online Federated LLM Fine-Tuning with Real-Time Feedback
by: Lee, Seohyun, et al.
Published: (2026)

From Selection to Generation: A Survey of LLM-based Active Learning
by: Xia, Yu, et al.
Published: (2025)

A Multi-Armed Bandit Approach to Online Selection and Evaluation of Generative Models
by: Hu, Xiaoyan, et al.
Published: (2024)

Knowledge-Aware Query Expansion with Large Language Models for Textual and Relational Retrieval
by: Xia, Yu, et al.
Published: (2024)

Direction-Aware Offline-to-Online Learning in Linear Contextual Bandits
by: Han, Zean, et al.
Published: (2026)

Multi-Play Combinatorial Semi-Bandit Problem
by: Nakamura, Shintaro, et al.
Published: (2025)

Efficient Model Selection for Time Series Forecasting via LLMs
by: Wei, Wang, et al.
Published: (2025)

Optimizing Data Delivery: Insights from User Preferences on Visuals, Tables, and Text
by: Luera, Reuben, et al.
Published: (2024)

Training Robust Graph Neural Networks by Modeling Noise Dependencies
by: In, Yeonjun, et al.
Published: (2025)

Causal Discovery in Semi-Stationary Time Series
by: Gao, Shanyun, et al.
Published: (2024)

Multi-Agent Collaborative Filtering: Orchestrating Users and Items for Agentic Recommendations
by: Xia, Yu, et al.
Published: (2025)

Hybrid Combinatorial Multi-armed Bandits with Probabilistically Triggered Arms
by: Zhou, Kongchang, et al.
Published: (2025)

Play Style Identification Using Low-Level Representations of Play Traces in MicroRTS
by: Xia, Ruizhe Yu, et al.
Published: (2025)

Causal Discovery-Driven Change Point Detection in Time Series
by: Gao, Shanyun, et al.
Published: (2024)

KernelBand: Steering LLM-based Kernel Optimization via Hardware-Aware Multi-Armed Bandits
by: Ran, Dezhi, et al.
Published: (2025)

Calibration-Gated LLM Pseudo-Observations for Online Contextual Bandits
by: Pershin, Maksim, et al.
Published: (2026)

Offline-to-Online Reinforcement Learning with Classifier-Free Diffusion Generation
by: Huang, Xiao, et al.
Published: (2025)

TriPlay-RL: Tri-Role Self-Play Reinforcement Learning for LLM Safety Alignment
by: Tan, Zhewen, et al.
Published: (2026)

A Provably Convergent Plug-and-Play Framework for Stochastic Bilevel Optimization
by: Chu, Tianshu, et al.
Published: (2025)

Adversarial Bandit over Bandits: Hierarchical Bandits for Online Configuration Management
by: Avin, Chen, et al.
Published: (2025)

Cost-Effective Online Multi-LLM Selection with Versatile Reward Models
by: Dai, Xiangxiang, et al.
Published: (2024)

A Federated Online Restless Bandit Framework for Cooperative Resource Allocation
by: Tong, Jingwen, et al.
Published: (2024)

Prompt2Fingerprint: Plug-and-Play LLM Fingerprinting via Text-to-Weight Generation
by: Chen, Sixu, et al.
Published: (2026)

Provably Convergent Primal-Dual DPO for Constrained LLM Alignment
by: Du, Yihan, et al.
Published: (2025)

Not-a-Bandit: Provably No-Regret Drafter Selection in Speculative Decoding for LLMs
by: Liu, Hongyi, et al.
Published: (2025)

Global Rewards in Restless Multi-Armed Bandits
by: Raman, Naveen, et al.
Published: (2024)