:: Library Catalog

Bibliographic Details
Main Authors:	Yu, Tianyu; Zhang, Haoye; Li, Qiming; Xu, Qixin; Yao, Yuan; Chen, Da; Lu, Xiaoman; Cui, Ganqu; Dang, Yunkai; He, Taiwen; Feng, Xiaocheng; Song, Jun; Zheng, Bo; Liu, Zhiyuan; Chua, Tat-Seng; Sun, Maosong
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2405.17220

Similar Items

RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
by: Yu, Tianyu, et al.
Published: (2023)

RLPR: Extrapolating RLVR to General Domains without Verifiers
by: Yu, Tianyu, et al.
Published: (2025)

NExT-GPT: Any-to-Any Multimodal LLM
by: Wu, Shengqiong, et al.
Published: (2023)

Compose Your Aesthetics: Empowering Text-to-Image Models with the Principles of Art
by: Jin, Zhe, et al.
Published: (2025)

MiniCPM-V: A GPT-4V Level MLLM on Your Phone
by: Yao, Yuan, et al.
Published: (2024)

Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization
by: Chen, Yiyang, et al.
Published: (2022)

Understanding Long Videos via LLM-Powered Entity Relation Graphs
by: Chu, Meng, et al.
Published: (2025)

Universal Scene Graph Generation
by: Wu, Shengqiong, et al.
Published: (2025)

Learning to Ask Critical Questions for Assisting Product Search
by: Li, Zixuan, et al.
Published: (2024)

Offline RLAIF: Piloting VLM Feedback for RL via SFO
by: Beck, Jacob
Published: (2025)

UltraFeedback: Boosting Language Models with Scaled AI Feedback
by: Cui, Ganqu, et al.
Published: (2023)

LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images
by: Xu, Ruyi, et al.
Published: (2024)

RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
by: Lee, Harrison, et al.
Published: (2023)

Thinking with Blueprints: Assisting Vision-Language Models in Spatial Reasoning via Structured Object Representation
by: Ma, Weijian, et al.
Published: (2026)

Curriculum-RLAIF: Curriculum Alignment with Reinforcement Learning from AI Feedback
by: Lin, Jiaye, et al.
Published: (2025)

Towards Goal-oriented Intelligent Tutoring Systems in Online Education
by: Deng, Yang, et al.
Published: (2023)

Disentangling Masked Autoencoders for Unsupervised Domain Generalization
by: Zhang, An, et al.
Published: (2024)

Length Controlled Generation for Black-box LLMs
by: Gu, Yuxuan, et al.
Published: (2024)

RLAIF-SPA: Structured AI Feedback for Semantic-Prosodic Alignment in Speech Synthesis
by: Yang, Qing, et al.
Published: (2025)

ProtT3: Protein-to-Text Generation for Text-based Protein Understanding
by: Liu, Zhiyuan, et al.
Published: (2024)

Rethinking Tokenizer and Decoder in Masked Graph Modeling for Molecules
by: Liu, Zhiyuan, et al.
Published: (2023)

Can I Trust Your Answer? Visually Grounded Video Question Answering
by: Xiao, Junbin, et al.
Published: (2023)

Contrastive Pre-training for Deep Session Data Understanding
by: Li, Zixuan, et al.
Published: (2024)

Turing Patterns for Multimedia: Reaction-Diffusion Multi-Modal Fusion for Language-Guided Video Moment Retrieval
by: Fang, Xiang, et al.
Published: (2026)

Extending Visual Dynamics for Video-to-Music Generation
by: Liu, Xiaohao, et al.
Published: (2025)

Enhancing Spectral Graph Neural Networks with LLM-Predicted Homophily
by: Lu, Kangkang, et al.
Published: (2025)

3D-TAFS: A Training-free Framework for 3D Affordance Segmentation
by: Chu, Meng, et al.
Published: (2024)

Zero-1-to-A: Zero-Shot One Image to Animatable Head Avatars Using Video Diffusion
by: Zhou, Zhenglin, et al.
Published: (2025)

XNLP: An Interactive Demonstration System for Universal Structured NLP
by: Fei, Hao, et al.
Published: (2023)

A Survey on Neural Question Generation: Methods, Applications, and Prospects
by: Guo, Shasha, et al.
Published: (2024)

Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V
by: Zhi, Peiyuan, et al.
Published: (2024)

NExT-Search: Rebuilding User Feedback Ecosystem for Generative AI Search
by: Dai, Sunhao, et al.
Published: (2025)

Long-Term TalkingFace Generation via Motion-Prior Conditional Diffusion Model
by: Shen, Fei, et al.
Published: (2025)

An LMM for Efficient Video Understanding via Reinforced Compression of Video Cubes
by: Qi, Ji, et al.
Published: (2025)

Principled Multimodal Representation Learning
by: Liu, Xiaohao, et al.
Published: (2025)

On Generative Agents in Recommendation
by: Zhang, An, et al.
Published: (2023)

LLM2Rec: Large Language Models Are Powerful Embedding Models for Sequential Recommendation
by: He, Yingzhi, et al.
Published: (2025)

Continual Multimodal Contrastive Learning
by: Liu, Xiaohao, et al.
Published: (2025)

Aligning Large Language Models for Faithful Integrity Against Opposing Argument
by: Zhao, Yong, et al.
Published: (2025)

Inverting the wedge map and Gauss composition
by: Chua, Kok Seng
Published: (2024)