| Main Authors: | Yu, Tianyu, Zhang, Haoye, Li, Qiming, Xu, Qixin, Yao, Yuan, Chen, Da, Lu, Xiaoman, Cui, Ganqu, Dang, Yunkai, He, Taiwen, Feng, Xiaocheng, Song, Jun, Zheng, Bo, Liu, Zhiyuan, Chua, Tat-Seng, Sun, Maosong |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2405.17220 |
Similar Items
RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
by: Yu, Tianyu, et al.
Published: (2023)
RLPR: Extrapolating RLVR to General Domains without Verifiers
by: Yu, Tianyu, et al.
Published: (2025)
NExT-GPT: Any-to-Any Multimodal LLM
by: Wu, Shengqiong, et al.
Published: (2023)
Compose Your Aesthetics: Empowering Text-to-Image Models with the Principles of Art
by: Jin, Zhe, et al.
Published: (2025)
MiniCPM-V: A GPT-4V Level MLLM on Your Phone
by: Yao, Yuan, et al.
Published: (2024)
Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization
by: Chen, Yiyang, et al.
Published: (2022)
Understanding Long Videos via LLM-Powered Entity Relation Graphs
by: Chu, Meng, et al.
Published: (2025)
Universal Scene Graph Generation
by: Wu, Shengqiong, et al.
Published: (2025)
Learning to Ask Critical Questions for Assisting Product Search
by: Li, Zixuan, et al.
Published: (2024)
Offline RLAIF: Piloting VLM Feedback for RL via SFO
by: Beck, Jacob
Published: (2025)
UltraFeedback: Boosting Language Models with Scaled AI Feedback
by: Cui, Ganqu, et al.
Published: (2023)
LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images
by: Xu, Ruyi, et al.
Published: (2024)
RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
by: Lee, Harrison, et al.
Published: (2023)
Thinking with Blueprints: Assisting Vision-Language Models in Spatial Reasoning via Structured Object Representation
by: Ma, Weijian, et al.
Published: (2026)
Curriculum-RLAIF: Curriculum Alignment with Reinforcement Learning from AI Feedback
by: Lin, Jiaye, et al.
Published: (2025)
Towards Goal-oriented Intelligent Tutoring Systems in Online Education
by: Deng, Yang, et al.
Published: (2023)
Disentangling Masked Autoencoders for Unsupervised Domain Generalization
by: Zhang, An, et al.
Published: (2024)
Length Controlled Generation for Black-box LLMs
by: Gu, Yuxuan, et al.
Published: (2024)
RLAIF-SPA: Structured AI Feedback for Semantic-Prosodic Alignment in Speech Synthesis
by: Yang, Qing, et al.
Published: (2025)
ProtT3: Protein-to-Text Generation for Text-based Protein Understanding
by: Liu, Zhiyuan, et al.
Published: (2024)
Rethinking Tokenizer and Decoder in Masked Graph Modeling for Molecules
by: Liu, Zhiyuan, et al.
Published: (2023)
Can I Trust Your Answer? Visually Grounded Video Question Answering
by: Xiao, Junbin, et al.
Published: (2023)
Contrastive Pre-training for Deep Session Data Understanding
by: Li, Zixuan, et al.
Published: (2024)
Turing Patterns for Multimedia: Reaction-Diffusion Multi-Modal Fusion for Language-Guided Video Moment Retrieval
by: Fang, Xiang, et al.
Published: (2026)
Extending Visual Dynamics for Video-to-Music Generation
by: Liu, Xiaohao, et al.
Published: (2025)
Enhancing Spectral Graph Neural Networks with LLM-Predicted Homophily
by: Lu, Kangkang, et al.
Published: (2025)
3D-TAFS: A Training-free Framework for 3D Affordance Segmentation
by: Chu, Meng, et al.
Published: (2024)
Zero-1-to-A: Zero-Shot One Image to Animatable Head Avatars Using Video Diffusion
by: Zhou, Zhenglin, et al.
Published: (2025)
XNLP: An Interactive Demonstration System for Universal Structured NLP
by: Fei, Hao, et al.
Published: (2023)
A Survey on Neural Question Generation: Methods, Applications, and Prospects
by: Guo, Shasha, et al.
Published: (2024)
Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V
by: Zhi, Peiyuan, et al.
Published: (2024)
NExT-Search: Rebuilding User Feedback Ecosystem for Generative AI Search
by: Dai, Sunhao, et al.
Published: (2025)
Long-Term TalkingFace Generation via Motion-Prior Conditional Diffusion Model
by: Shen, Fei, et al.
Published: (2025)
An LMM for Efficient Video Understanding via Reinforced Compression of Video Cubes
by: Qi, Ji, et al.
Published: (2025)
Principled Multimodal Representation Learning
by: Liu, Xiaohao, et al.
Published: (2025)
On Generative Agents in Recommendation
by: Zhang, An, et al.
Published: (2023)
LLM2Rec: Large Language Models Are Powerful Embedding Models for Sequential Recommendation
by: He, Yingzhi, et al.
Published: (2025)
Continual Multimodal Contrastive Learning
by: Liu, Xiaohao, et al.
Published: (2025)
Aligning Large Language Models for Faithful Integrity Against Opposing Argument
by: Zhao, Yong, et al.
Published: (2025)
Inverting the wedge map and Gauss composition
by: Chua, Kok Seng
Published: (2024)