| Main Authors: | Miao, Yanting, Sun, Yutao, Wang, Dexin, Zhou, Mengyu, Poupart, Pascal, Lv, Lei, Zhao, Qi, Wang, Li, Li, Hao, Jiang, Xiaoxi, Jiang, Guanjun |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.12374 |
Similar Items
Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills
by: Ni, Jingwei, et al.
Published: (2026)
Bringing Stability to Diffusion: Decomposing and Reducing Variance of Training Masked Diffusion Models
by: Jia, Mengni, et al.
Published: (2025)
Image-POSER: Reflective RL for Multi-Expert Image Generation and Editing
by: Mohebbi, Hossein, et al.
Published: (2025)
Subject-driven Text-to-Image Generation via Preference-based Reinforcement Learning
by: Miao, Yanting, et al.
Published: (2024)
Grounding the Score: Explicit Visual Premise Verification for Reliable Vision-Language Process Reward Models
by: Wang, Junxin, et al.
Published: (2026)
A Minimalist Method for Fine-tuning Text-to-Image Diffusion Models
by: Miao, Yanting, et al.
Published: (2025)
Rationale Matters: Learning Transferable Rubrics via Proxy-Guided Critique for VLM Reward Models
by: Qiu, Weijie, et al.
Published: (2026)
MARCH: Multi-Agent Reinforced Self-Check for LLM Hallucination
by: Li, Zhuo, et al.
Published: (2026)
Be Your Own Red Teamer: Safety Alignment via Self-Play and Reflective Experience Replay
by: Wang, Hao, et al.
Published: (2026)
ATP-Bench: Towards Agentic Tool Planning for MLLM Interleaved Generation
by: Liu, Yinuo, et al.
Published: (2026)
Open Rubric System: Scaling Reinforcement Learning with Pairwise Adaptive Rubric
by: Jia, Ruipeng, et al.
Published: (2026)
Why Online Reinforcement Learning is Causal
by: Schulte, Oliver, et al.
Published: (2024)
GAP-MLLM: Geometry-Aligned Pre-training for Activating 3D Spatial Perception in Multimodal Large Language Models
by: Zhang, Jiaxin, et al.
Published: (2026)
Auxetic‐Assisted Decoupling Strategy for High‐Sensitivity Multimodal Sensing of Temperature and Strain
by: Hao Yin, et al.
Published: (2025)
Improving Multimodal Emotion Recognition by Leveraging Acoustic Adaptation and Visual Alignment
by: Zhao, Zhixian, et al.
Published: (2024)
HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models
by: Zhang, Wenqiao, et al.
Published: (2024)
Quark Medical Alignment: A Holistic Multi-Dimensional Alignment and Collaborative Optimization Paradigm
by: Xu, Tianxiang, et al.
Published: (2026)
OMGM: Orchestrate Multiple Granularities and Modalities for Efficient Multimodal Retrieval
by: Yang, Wei, et al.
Published: (2025)
Label Alignment Regularization for Distribution Shift
by: Imani, Ehsan, et al.
Published: (2022)
Annotation-Free Visual Reasoning for High-Resolution Large Multimodal Models via Reinforcement Learning
by: Yang, Jiacheng, et al.
Published: (2026)
TimelineReasoner: Advancing Timeline Summarization with Large Reasoning Models
by: Zhang, Liancheng, et al.
Published: (2026)
FedLog: Personalized Federated Classification with Less Communication and More Flexibility
by: Yu, Haolin, et al.
Published: (2024)
Adaptive GPU Kinetic Solver for Fluid-Granular Flows
by: Li, Xingqiao, et al.
Published: (2026)
Chain of Visual Perception: Harnessing Multimodal Large Language Models for Zero-shot Camouflaged Object Detection
by: Tang, Lv, et al.
Published: (2023)
Search-o1: Agentic Search-Enhanced Large Reasoning Models
by: Li, Xiaoxi, et al.
Published: (2025)
Filgotinib Improves Experimental Pulmonary Fibrosis by Modulating JAK1/STAT3/SOCS3/IL‐17A Signalling
by: Yunying Lv, et al.
Published: (2025)
WebThinker: Empowering Large Reasoning Models with Deep Research Capability
by: Li, Xiaoxi, et al.
Published: (2025)
Learning to Negotiate via Voluntary Commitment
by: Zhu, Shuhui, et al.
Published: (2025)
An Evaluation-Centric Paradigm for Scientific Visualization Agents
by: Ai, Kuangshi, et al.
Published: (2025)
Reflect-then-Plan: Offline Model-Based Planning through a Doubly Bayesian Lens
by: Jeong, Jihwan, et al.
Published: (2025)
AutoTraces: Autoregressive Trajectory Forecasting via Multimodal Large Language Models
by: Wang, Teng, et al.
Published: (2026)
Basis Transformers for Multi-Task Tabular Regression
by: Loh, Wei Min, et al.
Published: (2025)
Writing-Zero: Bridge the Gap Between Non-verifiable Tasks and Verifiable Rewards
by: Jia, Ruipeng, et al.
Published: (2025)
AVG-LLaVA: An Efficient Large Multimodal Model with Adaptive Visual Granularity
by: Lan, Zhibin, et al.
Published: (2024)
Contrastive Sparse Autoencoders for Interpreting Planning of Chess-Playing Agents
by: Poupart, Yoann
Published: (2024)
TDHook: A Lightweight Framework for Interpretability
by: Poupart, Yoann
Published: (2025)
VSA:Visual-Structural Alignment for UI-to-Code
by: Wu, Xian, et al.
Published: (2025)
RoboMP$^2$: A Robotic Multimodal Perception-Planning Framework with Multimodal Large Language Models
by: Lv, Qi, et al.
Published: (2024)
A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization
by: Luo, Yudong, et al.
Published: (2024)
FormFactory: An Interactive Benchmarking Suite for Multimodal Form-Filling Agents
by: Li, Bobo, et al.
Published: (2025)