Library Catalog

Bibliographic Details
Main Authors:	Cai, Yuanhao, Li, Kunpeng, Jia, Menglin, Wang, Jialiang, Sun, Junzhe, Liang, Feng, Chen, Weifeng, Juefei-Xu, Felix, Wang, Chu, Thabet, Ali, Dai, Xiaoliang, Ju, Xuan, Yuille, Alan, Hou, Ji
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2512.24551

Similar Items

Exploring MLLM-Diffusion Information Transfer with MetaCanvas
by: Lin, Han, et al.
Published: (2025)

DGPO: Beyond Pairwise Preferences with Directional Consistent Groupwise Optimization
by: Deng, Mengyi, et al.
Published: (2026)

Llama Learns to Direct: DirectorLLM for Human-Centric Video Generation
by: Song, Kunpeng, et al.
Published: (2024)

Improving Chain-of-Thought Efficiency for Autoregressive Image Generation
by: Gu, Zeqi, et al.
Published: (2025)

GDPO-SR: Group Direct Preference Optimization for One-Step Generative Image Super-Resolution
by: Yi, Qiaosi, et al.
Published: (2026)

MetaGDPO: Alleviating Catastrophic Forgetting with Metacognitive Knowledge through Group Direct Preference Optimization
by: Zhang, Lanxue, et al.
Published: (2025)

Dictionary-based Framework for Interpretable and Consistent Object Parsing
by: Zhang, Tiezheng, et al.
Published: (2025)

Structure-Aware Sparse-View X-ray 3D Reconstruction
by: Cai, Yuanhao, et al.
Published: (2023)

UniT: Unified Multimodal Chain-of-Thought Test-time Scaling
by: Chen, Leon Liangyu, et al.
Published: (2026)

Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning
by: Zhang, Lei, et al.
Published: (2026)

GDPO: Learning to Directly Align Language Models with Diversity Using GFlowNets
by: Kwon, Oh Joon, et al.
Published: (2024)

Non-Markov Multi-Round Conversational Image Generation with History-Conditioned MLLMs
by: Zhang, Haochen, et al.
Published: (2026)

RDPO: Real Data Preference Optimization for Physics Consistency Video Generation
by: Qian, Wenxu, et al.
Published: (2025)

Can These Views Be One Scene? Evaluating Multiview 3D Consistency when 3D Foundation Models Hallucinate
by: Paul, Soumava, et al.
Published: (2026)

LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity
by: Wang, Hongjie, et al.
Published: (2024)

Movie Weaver: Tuning-Free Multi-Concept Video Personalization with Anchored Prompts
by: Liang, Feng, et al.
Published: (2025)

MoCha: Towards Movie-Grade Talking Character Synthesis
by: Wei, Cong, et al.
Published: (2025)

Transfer between Modalities with MetaQueries
by: Pan, Xichen, et al.
Published: (2025)

DirectTriGS: Triplane-based Gaussian Splatting Field Representation for 3D Generation
by: Ju, Xiaoliang, et al.
Published: (2025)

Pixel-Space Post-Training of Latent Diffusion Models
by: Zhang, Christina, et al.
Published: (2024)

Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering
by: Wang, Xingrui, et al.
Published: (2024)

Causal Inference with Groupwise Matching
by: Rincón, Ratzanyel, et al.
Published: (2025)

Groupwise image registration with edge‐based loss for low‐SNR cardiac MRI
by: Lei, Xuan, et al.
Published: (2025)

Groupwise Image Registration with Edge-Based Loss for Low-SNR Cardiac MRI
by: Lei, Xuan, et al.
Published: (2024)

Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis
by: Cai, Yuanhao, et al.
Published: (2024)

Phy124: Fast Physics-Driven 4D Content Generation from a Single Image
by: Lin, Jiajing, et al.
Published: (2024)

PhyScensis: Physics-Augmented LLM Agents for Complex Physical Scene Arrangement
by: Wang, Yian, et al.
Published: (2026)

Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models
by: Ma, Xu, et al.
Published: (2025)

SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference
by: Wang, Feng, et al.
Published: (2023)

PhyScene3D: Physically Consistent Interactive 3D Tabletop Scene Generation
by: Chen, Weixing, et al.
Published: (2026)

Groupwise Registration with Physics-Informed Test-Time Adaptation on Multi-parametric Cardiac MRI
by: Li, Xinqi, et al.
Published: (2025)

PhyCritic: Multimodal Critic Models for Physical AI
by: Xiong, Tianyi, et al.
Published: (2026)

ProPhy: Progressive Physical Alignment for Dynamic World Simulation
by: Wang, Zijun, et al.
Published: (2025)

Faithfulness-QA: A Counterfactual Entity Substitution Dataset for Training Context-Faithful RAG Models
by: Ju, Li, et al.
Published: (2026)

PhyRecon: Physically Plausible Neural Scene Reconstruction
by: Ni, Junfeng, et al.
Published: (2024)

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
by: Liu, Shih-Yang, et al.
Published: (2026)

PhyDetEx: Detecting and Explaining the Physical Plausibility of T2V Models
by: Wang, Zeqing, et al.
Published: (2025)

Computer Vision and Its Relationship to Cognitive Science: A perspective from Bayes Decision Theory
by: Yuille, Alan, et al.
Published: (2026)

PhyWorld: Physics-Faithful World Model for Video Generation
by: Zhao, Pu, et al.
Published: (2026)

PhySense: Sensor Placement Optimization for Accurate Physics Sensing
by: Ma, Yuezhou, et al.
Published: (2025)