| Main Authors: | Song, Ruizhuo, Yuan, Beiming |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2508.15387 |
Similar Items
Triple-CFN: Separating Concepts and Features Enhances Machine Abstract Reasoning Ability
by: Song, Ruizhuo, et al.
Published: (2024)
Johnny: Structuring Representation Space to Enhance Machine Abstract Reasoning Ability
by: Song, Ruizhuo, et al.
Published: (2025)
Funny-Valen-Tine: Planning Solution Distribution Enhances Machine Abstract Reasoning Ability
by: Song, Ruizhuo, et al.
Published: (2024)
D4C: Improving Negative Example Quality to Enhance Machine Abstract Reasoning Ability
by: Song, Ruizhuo, et al.
Published: (2024)
Solving the Clustering Reasoning Problems by Modeling a Deep-Learning-Based Probabilistic Model
by: Song, Ruizhuo, et al.
Published: (2024)
EiHi Net: Out-of-Distribution Generalization Paradigm
by: Wei, Qinglai, et al.
Published: (2022)
VLM-R$^3$: Region Recognition, Reasoning, and Refinement for Enhanced Multimodal Chain-of-Thought
by: Jiang, Chaoya, et al.
Published: (2025)
Multi-Granularity Mutual Refinement Network for Zero-Shot Learning
by: Wang, Ning, et al.
Published: (2025)
CoT-Pose: Chain-of-Thought Reasoning for 3D Pose Generation from Abstract Prompts
by: Cha, Junuk, et al.
Published: (2025)
DIO: Dataset of 3D Mesh Models of Indoor Objects for Robotics and Computer Vision Applications
by: Nimal, Nillan, et al.
Published: (2024)
VISTA: Enhancing Vision-Text Alignment in MLLMs via Cross-Modal Mutual Information Maximization
by: Li, Mingxiao, et al.
Published: (2025)
Skeleton2vec: A Self-supervised Learning Framework with Contextualized Target Representations for Skeleton Sequence
by: Xu, Ruizhuo, et al.
Published: (2024)
Enhancing Advanced Visual Reasoning Ability of Large Language Models
by: Li, Zhiyuan, et al.
Published: (2024)
MIGE: Mutually Enhanced Multimodal Instruction-Based Image Generation and Editing
by: Tian, Xueyun, et al.
Published: (2025)
CoT-Segmenter: Enhancing OOD Detection in Dense Road Scenes via Chain-of-Thought Reasoning
by: Song, Jeonghyo, et al.
Published: (2025)
CLGRPO: Reasoning Ability Enhancement for Small VLMs
by: Wang, Fanyi, et al.
Published: (2025)
CausalSpatial: A Benchmark for Object-Centric Causal Spatial Reasoning
by: Ma, Wenxin, et al.
Published: (2026)
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization
by: Wang, Weiyun, et al.
Published: (2024)
MIGA: Mutual Information-Guided Attack on Denoising Models for Semantic Manipulation
by: Li, Guanghao, et al.
Published: (2025)
CaST-Bench: Benchmarking Causal Chain-Grounded Spatio-Temporal Reasoning for Video Question Answering
by: Zhang, Mingfang, et al.
Published: (2026)
Camera-Based Localization and Enhanced Normalized Mutual Information
by: Kunde, Vishnu Teja, et al.
Published: (2024)
TIMA: Text-Image Mutual Awareness for Balancing Zero-Shot Adversarial Robustness and Generalization Ability
by: Ma, Fengji, et al.
Published: (2024)
Mutual Information Guided Optimal Transport for Unsupervised Visible-Infrared Person Re-identification
by: Zhang, Zhizhong, et al.
Published: (2024)
Depth Map Denoising Network and Lightweight Fusion Network for Enhanced 3D Face Recognition
by: Xu, Ruizhuo, et al.
Published: (2024)
MutualNeRF: Improve the Performance of NeRF under Limited Samples with Mutual Information Theory
by: Wang, Zifan, et al.
Published: (2025)
Mutually Causal Semantic Distillation Network for Zero-Shot Learning
by: Chen, Shiming, et al.
Published: (2026)
MI CAM: Mutual Information Weighted Activation Mapping for Causal Visual Explanations of Convolutional Neural Networks
by: Iyer, Ram S, et al.
Published: (2025)
OrderChain: Towards General Instruct-Tuning for Stimulating the Ordinal Understanding Ability of MLLM
by: Wang, Jinhong, et al.
Published: (2025)
Generating Storytelling Images with Rich Chains-of-Reasoning
by: Song, Xiujie, et al.
Published: (2025)
Enhancing 3D Semantic Scene Completion with a Refinement Module
by: Zhang, Dunxing, et al.
Published: (2025)
$A^2R^2$: Advancing Img2LaTeX Conversion via Visual Reasoning with Attention-Guided Refinement
by: Li, Zhecheng, et al.
Published: (2025)
Adaptive Chain-of-Focus Reasoning via Dynamic Visual Search and Zooming for Efficient VLMs
by: Zhang, Xintong, et al.
Published: (2025)
Geospatial Chain of Thought Reasoning for Enhanced Visual Question Answering on Satellite Imagery
by: Shanker, Shambhavi, et al.
Published: (2025)
OmniRefiner: Reinforcement-Guided Local Diffusion Refinement
by: Liu, Yaoli, et al.
Published: (2025)
Improving Visual Reasoning with Iterative Evidence Refinement
by: Shi, Zeru, et al.
Published: (2026)
Revisiting Mutual Information Maximization for Generalized Category Discovery
by: Tan, Zhaorui, et al.
Published: (2024)
TimeCausality: Evaluating the Causal Ability in Time Dimension for Vision Language Models
by: Wang, Zeqing, et al.
Published: (2025)
Explaining Representation by Mutual Information
by: Gu, Lifeng
Published: (2021)
A Multi-Agent Framework with Structured Reasoning and Reflective Refinement for Multimodal Empathetic Response Generation
by: Wang, Liping, et al.
Published: (2026)
VisualQuest: A Benchmark for Abstract Visual Reasoning in MLLMs
by: Xiao, Kelaiti, et al.
Published: (2025)