Library Catalog

Bibliographic Details
Main Authors:	Li, Chengtai; He, Yuting; Ren, Jianfeng; Bai, Ruibin; Zhao, Yitian; Yu, Heng; Jiang, Xudong
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2603.01125

Similar Items

Multi-View Spectrogram Transformer for Respiratory Sound Classification
by: He, Wentao, et al.
Published: (2023)

ERL-MPP: Evolutionary Reinforcement Learning with Multi-head Puzzle Perception for Solving Large-scale Jigsaw Puzzles of Eroded Gaps
by: Song, Xingke, et al.
Published: (2025)

Visual Perturbation and Adaptive Hard Negative Contrastive Learning for Compositional Reasoning in Vision-Language Models
by: Huang, Xin, et al.
Published: (2025)

LP$^{2}$DH: A Locality-Preserving Pixel-Difference Hashing Framework for Dynamic Texture Recognition
by: Ding, Ruxin, et al.
Published: (2026)

SignReasoner: Compositional Reasoning for Complex Traffic Sign Understanding via Functional Structure Units
by: Wang, Ruibin, et al.
Published: (2026)

MotionMERGE: A Multi-granular Framework for Human Motion Editing, Reasoning, Generation, and Explanation
by: Wu, Bizhu, et al.
Published: (2026)

MiCo: Multi-image Contrast for Reinforcement Visual Reasoning
by: Chen, Xi, et al.
Published: (2025)

Vector Contrastive Learning For Pixel-Wise Pretraining In Medical Vision
by: He, Yuting, et al.
Published: (2025)

Face recognition on point cloud with cgan-top for denoising
by: Liu, Junyu, et al.
Published: (2025)

Continuous Vision-Language-Action Co-Learning with Semantic-Physical Alignment for Behavioral Cloning
by: Qi, Xiuxiu, et al.
Published: (2025)

A Survey of Mix-based Data Augmentation: Taxonomy, Methods, Applications, and Explainability
by: Cao, Chengtai, et al.
Published: (2022)

DIRCR: Dual-Inference Rule-Contrastive Reasoning for Solving RAVENs
by: Zhang, Jiachen, et al.
Published: (2026)

FineMotion: A Dataset and Benchmark with both Spatial and Temporal Annotation for Fine-grained Motion Generation and Editing
by: Wu, Bizhu, et al.
Published: (2025)

MG-MotionLLM: A Unified Framework for Motion Comprehension and Generation across Multiple Granularities
by: Wu, Bizhu, et al.
Published: (2025)

SiamNAS: Siamese Surrogate Model for Dominance Relation Prediction in Multi-objective Neural Architecture Search
by: Zhou, Yuyang, et al.
Published: (2025)

Contrastive Visual Data Augmentation
by: Zhou, Yu, et al.
Published: (2025)

MathSticks: A Benchmark for Visual Symbolic Compositional Reasoning with Matchstick Puzzles
by: Ji, Yuheng, et al.
Published: (2025)

Unsupervised Industrial Anomaly Detection via Pattern Generative and Contrastive Networks
by: Huang, Jianfeng, et al.
Published: (2022)

Weakly Supervised Video Anomaly Detection with Anomaly-Connected Components and Intention Reasoning
by: Wang, Yu, et al.
Published: (2026)

Pattern based learning and optimisation through pricing for bin packing problem
by: Zhang, Huayan, et al.
Published: (2024)

Semantic Visual Anomaly Detection and Reasoning in AI-Generated Images
by: Tan, Chuangchuang, et al.
Published: (2025)

Surgical-VQLA++: Adversarial Contrastive Learning for Calibrated Robust Visual Question-Localized Answering in Robotic Surgery
by: Bai, Long, et al.
Published: (2024)

Visual Attention Prompted Prediction and Learning
by: Zhang, Yifei, et al.
Published: (2023)

Contrastive Cross-Bag Augmentation for Multiple Instance Learning-based Whole Slide Image Classification
by: Zhang, Bo, et al.
Published: (2025)

VisualTrans: A Benchmark for Real-World Visual Transformation Reasoning
by: Ji, Yuheng, et al.
Published: (2025)

GCA-SUNet: A Gated Context-Aware Swin-UNet for Exemplar-Free Counting
by: Wu, Yuzhe, et al.
Published: (2024)

Open-set Anomaly Segmentation in Complex Scenarios
by: Xia, Song, et al.
Published: (2025)

TR2M: Transferring Monocular Relative Depth to Metric Depth with Language Descriptions and Dual-Level Scale-Oriented Contrast
by: Cui, Beilei, et al.
Published: (2025)

Scale Contrastive Learning with Selective Attentions for Blind Image Quality Assessment
by: Hu, Runze, et al.
Published: (2024)

The Composite Visual-Laser Navigation Method Applied in Indoor Poultry Farming Environments
by: Lu, Jiafan, et al.
Published: (2025)

Contrastive Augmented Transformer with Domain-specific Enhancement for Robust Multi-scenario Metal Surface Defect Detection
by: Liu, Yiyao, et al.
Published: (2026)

Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models
by: Li, Xin, et al.
Published: (2024)

TOMCAT: Test-time Comprehensive Knowledge Accumulation for Compositional Zero-Shot Learning
by: Yan, Xudong, et al.
Published: (2025)

Multi-Step Visual Reasoning with Visual Tokens Scaling and Verification
by: Bai, Tianyi, et al.
Published: (2025)

MRI Contrast Enhancement Kinetics World Model
by: Kong, Jindi, et al.
Published: (2026)

CoLVR: Enhancing Exploratory Latent Visual Reasoning via Contrastive Optimization
by: Ding, Ziyang, et al.
Published: (2026)

Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning
by: Zhang, Haoji, et al.
Published: (2025)

Compositional Image Retrieval via Instruction-Aware Contrastive Learning
by: Zhong, Wenliang, et al.
Published: (2024)

GazeXplain: Learning to Predict Natural Language Explanations of Visual Scanpaths
by: Chen, Xianyu, et al.
Published: (2024)

Structure Over Scale: Learning Visual Reasoning from Pedagogical Video
by: Galoaa, Bishoy, et al.
Published: (2026)