| Main Authors: | Li, Chengtai, He, Yuting, Ren, Jianfeng, Bai, Ruibin, Zhao, Yitian, Yu, Heng, Jiang, Xudong |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2603.01125 |
Similar Items
Multi-View Spectrogram Transformer for Respiratory Sound Classification
by: He, Wentao, et al.
Published: (2023)
ERL-MPP: Evolutionary Reinforcement Learning with Multi-head Puzzle Perception for Solving Large-scale Jigsaw Puzzles of Eroded Gaps
by: Song, Xingke, et al.
Published: (2025)
Visual Perturbation and Adaptive Hard Negative Contrastive Learning for Compositional Reasoning in Vision-Language Models
by: Huang, Xin, et al.
Published: (2025)
LP$^{2}$DH: A Locality-Preserving Pixel-Difference Hashing Framework for Dynamic Texture Recognition
by: Ding, Ruxin, et al.
Published: (2026)
SignReasoner: Compositional Reasoning for Complex Traffic Sign Understanding via Functional Structure Units
by: Wang, Ruibin, et al.
Published: (2026)
MotionMERGE: A Multi-granular Framework for Human Motion Editing, Reasoning, Generation, and Explanation
by: Wu, Bizhu, et al.
Published: (2026)
MiCo: Multi-image Contrast for Reinforcement Visual Reasoning
by: Chen, Xi, et al.
Published: (2025)
Vector Contrastive Learning For Pixel-Wise Pretraining In Medical Vision
by: He, Yuting, et al.
Published: (2025)
Face recognition on point cloud with cgan-top for denoising
by: Liu, Junyu, et al.
Published: (2025)
Continuous Vision-Language-Action Co-Learning with Semantic-Physical Alignment for Behavioral Cloning
by: Qi, Xiuxiu, et al.
Published: (2025)
A Survey of Mix-based Data Augmentation: Taxonomy, Methods, Applications, and Explainability
by: Cao, Chengtai, et al.
Published: (2022)
DIRCR: Dual-Inference Rule-Contrastive Reasoning for Solving RAVENs
by: Zhang, Jiachen, et al.
Published: (2026)
FineMotion: A Dataset and Benchmark with both Spatial and Temporal Annotation for Fine-grained Motion Generation and Editing
by: Wu, Bizhu, et al.
Published: (2025)
MG-MotionLLM: A Unified Framework for Motion Comprehension and Generation across Multiple Granularities
by: Wu, Bizhu, et al.
Published: (2025)
SiamNAS: Siamese Surrogate Model for Dominance Relation Prediction in Multi-objective Neural Architecture Search
by: Zhou, Yuyang, et al.
Published: (2025)
Contrastive Visual Data Augmentation
by: Zhou, Yu, et al.
Published: (2025)
MathSticks: A Benchmark for Visual Symbolic Compositional Reasoning with Matchstick Puzzles
by: Ji, Yuheng, et al.
Published: (2025)
Unsupervised Industrial Anomaly Detection via Pattern Generative and Contrastive Networks
by: Huang, Jianfeng, et al.
Published: (2022)
Weakly Supervised Video Anomaly Detection with Anomaly-Connected Components and Intention Reasoning
by: Wang, Yu, et al.
Published: (2026)
Pattern based learning and optimisation through pricing for bin packing problem
by: Zhang, Huayan, et al.
Published: (2024)
Semantic Visual Anomaly Detection and Reasoning in AI-Generated Images
by: Tan, Chuangchuang, et al.
Published: (2025)
Surgical-VQLA++: Adversarial Contrastive Learning for Calibrated Robust Visual Question-Localized Answering in Robotic Surgery
by: Bai, Long, et al.
Published: (2024)
Visual Attention Prompted Prediction and Learning
by: Zhang, Yifei, et al.
Published: (2023)
Contrastive Cross-Bag Augmentation for Multiple Instance Learning-based Whole Slide Image Classification
by: Zhang, Bo, et al.
Published: (2025)
VisualTrans: A Benchmark for Real-World Visual Transformation Reasoning
by: Ji, Yuheng, et al.
Published: (2025)
GCA-SUNet: A Gated Context-Aware Swin-UNet for Exemplar-Free Counting
by: Wu, Yuzhe, et al.
Published: (2024)
Open-set Anomaly Segmentation in Complex Scenarios
by: Xia, Song, et al.
Published: (2025)
TR2M: Transferring Monocular Relative Depth to Metric Depth with Language Descriptions and Dual-Level Scale-Oriented Contrast
by: Cui, Beilei, et al.
Published: (2025)
Scale Contrastive Learning with Selective Attentions for Blind Image Quality Assessment
by: Hu, Runze, et al.
Published: (2024)
The Composite Visual-Laser Navigation Method Applied in Indoor Poultry Farming Environments
by: Lu, Jiafan, et al.
Published: (2025)
Contrastive Augmented Transformer with Domain-specific Enhancement for Robust Multi-scenario Metal Surface Defect Detection
by: Liua, Yiyao, et al.
Published: (2026)
Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models
by: Li, Xin, et al.
Published: (2024)
TOMCAT: Test-time Comprehensive Knowledge Accumulation for Compositional Zero-Shot Learning
by: Yan, Xudong, et al.
Published: (2025)
Multi-Step Visual Reasoning with Visual Tokens Scaling and Verification
by: Bai, Tianyi, et al.
Published: (2025)
MRI Contrast Enhancement Kinetics World Model
by: Kong, Jindi, et al.
Published: (2026)
CoLVR: Enhancing Exploratory Latent Visual Reasoning via Contrastive Optimization
by: Ding, Ziyang, et al.
Published: (2026)
Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning
by: Zhang, Haoji, et al.
Published: (2025)
Compositional Image Retrieval via Instruction-Aware Contrastive Learning
by: Zhong, Wenliang, et al.
Published: (2024)
GazeXplain: Learning to Predict Natural Language Explanations of Visual Scanpaths
by: Chen, Xianyu, et al.
Published: (2024)
Structure Over Scale: Learning Visual Reasoning from Pedagogical Video
by: Galoaa, Bishoy, et al.
Published: (2026)