Library Catalog

Bibliographic Details
Main Authors:	Gu, Zihan; Chen, Ruoyu; Zhang, Junchi; Hu, Yue; Zhang, Hua; Cao, Xiaochun
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2511.10914

Similar Items

Interpreting Object-level Foundation Models via Visual Precision Search
by: Chen, Ruoyu, et al.
Published: (2024)

Less is More: Fewer Interpretable Region via Submodular Subset Selection
by: Chen, Ruoyu, et al.
Published: (2024)

Generalized Semantic Contrastive Learning via Embedding Side Information for Few-Shot Object Detection
by: Chen, Ruoyu, et al.
Published: (2025)

Less is More: Efficient Black-box Attribution via Minimal Interpretable Subset Selection
by: Chen, Ruoyu, et al.
Published: (2025)

Phantom-Insight: Adaptive Multi-cue Fusion for Video Camouflaged Object Detection with Multimodal LLM
by: Zhang, Hua, et al.
Published: (2025)

FaceInsight: A Multimodal Large Language Model for Face Perception
by: Li, Jingzhi, et al.
Published: (2025)

Object Detectors in the Open Environment: Challenges, Solutions, and Outlook
by: Liang, Siyuan, et al.
Published: (2024)

Where MLLMs Attend and What They Rely On: Explaining Autoregressive Token Generation
by: Chen, Ruoyu, et al.
Published: (2025)

Explaining multimodal LLMs via intra-modal token interactions
by: Liang, Jiawei, et al.
Published: (2025)

Where Not to Learn: Prior-Aligned Training with Subset-based Attribution Constraints for Reliable Decision-Making
by: Chen, Ruoyu, et al.
Published: (2026)

Did Models Sufficient Learn? Attribution-Guided Training via Subset-Selected Counterfactual Augmentation
by: Chen, Yannan, et al.
Published: (2025)

Boosting Order-Preserving and Transferability for Neural Architecture Search: a Joint Architecture Refined Search and Fine-tuning Approach
by: Zhang, Beichen, et al.
Published: (2024)

Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors
by: Zhang, Aiping, et al.
Published: (2024)

Dictionary-based Framework for Interpretable and Consistent Object Parsing
by: Zhang, Tiezheng, et al.
Published: (2025)

Hierarchical Invariance for Robust and Interpretable Vision Tasks at Larger Scales
by: Qi, Shuren, et al.
Published: (2024)

Boundary Matters: A Bi-Level Active Finetuning Framework
by: Lu, Han, et al.
Published: (2024)

M-Tuning: Prompt Tuning with Mitigated Label Bias in Open-Set Scenarios
by: Liao, Ning, et al.
Published: (2023)

Optimizing Multispectral Object Detection: A Bag of Tricks and Comprehensive Benchmarks
by: Zhou, Chen, et al.
Published: (2024)

Tracking by Detection and Query: An Efficient End-to-End Framework for Multi-Object Tracking
by: Jia, Shukun, et al.
Published: (2024)

SinSEMI: A One-Shot Image Generation Model and Data-Efficient Evaluation Framework for Semiconductor Inspection Equipment
by: Wu, ChunLiang, et al.
Published: (2025)

UncTrack: Reliable Visual Object Tracking with Uncertainty-Aware Prototype Memory Network
by: Yao, Siyuan, et al.
Published: (2025)

PersGuard: Preventing Malicious Personalization via Backdoor Attacks on Pre-trained Text-to-Image Diffusion Models
by: Liu, Xinwei, et al.
Published: (2025)

LR-FPN: Enhancing Remote Sensing Object Detection with Location Refined Feature Pyramid Network
by: Li, Hanqian, et al.
Published: (2024)

UETrack: A Unified and Efficient Framework for Single Object Tracking
by: Kang, Ben, et al.
Published: (2026)

Hierarchical Graph Interaction Transformer with Dynamic Token Clustering for Camouflaged Object Detection
by: Yao, Siyuan, et al.
Published: (2024)

Hoi2Threat: An Interpretable Threat Detection Method for Human Violence Scenarios Guided by Human-Object Interaction
by: Wang, Yuhan, et al.
Published: (2025)

Mitigating Group-Level Fairness Disparities in Federated Visual Language Models
by: Chen, Chaomeng, et al.
Published: (2025)

AttDiff-GAN: A Hybrid Diffusion-GAN Framework for Facial Attribute Editing
by: Huang, Wenmin, et al.
Published: (2026)

Exploring Inconsistent Knowledge Distillation for Object Detection with Data Augmentation
by: Liang, Jiawei, et al.
Published: (2022)

General Compression Framework for Efficient Transformer Object Tracking
by: Hong, Lingyi, et al.
Published: (2024)

ORMOT: A Dataset and Framework for Omnidirectional Referring Multi-Object Tracking
by: Chen, Sijia, et al.
Published: (2026)

ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models
by: Chen, Junzhe, et al.
Published: (2024)

CAMotion: A High-Quality Benchmark for Camouflaged Moving Object Detection in the Wild
by: Yao, Siyuan, et al.
Published: (2026)

WinTok: A Win-Win Hybrid Tokenizer via Decomposing Visual Understanding and Generation with Transferable Tokens
by: Guo, Yiwei, et al.
Published: (2026)

PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders
by: Zhang, Xiangdong, et al.
Published: (2024)

Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views
by: Zhang, Xiangdong, et al.
Published: (2025)

Efficient One-stage Video Object Detection by Exploiting Temporal Consistency
by: Sun, Guanxiong, et al.
Published: (2024)

Efficient and Deterministic Search Strategy Based on Residual Projections for Point Cloud Registration with Correspondences
by: Li, Xinyi, et al.
Published: (2023)

CL-HOI: Cross-Level Human-Object Interaction Distillation from Vision Large Language Models
by: Gao, Jianjun, et al.
Published: (2024)

GIVEPose: Gradual Intra-class Variation Elimination for RGB-based Category-Level Object Pose Estimation
by: Huang, Zinqin, et al.
Published: (2025)