| Main Authors: | Gu, Zihan, Chen, Ruoyu, Zhang, Junchi, Hu, Yue, Zhang, Hua, Cao, Xiaochun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.10914 |
Similar Items
Interpreting Object-level Foundation Models via Visual Precision Search
by: Chen, Ruoyu, et al.
Published: (2024)
Less is More: Fewer Interpretable Region via Submodular Subset Selection
by: Chen, Ruoyu, et al.
Published: (2024)
Generalized Semantic Contrastive Learning via Embedding Side Information for Few-Shot Object Detection
by: Chen, Ruoyu, et al.
Published: (2025)
Less is More: Efficient Black-box Attribution via Minimal Interpretable Subset Selection
by: Chen, Ruoyu, et al.
Published: (2025)
Phantom-Insight: Adaptive Multi-cue Fusion for Video Camouflaged Object Detection with Multimodal LLM
by: Zhang, Hua, et al.
Published: (2025)
FaceInsight: A Multimodal Large Language Model for Face Perception
by: Li, Jingzhi, et al.
Published: (2025)
Object Detectors in the Open Environment: Challenges, Solutions, and Outlook
by: Liang, Siyuan, et al.
Published: (2024)
Where MLLMs Attend and What They Rely On: Explaining Autoregressive Token Generation
by: Chen, Ruoyu, et al.
Published: (2025)
Explaining multimodal LLMs via intra-modal token interactions
by: Liang, Jiawei, et al.
Published: (2025)
Where Not to Learn: Prior-Aligned Training with Subset-based Attribution Constraints for Reliable Decision-Making
by: Chen, Ruoyu, et al.
Published: (2026)
Did Models Sufficient Learn? Attribution-Guided Training via Subset-Selected Counterfactual Augmentation
by: Chen, Yannan, et al.
Published: (2025)
Boosting Order-Preserving and Transferability for Neural Architecture Search: a Joint Architecture Refined Search and Fine-tuning Approach
by: Zhang, Beichen, et al.
Published: (2024)
Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors
by: Zhang, Aiping, et al.
Published: (2024)
Dictionary-based Framework for Interpretable and Consistent Object Parsing
by: Zhang, Tiezheng, et al.
Published: (2025)
Hierarchical Invariance for Robust and Interpretable Vision Tasks at Larger Scales
by: Qi, Shuren, et al.
Published: (2024)
Boundary Matters: A Bi-Level Active Finetuning Framework
by: Lu, Han, et al.
Published: (2024)
M-Tuning: Prompt Tuning with Mitigated Label Bias in Open-Set Scenarios
by: Liao, Ning, et al.
Published: (2023)
Optimizing Multispectral Object Detection: A Bag of Tricks and Comprehensive Benchmarks
by: Zhou, Chen, et al.
Published: (2024)
Tracking by Detection and Query: An Efficient End-to-End Framework for Multi-Object Tracking
by: Jia, Shukun, et al.
Published: (2024)
SinSEMI: A One-Shot Image Generation Model and Data-Efficient Evaluation Framework for Semiconductor Inspection Equipment
by: Wu, ChunLiang, et al.
Published: (2025)
UncTrack: Reliable Visual Object Tracking with Uncertainty-Aware Prototype Memory Network
by: Yao, Siyuan, et al.
Published: (2025)
PersGuard: Preventing Malicious Personalization via Backdoor Attacks on Pre-trained Text-to-Image Diffusion Models
by: Liu, Xinwei, et al.
Published: (2025)
LR-FPN: Enhancing Remote Sensing Object Detection with Location Refined Feature Pyramid Network
by: Li, Hanqian, et al.
Published: (2024)
UETrack: A Unified and Efficient Framework for Single Object Tracking
by: Kang, Ben, et al.
Published: (2026)
Hierarchical Graph Interaction Transformer with Dynamic Token Clustering for Camouflaged Object Detection
by: Yao, Siyuan, et al.
Published: (2024)
Hoi2Threat: An Interpretable Threat Detection Method for Human Violence Scenarios Guided by Human-Object Interaction
by: Wang, Yuhan, et al.
Published: (2025)
Mitigating Group-Level Fairness Disparities in Federated Visual Language Models
by: Chen, Chaomeng, et al.
Published: (2025)
AttDiff-GAN: A Hybrid Diffusion-GAN Framework for Facial Attribute Editing
by: Huang, Wenmin, et al.
Published: (2026)
Exploring Inconsistent Knowledge Distillation for Object Detection with Data Augmentation
by: Liang, Jiawei, et al.
Published: (2022)
General Compression Framework for Efficient Transformer Object Tracking
by: Hong, Lingyi, et al.
Published: (2024)
ORMOT: A Dataset and Framework for Omnidirectional Referring Multi-Object Tracking
by: Chen, Sijia, et al.
Published: (2026)
ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models
by: Chen, Junzhe, et al.
Published: (2024)
CAMotion: A High-Quality Benchmark for Camouflaged Moving Object Detection in the Wild
by: Yao, Siyuan, et al.
Published: (2026)
WinTok: A Win-Win Hybrid Tokenizer via Decomposing Visual Understanding and Generation with Transferable Tokens
by: Guo, Yiwei, et al.
Published: (2026)
PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders
by: Zhang, Xiangdong, et al.
Published: (2024)
Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views
by: Zhang, Xiangdong, et al.
Published: (2025)
Efficient One-stage Video Object Detection by Exploiting Temporal Consistency
by: Sun, Guanxiong, et al.
Published: (2024)
Efficient and Deterministic Search Strategy Based on Residual Projections for Point Cloud Registration with Correspondences
by: Li, Xinyi, et al.
Published: (2023)
CL-HOI: Cross-Level Human-Object Interaction Distillation from Vision Large Language Models
by: Gao, Jianjun, et al.
Published: (2024)
GIVEPose: Gradual Intra-class Variation Elimination for RGB-based Category-Level Object Pose Estimation
by: Huang, Zinqin, et al.
Published: (2025)