:: Library Catalog

Bibliographic Details
Main Authors:	Tang, Luyao; Yuan, Yuxuan; Chen, Chaoqi; Huang, Kunze; Ding, Xinghao; Huang, Yue
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2408.16310

Similar Items

Dissecting Generalized Category Discovery: Multiplex Consensus under Self-Deconstruction
by: Tang, Luyao, et al.
Published: (2025)

Mixstyle-Entropy: Domain Generalization with Causal Intervention and Perturbation
by: Tang, Luyao, et al.
Published: (2024)

OCRT: Boosting Foundation Models in the Open World with Object-Concept-Relation Triad
by: Tang, Luyao, et al.
Published: (2025)

Object-Centric Pretraining via Target Encoder Bootstrapping
by: Đukić, Nikola, et al.
Published: (2025)

Improving the Generalization of Segmentation Foundation Model under Distribution Shift via Weakly Supervised Adaptation
by: Zhang, Haojie, et al.
Published: (2023)

Out-of-Distribution Detection with Prototypical Outlier Proxy
by: Gong, Mingrong, et al.
Published: (2024)

Learning Global Object-Centric Representations via Disentangled Slot Attention
by: Chen, Tonglin, et al.
Published: (2024)

Back to Source: Open-Set Continual Test-Time Adaptation via Domain Compensation
by: Yang, Yingkai, et al.
Published: (2026)

Looking Locally: Object-Centric Vision Transformers as Foundation Models for Efficient Segmentation
by: Traub, Manuel, et al.
Published: (2025)

Vector-Quantized Vision Foundation Models for Object-Centric Learning
by: Zhao, Rongzhen, et al.
Published: (2025)

HyperTTA: Test-Time Adaptation for Hyperspectral Image Classification under Distribution Shifts
by: Yue, Xia, et al.
Published: (2025)

ContextFusion and Bootstrap: An Effective Approach to Improve Slot Attention-Based Object-Centric Learning
by: Tian, Pinzhuo, et al.
Published: (2025)

Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift
by: Qiu, Jielin, et al.
Published: (2022)

Track Any Anomalous Object: A Granular Video Anomaly Detection Pipeline
by: Huang, Yuzhi, et al.
Published: (2025)

COUNTS: Benchmarking Object Detectors and Multimodal Large Language Models under Distribution Shifts
by: Li, Jiansheng, et al.
Published: (2025)

Spatial Reasoning in Foundation Models: Benchmarking Object-Centric Spatial Understanding
by: Mirjalili, Vahid, et al.
Published: (2025)

CrossEarth-SAR: A SAR-Centric and Billion-Scale Geospatial Foundation Model for Domain Generalizable Semantic Segmentation
by: Ye, Ziqi, et al.
Published: (2026)

EAGLE: Eigen Aggregation Learning for Object-Centric Unsupervised Semantic Segmentation
by: Kim, Chanyoung, et al.
Published: (2024)

Bootstrapping MLLM for Weakly-Supervised Class-Agnostic Object Counting
by: Zhang, Xiaowen, et al.
Published: (2026)

Bootstrapping SparseFormers from Vision Foundation Models
by: Gao, Ziteng, et al.
Published: (2023)

Context-Guided Spatial Feature Reconstruction for Efficient Semantic Segmentation
by: Ni, Zhenliang, et al.
Published: (2024)

LMMs Meet Object-Centric Vision: Understanding, Segmentation, Editing and Generation
by: Yuan, Yuqian, et al.
Published: (2026)

Divide and Conquer: Object Co-occurrence Helps Mitigate Simplicity Bias in OOD Detection
by: Dai, Boyang, et al.
Published: (2026)

Open-Vocabulary Object Detectors: Robustness Challenges under Distribution Shifts
by: Chhipa, Prakash Chandra, et al.
Published: (2024)

TrajBooster: Boosting Humanoid Whole-Body Manipulation via Trajectory-Centric Learning
by: Liu, Jiacheng, et al.
Published: (2025)

Exploiting Point-Language Models with Dual-Prompts for 3D Anomaly Detection
by: Wang, Jiaxiang, et al.
Published: (2025)

Interpreting Object-level Foundation Models via Visual Precision Search
by: Chen, Ruoyu, et al.
Published: (2024)

PointLLM-R: Enhancing 3D Point Cloud Reasoning via Chain-of-Thought
by: Chen, Chaoqi, et al.
Published: (2026)

Applications of Large Scale Foundation Models for Autonomous Driving
by: Huang, Yu, et al.
Published: (2023)

Appearance-Based Refinement for Object-Centric Motion Segmentation
by: Xie, Junyu, et al.
Published: (2023)

Bootstrapping Vision-language Models for Self-supervised Remote Physiological Measurement
by: Yue, Zijie, et al.
Published: (2024)

TinySAM: Pushing the Envelope for Efficient Segment Anything Model
by: Shu, Han, et al.
Published: (2023)

G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning
by: Chen, Liang, et al.
Published: (2025)

Generalized Category Discovery via Token Manifold Capacity Learning
by: Tang, Luyao, et al.
Published: (2025)

Benchmarking Object Detectors under Real-World Distribution Shifts in Satellite Imagery
by: Al-Emadi, Sara, et al.
Published: (2025)

MF-MOS: A Motion-Focused Model for Moving Object Segmentation
by: Cheng, Jintao, et al.
Published: (2024)

Human-Centric Foundation Models: Perception, Generation and Agentic Modeling
by: Tang, Shixiang, et al.
Published: (2025)

TG-Field: Geometry-Aware Radiative Gaussian Fields for Tomographic Reconstruction
by: Zhong, Yuxiang, et al.
Published: (2026)

IMWA: Iterative Model Weight Averaging Benefits Class-Imbalanced Learning Tasks
by: Huang, Zitong, et al.
Published: (2024)

SSA-Seg: Semantic and Spatial Adaptive Pixel-level Classifier for Semantic Segmentation
by: Ma, Xiaowen, et al.
Published: (2024)