:: Library Catalog

Bibliographic Details
Main authors:	Yuan, Maoxun, Wei, Xingxing
Format:	Preprint
Published:	2023
Subjects:	Computer Vision and Pattern Recognition; Multimedia
Online access:	https://arxiv.org/abs/2306.16175

Similar items

Removal then Selection: A Coarse-to-Fine Fusion Perspective for RGB-Infrared Object Detection
by: Zhao, Tianyi, et al.
Published: (2024)

WaveMamba: Wavelet-Driven Mamba Fusion for RGB-Infrared Object Detection
by: Zhu, Haodong, et al.
Published: (2025)

DPDETR: Decoupled Position Detection Transformer for Infrared-Visible Object Detection
by: Guo, Junjie, et al.
Published: (2024)

Divide-and-Conquer: Confluent Triple-Flow Network for RGB-T Salient Object Detection
by: Tang, Hao, et al.
Published: (2024)

Dual Mutual Learning Network with Global-local Awareness for RGB-D Salient Object Detection
by: Yi, Kang, et al.
Published: (2025)

KAN-SAM: Kolmogorov-Arnold Network Guided Segment Anything Model for RGB-T Salient Object Detection
by: Li, Xingyuan, et al.
Published: (2025)

HMPE: HeatMap Embedding for Efficient Transformer-Based Small Object Detection
by: Zeng, YangChen
Published: (2025)

Breaking Self-Attention Failure: Rethinking Query Initialization for Infrared Small Target Detection
by: Liu, Yuteng, et al.
Published: (2026)

LEAF-Mamba: Local Emphatic and Adaptive Fusion State Space Model for RGB-D Salient Object Detection
by: Wu, Lanhu, et al.
Published: (2025)

On the Robustness of Human-Object Interaction Detection against Distribution Shift
by: Xie, Chi, et al.
Published: (2025)

UniRGB-IR: A Unified Framework for Visible-Infrared Semantic Tasks via Adapter Tuning
by: Yuan, Maoxun, et al.
Published: (2024)

Robust Duality Learning for Unsupervised Visible-Infrared Person Re-Identification
by: Li, Yongxiang, et al.
Published: (2025)

Spatial-Temporal Human-Object Interaction Detection
by: Sun, Xu, et al.
Published: (2025)

TR-DETR: Task-Reciprocal Transformer for Joint Moment Retrieval and Highlight Detection
by: Sun, Hao, et al.
Published: (2024)

Efficiently Collecting Training Dataset for 2D Object Detection by Online Visual Feedback
by: Kiyokawa, Takuya, et al.
Published: (2023)

SFFNet: Synergistic Feature Fusion Network With Dual-Domain Edge Enhancement for UAV Image Object Detection
by: Zhang, Wenfeng, et al.
Published: (2026)

Automatic Prompt Generation and Grounding Object Detection for Zero-Shot Image Anomaly Detection
by: Cheung, Tsun-Hin, et al.
Published: (2024)

DeepSPG: Exploring Deep Semantic Prior Guidance for Low-light Image Enhancement with Multimodal Learning
by: Lu, Jialang, et al.
Published: (2025)

ObjFormer: Learning Land-Cover Changes From Paired OSM Data and Optical High-Resolution Imagery via Object-Guided Transformer
by: Chen, Hongruixuan, et al.
Published: (2023)

Other Tokens Matter: Exploring Global and Local Features of Vision Transformers for Object Re-Identification
by: Wang, Yingquan, et al.
Published: (2024)

DepthGait: Multi-Scale Cross-Level Feature Fusion of RGB-Derived Depth and Silhouette Sequences for Robust Gait Recognition
by: Li, Xinzhu, et al.
Published: (2025)

Multi-Scale and Detail-Enhanced Segment Anything Model for Salient Object Detection
by: Gao, Shixuan, et al.
Published: (2024)

In Anticipation of Perfect Deepfake: Identity-anchored Artifact-agnostic Detection under Rebalanced Deepfake Detection Protocol
by: Wang, Wei-Han, et al.
Published: (2024)

ASAP: Advancing Semantic Alignment Promotes Multi-Modal Manipulation Detecting and Grounding
by: Zhang, Zhenxing, et al.
Published: (2024)

Learning Gaussian Data Augmentation in Feature Space for One-shot Object Detection in Manga
by: Taniguchi, Takara, et al.
Published: (2024)

Agent Journey Beyond RGB: Hierarchical Semantic-Spatial Representation Enrichment for Vision-and-Language Navigation
by: Zhang, Xuesong, et al.
Published: (2024)

SM3Det: A Unified Model for Multi-Modal Remote Sensing Object Detection
by: Li, Yuxuan, et al.
Published: (2024)

Segmentation-Based Attention Entropy: Detecting and Mitigating Object Hallucinations in Large Vision-Language Models
by: Song, Jiale, et al.
Published: (2026)

Multi-scale Bottleneck Transformer for Weakly Supervised Multimodal Violence Detection
by: Sun, Shengyang, et al.
Published: (2024)

X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs
by: Swetha, Sirnam, et al.
Published: (2024)

Context Guided Transformer Entropy Modeling for Video Compression
by: Tong, Junlong, et al.
Published: (2025)

Generalized Face Forgery Detection via Adaptive Learning for Pre-trained Vision Transformer
by: Luo, Anwei, et al.
Published: (2023)

M2ORT: Many-To-One Regression Transformer for Spatial Transcriptomics Prediction from Histopathology Images
by: Wang, Hongyi, et al.
Published: (2024)

Calibration & Reconstruction: Deep Integrated Language for Referring Image Segmentation
by: Yan, Yichen, et al.
Published: (2024)

Serial Low-rank Adaptation of Vision Transformer
by: Zhong, Houqiang, et al.
Published: (2025)

Mitigating Hallucination in Large Vision-Language Models via Adaptive Attention Calibration
by: Fazli, Mehrdad, et al.
Published: (2025)

DPC-VQA: Decoupling Quality Perception and Residual Calibration for Video Quality Assessment
by: Li, Xinyue, et al.
Published: (2026)

Nutrition Estimation for Dietary Management: A Transformer Approach with Depth Sensing
by: Kwan, Zhengyi, et al.
Published: (2024)

Rethinking Multi-Modal Object Detection from the Perspective of Mono-Modality Feature Learning
by: Zhao, Tianyi, et al.
Published: (2025)

Marker-Based Extrinsic Calibration Method for Accurate Multi-Camera 3D Reconstruction
by: Garcia-D'Urso, Nahuel, et al.
Published: (2025)