Library Catalog

Bibliographic Details
Main Authors:	Pan, Li; Zhang, Yupei; Yang, Qiushi; Li, Tan; Xing, Xiaohan; Yeung, Maximus C. F.; Chen, Zhen
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2408.08527

Similar Items

Unified Multi-modal Diagnostic Framework with Reconstruction Pre-training and Heterogeneity-combat Tuning
by: Zhang, Yupei, et al.
Published: (2024)

Long-tailed Medical Diagnosis with Relation-aware Representation Learning and Iterative Classifier Calibration
by: Pan, Li, et al.
Published: (2025)

Bridging the Gap between Multi-focus and Multi-modal: A Focused Integration Framework for Multi-modal Image Fusion
by: Li, Xilai, et al.
Published: (2023)

Mastering Collaborative Multi-modal Data Selection: A Focus on Informativeness, Uniqueness, and Representativeness
by: Yu, Qifan, et al.
Published: (2024)

DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models
by: Cao, Yuhang, et al.
Published: (2024)

TokenFocus-VQA: Enhancing Text-to-Image Alignment with Position-Aware Focus and Multi-Perspective Aggregations on LVLMs
by: Zhang, Zijian, et al.
Published: (2025)

ProFocus: Proactive Perception and Focused Reasoning in Vision-and-Language Navigation
by: Xue, Wei, et al.
Published: (2026)

Generative Multi-Focus Image Fusion
by: Xie, Xinzhe, et al.
Published: (2025)

Knowing Where to Focus: Attention-Guided Alignment for Text-based Person Search
by: Tan, Lei, et al.
Published: (2024)

MaskFocus: Focusing Policy Optimization on Critical Steps for Masked Image Generation
by: Zhang, Guohui, et al.
Published: (2025)

Learning Object Focused Attention
by: Trivedy, Vivek, et al.
Published: (2025)

FocusDiff: Advancing Fine-Grained Text-Image Alignment for Autoregressive Visual Generation through RL
by: Pan, Kaihang, et al.
Published: (2025)

LocoMotion: Learning Motion-Focused Video-Language Representations
by: Doughty, Hazel, et al.
Published: (2024)

Multi-Grained Text-Guided Image Fusion for Multi-Exposure and Multi-Focus Scenarios
by: Tang, Mingwei, et al.
Published: (2025)

An Analysis Focused on Womens Safety: Can VAD Models Be Enhanced by a Multi-modal Dataset?
by: Sangeeta, et al.
Published: (2026)

GaussianFocus: Constrained Attention Focus for 3D Gaussian Splatting
by: Huang, Zexu, et al.
Published: (2025)

Task-Focused Memorization for Multimodal Agents
by: Zou, Tao, et al.
Published: (2026)

DualFocus: Depth from Focus with Spatio-Focal Dual Variational Constraints
by: Woo, Sungmin, et al.
Published: (2025)

FocusDPO: Dynamic Preference Optimization for Multi-Subject Personalized Image Generation via Adaptive Focus
by: Jin, Qiaoqiao, et al.
Published: (2025)

Acknowledging Focus Ambiguity in Visual Questions
by: Chen, Chongyan, et al.
Published: (2025)

Focusing on eye health
by: Linda Nilo Ohr
Published: (2020)

Focusable Monocular Depth Estimation
by: Du, Yuxin, et al.
Published: (2026)

Taking Notes Brings Focus? Towards Multi-Turn Multimodal Dialogue Learning
by: Liu, Jiazheng, et al.
Published: (2025)

Consensus Focus for Object Detection and minority classes
by: Salgado, Erik Isai Valle, et al.
Published: (2024)

Multi-Focused Video Group Activities Hashing
by: Qi, Zhongmiao, et al.
Published: (2025)

Focus-Consistent Multi-Level Aggregation for Compositional Zero-Shot Learning
by: Dai, Fengyuan, et al.
Published: (2024)

Focus-N-Fix: Region-Aware Fine-Tuning for Text-to-Image Generation
by: Xing, Xiaoying, et al.
Published: (2025)

DV-3DLane: End-to-end Multi-modal 3D Lane Detection with Dual-view Representation
by: Luo, Yueru, et al.
Published: (2024)

Focusing Image Generation to Mitigate Spurious Correlations
by: Li, Xuewei, et al.
Published: (2024)

DINO-VO: Learning Where to Focus for Enhanced State Estimation
by: Chen, Qi, et al.
Published: (2026)

FocusMAE: Gallbladder Cancer Detection from Ultrasound Videos with Focused Masked Autoencoders
by: Basu, Soumen, et al.
Published: (2024)

Focused Active Learning for Histopathological Image Classification
by: Schmidt, Arne, et al.
Published: (2024)

FBCIR: Balancing Cross-Modal Focuses in Composed Image Retrieval
by: Zhao, Chenchen, et al.
Published: (2026)

3D Focusing-and-Matching Network for Multi-Instance Point Cloud Registration
by: Zhang, Liyuan, et al.
Published: (2024)

Improving Single Domain-Generalized Object Detection: A Focus on Diversification and Alignment
by: Danish, Muhammad Sohail, et al.
Published: (2024)

A Novel Local Focusing Mechanism for Deepfake Detection Generalization
by: Li, Mingliang, et al.
Published: (2025)

FocusTrack: One-Stage Focus-and-Suppress Framework for 3D Point Cloud Object Tracking
by: Zhou, Sifan, et al.
Published: (2026)

FoR-Net: Learning to Focus on Hard Regions for Efficient Semantic Segmentation
by: Chan, Sheng-Wei, et al.
Published: (2026)

Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining
by: Ge, Zhiqi, et al.
Published: (2024)