Bibliographic Details
Main Authors:	Xie, Zequan; Zeng, Weiming; Chen, Yunhua; Ling, Sichang; Chen, Tongyang; Xiao, Jinsheng
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; 68T07; I.2.10
Online Access:	https://arxiv.org/abs/2605.08270

Similar Items

PathFormer: A Transformer with 3D Grid Constraints for Digital Twin Robot-Arm Trajectory Generation
by: Alanazi, Ahmed, et al.
Published: (2025)

Deep Learning From Routine Histology Improves Risk Stratification for Biochemical Recurrence in Prostate Cancer
by: Grisi, Clément, et al.
Published: (2026)

Banana Ripeness Level Classification using a Simple CNN Model Trained with Real and Synthetic Datasets
by: Chuquimarca, Luis, et al.
Published: (2025)

Predictive Modeling of Maritime Radar Data Using Transformer Architecture
by: Qesaraku, Bjorna, et al.
Published: (2025)

Beyond RGB: Leveraging Vision Transformers for Thermal Weapon Segmentation
by: Kambhatla, Akhila, et al.
Published: (2025)

An Active Learning Pipeline for Biomedical Image Instance Segmentation with Minimal Human Intervention
by: Zhao, Shuo, et al.
Published: (2025)

TowerVision: Understanding and Improving Multilinguality in Vision-Language Models
by: Viveiros, André G., et al.
Published: (2025)

DOD-SA: Infrared-Visible Decoupled Object Detection with Single-Modality Annotations
by: Jin, Hang, et al.
Published: (2025)

Detection of Intracranial Hemorrhage for Trauma Patients
by: Sanner, Antoine P., et al.
Published: (2024)

GIIM: Graph-based Learning of Inter- and Intra-view Dependencies for Multi-view Medical Image Diagnosis
by: Sam, Tran Bao, et al.
Published: (2026)

Conterfactual Generative Zero-Shot Semantic Segmentation
by: Shen, Feihong, et al.
Published: (2021)

Attend, Distill, Detect: Attention-aware Entropy Distillation for Anomaly Detection
by: Jena, Sushovan, et al.
Published: (2024)

Learning Sign Language Representation using CNN LSTM, 3DCNN, CNN RNN LSTM and CCN TD
by: Louison, Nikita, et al.
Published: (2024)

NFIG: Multi-Scale Autoregressive Image Generation via Frequency Ordering
by: Huang, Zhihao, et al.
Published: (2025)

Dual-sensing driving detection model
by: K, Leon C. C., et al.
Published: (2025)

CADE 2.5 - ZeResFDG: Frequency-Decoupled, Rescaled and Zero-Projected Guidance for SD/SDXL Latent Diffusion Models
by: Rychkovskiy, Denis
Published: (2025)

From Heuristics to Data: Quantifying Site Planning Layout Indicators with Deep Learning and Multi-Modal Data
by: Cao, Qian, et al.
Published: (2025)

HuMoCon: Concept Discovery for Human Motion Understanding
by: Fang, Qihang, et al.
Published: (2025)

Robust Noise Attenuation via Adaptive Pooling of Transformer Outputs
by: Brothers, Greyson
Published: (2025)

GLL: A Differentiable Graph Learning Layer for Neural Networks
by: Brown, Jason, et al.
Published: (2024)

Advancing Brain Tumor Segmentation via Attention-based 3D U-Net Architecture and Digital Image Processing
by: Gad, Eyad, et al.
Published: (2025)

MB-DSMIL-CL-PL: Scalable Weakly Supervised Ovarian Cancer Subtype Classification and Localisation Using Contrastive and Prototype Learning with Frozen Patch Features
by: Jenkins, Marcus, et al.
Published: (2026)

Anonymization-Enhanced Privacy Protection for Mobile GUI Agents: Available but Invisible
by: Zhao, Lepeng, et al.
Published: (2026)

ShapBPT: Image Feature Attributions Using Data-Aware Binary Partition Trees
by: Rashid, Muhammad, et al.
Published: (2026)

Perception-Consistency Multimodal Large Language Models Reasoning via Caption-Regularized Policy Optimization
by: Tu, Songjun, et al.
Published: (2025)

Autoregressive Medical Image Segmentation via Next-Scale Mask Prediction
by: Chen, Tao, et al.
Published: (2025)

On Memory: A comparison of memory mechanisms in world models
by: Laird, Eli J., et al.
Published: (2025)

LRVS-Fashion: Extending Visual Search with Referring Instructions
by: Lepage, Simon, et al.
Published: (2023)

E Pluribus Unum Interpretable Convolutional Neural Networks
by: Dimas, George, et al.
Published: (2022)

Harmony: A Joint Self-Supervised and Weakly-Supervised Framework for Learning General Purpose Visual Representations
by: Baharoon, Mohammed, et al.
Published: (2024)

AQFusionNet: Multimodal Deep Learning for Air Quality Index Prediction with Imagery and Sensor Data
by: Kushal, Koushik Ahmed, et al.
Published: (2025)

TableMoE: Neuro-Symbolic Routing for Structured Expert Reasoning in Multimodal Table Understanding
by: Zhang, Junwen, et al.
Published: (2025)

Unpacking Hateful Memes: Presupposed Context and False Claims
by: Cai, Weibin, et al.
Published: (2025)

Comparison of Neural Models for X-ray Image Classification in COVID-19 Detection
by: Togni, Jimi, et al.
Published: (2025)

Self-Attention And Beyond the Infinite: Towards Linear Transformers with Infinite Self-Attention
by: Roffo, Giorgio, et al.
Published: (2026)

Exploring the Capabilities of Large Language Model Encoders for Image-Text Retrieval in Chest X-rays
by: Ko, Hanbin, et al.
Published: (2025)

JVLGS: Joint Vision-Language Gas Leak Segmentation
by: Zhao, Xinlong, et al.
Published: (2025)

Point, Detect, Count: Multi-Task Medical Image Understanding with Instruction-Tuned Vision-Language Models
by: Gautam, Sushant, et al.
Published: (2025)

SCAPE: Searching Conceptual Architecture Prompts using Evolution
by: Lim, Soo Ling, et al.
Published: (2024)

The SAGES Critical View of Safety Challenge: A Global Benchmark for AI-Assisted Surgical Quality Assessment
by: Alapatt, Deepak, et al.
Published: (2025)