Library Catalog

Bibliographic Details
Main Authors:	Zou, Shun; Zou, Yi; Zhang, Mingya; Luo, Shipeng; Chen, Zhihao; Gao, Guangwei
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2503.11995

Similar Items

Learning Dual-Domain Multi-Scale Representations for Single Image Deraining
by: Zou, Shun, et al.
Published: (2025)

MambaMIC: An Efficient Baseline for Microscopic Image Classification with State Space Models
by: Zou, Shun, et al.
Published: (2024)

Cross Paradigm Representation and Alignment Transformer for Image Deraining
by: Zou, Shun, et al.
Published: (2025)

Look, Focus, Act: Efficient and Robust Robot Learning via Human Gaze and Foveated Vision Transformers
by: Chuang, Ian, et al.
Published: (2025)

OCTAMamba: A State-Space Model Approach for Precision OCTA Vasculature Segmentation
by: Zou, Shun, et al.
Published: (2024)

Prefix-Adaptive Block Diffusion for Efficient Document Recognition
by: Chai, Mingxu, et al.
Published: (2026)

SkinMamba: A Precision Skin Lesion Segmentation Architecture with Cross-Scale Global State Modeling and Frequency Boundary Guidance
by: Zou, Shun, et al.
Published: (2024)

Reinforcement Learning-Based Prompt Template Stealing for Text-to-Image Models
by: Zou, Xiaotian
Published: (2025)

Adaptive Residual-Update Steering for Low-Overhead Hallucination Mitigation in Large Vision Language Models
by: Zou, Zhengtao, et al.
Published: (2025)

Adaptive Global-Local Representation Learning and Selection for Cross-Domain Facial Expression Recognition
by: Gao, Yuefang, et al.
Published: (2024)

From Canteen Food to Daily Meals: Generalizing Food Recognition to More Practical Scenarios
by: Liu, Guoshan, et al.
Published: (2024)

PromptHash: Affinity-Prompted Collaborative Cross-Modal Learning for Adaptive Hashing Retrieval
by: Zou, Qiang, et al.
Published: (2025)

JetViT: Efficient High-Resolution Vision Transformer with Post-Training Attention Search
by: Zou, Dongyun, et al.
Published: (2026)

Exploiting Ensemble Learning for Cross-View Isolated Sign Language Recognition
by: Wang, Fei, et al.
Published: (2025)

GEA: Generation-Enhanced Alignment for Text-to-Image Person Retrieval
by: Zou, Hao, et al.
Published: (2025)

Physics-Consistent Diffusion for Efficient Fluid Super-Resolution via Multiscale Residual Correction
by: Li, Zhihao, et al.
Published: (2026)

Towards Cross-Scale Attention and Surface Supervision for Fractured Bone Segmentation in CT
by: Zhou, Yu, et al.
Published: (2024)

KGS-GCN: Enhancing Sparse Skeleton Sensing via Kinematics-Driven Gaussian Splatting and Probabilistic Topology for Action Recognition
by: Chen, Yuhan, et al.
Published: (2026)

FAIR-ESI: Feature Adaptive Importance Refinement for Electrophysiological Source Imaging
by: Zou, Linyong, et al.
Published: (2026)

An Evolutionary Network Architecture Search Framework with Adaptive Multimodal Fusion for Hand Gesture Recognition
by: Xia, Yizhang, et al.
Published: (2024)

Mind the Discriminability Trap in Source-Free Cross-domain Few-shot Learning
by: Zhang, Zhenyu, et al.
Published: (2026)

Multi-Head Adaptive Graph Convolution Network for Sparse Point Cloud-Based Human Activity Recognition
by: Zakka, Vincent Gbouna, et al.
Published: (2025)

Decoupling Template Bias in CLIP: Harnessing Empty Prompts for Enhanced Few-Shot Learning
by: Zhang, Zhenyu, et al.
Published: (2025)

Abstracted Gaussian Prototypes for True One-Shot Concept Learning
by: Zou, Chelsea, et al.
Published: (2024)

HIMOSA: Efficient Remote Sensing Image Super-Resolution with Hierarchical Mixture of Sparse Attention
by: Liu, Yi, et al.
Published: (2025)

Training-Free Tunnel Defect Inspection and Engineering Interpretation via Visual Recalibration and Entity Reconstruction
by: Liu, Shipeng, et al.
Published: (2026)

Emotion Recognition Using Transformers with Masked Learning
by: Min, Seongjae, et al.
Published: (2024)

Efficient Video Diffusion with Sparse Information Transmission for Video Compression
by: Zhou, Mingde, et al.
Published: (2026)

SparseWorld: A Flexible, Adaptive, and Efficient 4D Occupancy World Model Powered by Sparse and Dynamic Queries
by: Dang, Chenxu, et al.
Published: (2025)

MPT-PAR: Mix-Parameters Transformer for Panoramic Activity Recognition
by: Gan, Wenqing, et al.
Published: (2024)

Federated Class-Incremental Learning with Prompting
by: Luo, Xin, et al.
Published: (2023)

TIDE: Temporal-Aware Sparse Autoencoders for Interpretable Diffusion Transformers in Image Generation
by: Huang, Victor Shea-Jay, et al.
Published: (2025)

Label-Efficient Deep Learning in Medical Image Analysis: Challenges and Future Directions
by: Jin, Cheng, et al.
Published: (2023)

Datasets and Recipes for Video Temporal Grounding via Reinforcement Learning
by: Chen, Ruizhe, et al.
Published: (2025)

Adaptive Dropout: Unleashing Dropout across Layers for Generalizable Image Super-Resolution
by: Xu, Hang, et al.
Published: (2025)

TACFN: Transformer-based Adaptive Cross-modal Fusion Network for Multimodal Emotion Recognition
by: Liu, Feng, et al.
Published: (2025)

DecomPose: Disentangling Cross-Category Optimization Contention for Category-Level 6D Object Pose Estimation
by: Gao, Yifan, et al.
Published: (2026)

Efficient Online Continual Learning in Sensor-Based Human Activity Recognition
by: Zhang, Yao, et al.
Published: (2025)

Adaptive Layer Selection for Efficient Vision Transformer Fine-Tuning
by: Devoto, Alessio, et al.
Published: (2024)

Compositional Few-Shot Class-Incremental Learning
by: Zou, Yixiong, et al.
Published: (2024)