:: Library Catalog

Bibliographic Details
Main Authors:	Wei, Yijun, Wang, Jianyu, Xiao, Leping, Shi, Zuoqiang, Fu, Xing, Qiu, Lingyun
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2508.02003

Similar Items

TransiT: Transient Transformer for Non-line-of-sight Videography
by: Li, Ruiqian, et al.
Published: (2025)

Curvature regularization for Non-line-of-sight Imaging from Under-sampled Data
by: Ding, Rui, et al.
Published: (2023)

Wavelet-based Global Orientation and Surface Reconstruction for Point Clouds
by: Ma, Yueji, et al.
Published: (2025)

Correlation Matching Transformation Transformers for UHD Image Restoration
by: Wang, Cong, et al.
Published: (2024)

Topology preserving Image segmentation using the iterative convolution-thresholding method
by: Deng, Lingyun, et al.
Published: (2025)

NSFW-Classifier Guided Prompt Sanitization for Safe Text-to-Image Generation
by: Xie, Yu, et al.
Published: (2025)

SafeCtrl: Region-Based Safety Control for Text-to-Image Diffusion via Detect-Then-Suppress
by: Zhang, Lingyun, et al.
Published: (2025)

A comprehensive study of time-of-flight non-line-of-sight imaging
by: Marco, Julio, et al.
Published: (2026)

Deep Learning with Data Privacy via Residual Perturbation
by: Tao, Wenqi, et al.
Published: (2024)

EchoPilot: Training-Free Ultrasound Video Segmentation via Scale-Space Semantic Prompting and Reliability-Gated Memory
by: Xiao, Ruiqiang, et al.
Published: (2026)

Unifying and Enhancing Graph Transformers via a Hierarchical Mask Framework
by: Xing, Yujie, et al.
Published: (2025)

SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation
by: Xing, Zhaohu, et al.
Published: (2024)

Restoring Real-World Images with an Internal Detail Enhancement Diffusion Model
by: Xiao, Peng, et al.
Published: (2025)

Dynamic Memory Transformer for Hyperspectral Image Classification
by: Ahmad, Muhammad
Published: (2025)

Hyper-Local Deformable Transformers for Text Spotting on Historical Maps
by: Lin, Yijun, et al.
Published: (2025)

Memory-efficient Low-latency Remote Photoplethysmography through Temporal-Spatial State Space Duality
by: Wang, Kegang, et al.
Published: (2025)

Chain-of-Talkers (CoTalk): Fast Human Annotation of Dense Image Captions
by: Shen, Yijun, et al.
Published: (2025)

UniLumos: Fast and Unified Image and Video Relighting with Physics-Plausible Feedback
by: Liu, Ropeway, et al.
Published: (2025)

CoMemo: LVLMs Need Image Context with Image Memory
by: Liu, Shi, et al.
Published: (2025)

Motion-aware Memory Network for Fast Video Salient Object Detection
by: Zhao, Xing, et al.
Published: (2022)

Vivim: a Video Vision Mamba for Medical Video Segmentation
by: Yang, Yijun, et al.
Published: (2024)

GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training
by: Wei, Tong, et al.
Published: (2025)

Demystify Transformers & Convolutions in Modern Image Deep Networks
by: Hu, Xiaowei, et al.
Published: (2022)

Concept Replacer: Replacing Sensitive Concepts in Diffusion Models via Precision Localization
by: Zhang, Lingyun, et al.
Published: (2024)

SeedEdit 3.0: Fast and High-Quality Generative Image Editing
by: Wang, Peng, et al.
Published: (2025)

GuardT2I: Defending Text-to-Image Models from Adversarial Prompts
by: Yang, Yijun, et al.
Published: (2024)

SCOUT: Fast Spectral CT Imaging in Ultra LOw-data Regimes via PseUdo-label GeneraTion
by: Wei, Guoquan, et al.
Published: (2026)

Transformer-Driven Inverse Problem Transform for Fast Blind Hyperspectral Image Dehazing
by: Tang, Po-Wei, et al.
Published: (2025)

FlexID: Training-Free Flexible Identity Injection via Intent-Aware Modulation for Text-to-Image Generation
by: Li, Guandong, et al.
Published: (2026)

Efficient and Scalable Chinese Vector Font Generation via Component Composition
by: Song, Jinyu, et al.
Published: (2024)

A Fast Text-Driven Approach for Generating Artistic Content
by: Lupascu, Marian, et al.
Published: (2022)

Integrating Sequence and Image Modeling in Irregular Medical Time Series Through Self-Supervised Learning
by: Chen, Liuqing, et al.
Published: (2025)

WeDetect: Fast Open-Vocabulary Object Detection as Retrieval
by: Fu, Shenghao, et al.
Published: (2025)

Towards Fine-Grained Robustness: Attention-Guided Test-Time Prompt Tuning for Vision-Language Models
by: Hai, Jia-Wei, et al.
Published: (2026)

Rethinking Query-based Transformer for Continual Image Segmentation
by: Zhu, Yuchen, et al.
Published: (2025)

Memory-Augmented Dual-Decoder Networks for Multi-Class Unsupervised Anomaly Detection
by: Xing, Jingyu, et al.
Published: (2025)

Hidden in plain sight: VLMs overlook their visual representations
by: Fu, Stephanie, et al.
Published: (2025)

SLAM-Former: Putting SLAM into One Transformer
by: Yuan, Yijun, et al.
Published: (2025)

GTR-Turbo: Merged Checkpoint is Secretly a Free Teacher for Agentic VLM Training
by: Wei, Tong, et al.
Published: (2025)

PixArt-$α$: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
by: Chen, Junsong, et al.
Published: (2023)