| Main Authors: | Wei, Yijun, Wang, Jianyu, Xiao, Leping, Shi, Zuoqiang, Fu, Xing, Qiu, Lingyun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2508.02003 |
Similar Items
TransiT: Transient Transformer for Non-line-of-sight Videography
by: Li, Ruiqian, et al.
Published: (2025)
Curvature regularization for Non-line-of-sight Imaging from Under-sampled Data
by: Ding, Rui, et al.
Published: (2023)
Wavelet-based Global Orientation and Surface Reconstruction for Point Clouds
by: Ma, Yueji, et al.
Published: (2025)
Correlation Matching Transformation Transformers for UHD Image Restoration
by: Wang, Cong, et al.
Published: (2024)
Topology preserving Image segmentation using the iterative convolution-thresholding method
by: Deng, Lingyun, et al.
Published: (2025)
NSFW-Classifier Guided Prompt Sanitization for Safe Text-to-Image Generation
by: Xie, Yu, et al.
Published: (2025)
SafeCtrl: Region-Based Safety Control for Text-to-Image Diffusion via Detect-Then-Suppress
by: Zhang, Lingyun, et al.
Published: (2025)
A comprehensive study of time-of-flight non-line-of-sight imaging
by: Marco, Julio, et al.
Published: (2026)
Deep Learning with Data Privacy via Residual Perturbation
by: Tao, Wenqi, et al.
Published: (2024)
EchoPilot: Training-Free Ultrasound Video Segmentation via Scale-Space Semantic Prompting and Reliability-Gated Memory
by: Xiao, Ruiqiang, et al.
Published: (2026)
Unifying and Enhancing Graph Transformers via a Hierarchical Mask Framework
by: Xing, Yujie, et al.
Published: (2025)
SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation
by: Xing, Zhaohu, et al.
Published: (2024)
Restoring Real-World Images with an Internal Detail Enhancement Diffusion Model
by: Xiao, Peng, et al.
Published: (2025)
Dynamic Memory Transformer for Hyperspectral Image Classification
by: Ahmad, Muhammad
Published: (2025)
Hyper-Local Deformable Transformers for Text Spotting on Historical Maps
by: Lin, Yijun, et al.
Published: (2025)
Memory-efficient Low-latency Remote Photoplethysmography through Temporal-Spatial State Space Duality
by: Wang, Kegang, et al.
Published: (2025)
Chain-of-Talkers (CoTalk): Fast Human Annotation of Dense Image Captions
by: Shen, Yijun, et al.
Published: (2025)
UniLumos: Fast and Unified Image and Video Relighting with Physics-Plausible Feedback
by: Liu, Ropeway, et al.
Published: (2025)
CoMemo: LVLMs Need Image Context with Image Memory
by: Liu, Shi, et al.
Published: (2025)
Motion-aware Memory Network for Fast Video Salient Object Detection
by: Zhao, Xing, et al.
Published: (2022)
Vivim: a Video Vision Mamba for Medical Video Segmentation
by: Yang, Yijun, et al.
Published: (2024)
GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training
by: Wei, Tong, et al.
Published: (2025)
Demystify Transformers & Convolutions in Modern Image Deep Networks
by: Hu, Xiaowei, et al.
Published: (2022)
Concept Replacer: Replacing Sensitive Concepts in Diffusion Models via Precision Localization
by: Zhang, Lingyun, et al.
Published: (2024)
SeedEdit 3.0: Fast and High-Quality Generative Image Editing
by: Wang, Peng, et al.
Published: (2025)
GuardT2I: Defending Text-to-Image Models from Adversarial Prompts
by: Yang, Yijun, et al.
Published: (2024)
SCOUT: Fast Spectral CT Imaging in Ultra LOw-data Regimes via PseUdo-label GeneraTion
by: Wei, Guoquan, et al.
Published: (2026)
Transformer-Driven Inverse Problem Transform for Fast Blind Hyperspectral Image Dehazing
by: Tang, Po-Wei, et al.
Published: (2025)
FlexID: Training-Free Flexible Identity Injection via Intent-Aware Modulation for Text-to-Image Generation
by: Li, Guandong, et al.
Published: (2026)
Efficient and Scalable Chinese Vector Font Generation via Component Composition
by: Song, Jinyu, et al.
Published: (2024)
A Fast Text-Driven Approach for Generating Artistic Content
by: Lupascu, Marian, et al.
Published: (2022)
Integrating Sequence and Image Modeling in Irregular Medical Time Series Through Self-Supervised Learning
by: Chen, Liuqing, et al.
Published: (2025)
WeDetect: Fast Open-Vocabulary Object Detection as Retrieval
by: Fu, Shenghao, et al.
Published: (2025)
Towards Fine-Grained Robustness: Attention-Guided Test-Time Prompt Tuning for Vision-Language Models
by: Hai, Jia-Wei, et al.
Published: (2026)
Rethinking Query-based Transformer for Continual Image Segmentation
by: Zhu, Yuchen, et al.
Published: (2025)
Memory-Augmented Dual-Decoder Networks for Multi-Class Unsupervised Anomaly Detection
by: Xing, Jingyu, et al.
Published: (2025)
Hidden in plain sight: VLMs overlook their visual representations
by: Fu, Stephanie, et al.
Published: (2025)
SLAM-Former: Putting SLAM into One Transformer
by: Yuan, Yijun, et al.
Published: (2025)
GTR-Turbo: Merged Checkpoint is Secretly a Free Teacher for Agentic VLM Training
by: Wei, Tong, et al.
Published: (2025)
PixArt-$α$: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
by: Chen, Junsong, et al.
Published: (2023)