| Main Authors: | Shu, Zishan, Wu, Juntong, Yan, Wei, Liu, Xudong, Zhang, Hongyu, Liu, Chang, Mao, Youdong, Chen, Jie |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.08602 |
Similar Items
WaveFormer: A Lightweight Transformer Model for sEMG-based Gesture Recognition
by: Chen, Yanlong, et al.
Published: (2025)
WaveFormer: A 3D Transformer with Wavelet-Driven Feature Representation for Efficient Medical Image Segmentation
by: Hasan, Md Mahfuz Al, et al.
Published: (2025)
LoFormer: Local Frequency Transformer for Image Deblurring
by: Mao, Xintian, et al.
Published: (2024)
DisentangleFormer: Spatial-Channel Decoupling for Multi-Channel Vision
by: Liao, Jiashu, et al.
Published: (2025)
MedVH: Towards Systematic Evaluation of Hallucination for Large Vision Language Models in the Medical Context
by: Gu, Zishan, et al.
Published: (2024)
Spatio-temporal Decoupled Knowledge Compensator for Few-Shot Action Recognition
by: Qu, Hongyu, et al.
Published: (2026)
Transformer-Based Person Search with High-Frequency Augmentation and Multi-Wave Mixing
by: Shu, Qilin, et al.
Published: (2025)
EasyVFX: Frequency-Driven Decoupling for Resource-Efficient VFX Generation
by: Ma, Yue, et al.
Published: (2026)
Decoupling Continual Semantic Segmentation
by: Guo, Yifu, et al.
Published: (2025)
Wave-Former: Through-Occlusion 3D Reconstruction via Wireless Shape Completion
by: Dodds, Laura, et al.
Published: (2025)
FreqEdit: Preserving High-Frequency Features for Robust Multi-Turn Image Editing
by: Liao, Yucheng, et al.
Published: (2025)
Exploring Invariance in Images through One-way Wave Equations
by: Chen, Yinpeng, et al.
Published: (2023)
RT-DETRv4: Painlessly Furthering Real-Time Object Detection with Vision Foundation Models
by: Liao, Zijun, et al.
Published: (2025)
FreqTrack: Frequency Learning based Vision Transformer for RGB-Event Object Tracking
by: You, Jinlin, et al.
Published: (2026)
Holmes: Towards Effective and Harmless Model Ownership Verification to Personalized Large Vision Models via Decoupling Common Features
by: Zhu, Linghui, et al.
Published: (2025)
TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba
by: Ma, Xiaowen, et al.
Published: (2024)
PartFormer: Awakening Latent Diverse Representation from Vision Transformer for Object Re-Identification
by: Tan, Lei, et al.
Published: (2024)
Bootstrapping SparseFormers from Vision Foundation Models
by: Gao, Ziteng, et al.
Published: (2023)
AIA: Rethinking Architecture Decoupling Strategy In Unified Multimodal Model
by: Zheng, Dian, et al.
Published: (2025)
WaveDM: Wavelet-Based Diffusion Models for Image Restoration
by: Huang, Yi, et al.
Published: (2023)
Decoupled Residual Denoising Diffusion Models for Unified and Data Efficient Image-to-Image Translation
by: Lin, Ziyue, et al.
Published: (2026)
WaveFace: Authentic Face Restoration with Efficient Frequency Recovery
by: Miao, Yunqi, et al.
Published: (2024)
VistaFormer: Scalable Vision Transformers for Satellite Image Time Series Segmentation
by: MacDonald, Ezra, et al.
Published: (2024)
PanopticPartFormer++: A Unified and Decoupled View for Panoptic Part Segmentation
by: Li, Xiangtai, et al.
Published: (2023)
MetaFormer Baselines for Vision
by: Yu, Weihao, et al.
Published: (2022)
SonoVision: A Computer Vision Approach for Helping Visually Challenged Individuals Locate Objects with the Help of Sound Cues
by: Zishan, Md Abu Obaida, et al.
Published: (2025)
Bayesian Test-Time Adaptation for Vision-Language Models
by: Zhou, Lihua, et al.
Published: (2025)
VISTA-Bench: Do Vision-Language Models Really Understand Visualized Text as Well as Pure Text?
by: Liu, Qing'an, et al.
Published: (2026)
Rascene: High-Fidelity 3D Scene Imaging with mmWave Communication Signals
by: Song, Kunzhe, et al.
Published: (2026)
DAP: Doppler-aware Point Network for Heterogeneous mmWave Action Recognition
by: Lin, Jiaying, et al.
Published: (2026)
Automatic Neuronal Activity Segmentation in Fast Four Dimensional Spatio-Temporal Fluorescence Imaging using Bayesian Approach
by: Li, Ran, et al.
Published: (2025)
Hierarchical Dual-Subspace Decoupling for Continual Learning in Vision-Language Models
by: Qin, Mengxin, et al.
Published: (2026)
FDDet: Frequency-Decoupling for Boundary Refinement in Temporal Action Detection
by: Zhu, Xinnan, et al.
Published: (2025)
Decoupled and Interactive Regression Modeling for High-performance One-stage 3D Object Detection
by: Xiao, Weiping, et al.
Published: (2024)
PainFormer: a Vision Foundation Model for Automatic Pain Assessment
by: Gkikas, Stefanos, et al.
Published: (2025)
RoadFormer+: Delivering RGB-X Scene Parsing through Scale-Aware Information Decoupling and Advanced Heterogeneous Feature Fusion
by: Huang, Jianxin, et al.
Published: (2024)
Deep Height Decoupling for Precise Vision-based 3D Occupancy Prediction
by: Wu, Yuan, et al.
Published: (2024)
WaveSFNet: A Wavelet-Based Codec and Spatial-Frequency Dual-Domain Gating Network for Spatiotemporal Prediction
by: Cai, Xinyong, et al.
Published: (2026)
High-Resolution Underwater Camouflaged Object Detection: GBU-UCOD Dataset and Topology-Aware and Frequency-Decoupled Networks
by: Wu, Wenji, et al.
Published: (2026)
WeedRepFormer: Reparameterizable Vision Transformers for Real-Time Waterhemp Segmentation and Gender Classification
by: Sarker, Toqi Tahamid, et al.
Published: (2026)