Library Catalog

Bibliographic Details
Main Authors:	Jiang, Liangyan; Zhu, Chuang; Chen, Yanxu
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2407.15708

Similar Items

MV-Swin-T: Mammogram Classification with Multi-view Swin Transformer
by: Sarker, Sushmita, et al.
Published: (2024)

Swin-TUNA: A Novel PEFT Approach for Accurate Food Image Segmentation
by: Chen, Haotian, et al.
Published: (2025)

SceneGen: Single-Image 3D Scene Generation in One Feedforward Pass
by: Meng, Yanxu, et al.
Published: (2025)

USP-Gaussian: Unifying Spike-based Image Reconstruction, Pose Correction and Gaussian Splatting
by: Chen, Kang, et al.
Published: (2024)

Yuan-TecSwin: A text conditioned Diffusion model with Swin-transformer blocks
by: Wu, Shaohua, et al.
Published: (2025)

SwinIFS: Landmark Guided Swin Transformer For Identity Preserving Face Super Resolution
by: Kausar, Habiba, et al.
Published: (2026)

Barlow-Swin: Toward a novel siamese-based segmentation architecture using Swin-Transformers
by: Haftlang, Morteza Kiani, et al.
Published: (2025)

STSA: Spatial-Temporal Semantic Alignment for Visual Dubbing
by: Ding, Zijun, et al.
Published: (2025)

Text Embedded Swin-UMamba for DeepLesion Segmentation
by: Cheng, Ruida, et al.
Published: (2025)

TCJA-SNN: Temporal-Channel Joint Attention for Spiking Neural Networks
by: Zhu, Rui-Jie, et al.
Published: (2022)

SparseSwin: Swin Transformer with Sparse Transformer Block
by: Pinasthika, Krisna, et al.
Published: (2023)

Spatial-Temporal Deep Embedding for Vehicle Trajectory Reconstruction from High-Angle Video
by: Zhang, Tianya T., et al.
Published: (2022)

SpikeNVS: Enhancing Novel View Synthesis from Blurry Images via Spike Camera
by: Dai, Gaole, et al.
Published: (2024)

3D Nephrographic Image Synthesis in CT Urography with the Diffusion Model and Swin Transformer
by: Yu, Hongkun, et al.
Published: (2025)

GCA-SUNet: A Gated Context-Aware Swin-UNet for Exemplar-Free Counting
by: Wu, Yuzhe, et al.
Published: (2024)

WinT3R: Window-Based Streaming Reconstruction with Camera Token Pool
by: Li, Zizun, et al.
Published: (2025)

Temporal Reversal Regularization for Spiking Neural Networks: Hybrid Spatio-Temporal Invariance for Generalization
by: Zuo, Lin, et al.
Published: (2024)

RS-FME-SwinT: A Novel Feature Map Enhancement Framework Integrating Customized SwinT with Residual and Spatial CNN for Monkeypox Diagnosis
by: Khan, Saddam Hussain, et al.
Published: (2024)

SpikeVAEDiff: Neural Spike-based Natural Visual Scene Reconstruction via VD-VAE and Versatile Diffusion
by: Li, Jialu, et al.
Published: (2026)

RSwinV2-MD: An Enhanced Residual SwinV2 Transformer for Monkeypox Detection from Skin Images
by: Iqbal, Rashid, et al.
Published: (2026)

Temporal-adaptive Weight Quantization for Spiking Neural Networks
by: Zhang, Han, et al.
Published: (2025)

Point3R: Streaming 3D Reconstruction with Explicit Spatial Pointer Memory
by: Wu, Yuqi, et al.
Published: (2025)

SweetTok: Semantic-Aware Spatial-Temporal Tokenizer for Compact Video Discretization
by: Tan, Zhentao, et al.
Published: (2024)

SF-Mamba: Rethinking State Space Model for Vision
by: Yoshimura, Masakazu, et al.
Published: (2026)

uSF: Learning Neural Semantic Field with Uncertainty
by: Skorokhodov, Vsevolod, et al.
Published: (2023)

Leveraging Swin Transformer for Local-to-Global Weakly Supervised Semantic Segmentation
by: Ahmadi, Rozhan, et al.
Published: (2024)

DST-Net: A Dual-Stream Transformer with Illumination-Independent Feature Guidance and Multi-Scale Spatial Convolution for Low-Light Image Enhancement
by: Shi, Yicui, et al.
Published: (2026)

Noisy Label Processing for Classification: A Survey
by: Li, Mengting, et al.
Published: (2024)

CheX-DS: Improving Chest X-ray Image Classification with Ensemble Learning Based on DenseNet and Swin Transformer
by: Li, Xinran, et al.
Published: (2025)

SwinTF3D: A Lightweight Multimodal Fusion Approach for Text-Guided 3D Medical Image Segmentation
by: Khan, Hasan Faraz, et al.
Published: (2025)

Audio-Sync Video Generation with Multi-Stream Temporal Control
by: Weng, Shuchen, et al.
Published: (2025)

StreamGaze: Gaze-Guided Temporal Reasoning and Proactive Understanding in Streaming Videos
by: Lee, Daeun, et al.
Published: (2025)

Select2Col: Leveraging Spatial-Temporal Importance of Semantic Information for Efficient Collaborative Perception
by: Liu, Yuntao, et al.
Published: (2023)

Unified Spatial-Temporal Edge-Enhanced Graph Networks for Pedestrian Trajectory Prediction
by: Li, Ruochen, et al.
Published: (2025)

SatSwinMAE: Efficient Autoencoding for Multiscale Time-series Satellite Imagery
by: Nakayama, Yohei, et al.
Published: (2024)

DualSwinUnet++: An Enhanced Swin-Unet Architecture With Dual Decoders For PTMC Segmentation
by: Dialameh, Maryam, et al.
Published: (2024)

7DGS: Unified Spatial-Temporal-Angular Gaussian Splatting
by: Gao, Zhongpai, et al.
Published: (2025)

REL-SF4PASS: Panoramic Semantic Segmentation with REL Depth Representation and Spherical Fusion
by: Li, Xuewei, et al.
Published: (2026)

Spatial Transcriptomics as Images for Large-Scale Pretraining
by: Zhu, Yishun, et al.
Published: (2026)

HieraTok: Multi-Scale Visual Tokenizer Improves Image Reconstruction and Generation
by: Chen, Cong, et al.
Published: (2025)