:: Library Catalog

Bibliographic Details
Main Authors:	Huo, Da; Kastner, Marc A.; Liu, Tingwei; Kawanishi, Yasutomo; Hirayama, Takatsugu; Komamizu, Takahiro; Ide, Ichiro
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2511.22310

Similar Items

Tracking Small Birds by Detection Candidate Region Filtering and Detection History-aware Association
by: Liu, Tingwei, et al.
Published: (2024)

One-Stage Open-Vocabulary Temporal Action Detection Leveraging Temporal Multi-scale and Action Label Features
by: Nguyen, Trung Thanh, et al.
Published: (2024)

Action Selection Learning for Multi-label Multi-view Action Recognition
by: Nguyen, Trung Thanh, et al.
Published: (2024)

MultiTSF: Transformer-based Sensor Fusion for Human-Centric Multi-view and Multi-modal Action Recognition
by: Nguyen, Trung Thanh, et al.
Published: (2025)

MultiSensor-Home: A Wide-area Multi-modal Multi-view Dataset for Action Recognition and Transformer-based Sensor Fusion
by: Nguyen, Trung Thanh, et al.
Published: (2025)

View-aware Cross-modal Distillation for Multi-view Action Recognition
by: Nguyen, Trung Thanh, et al.
Published: (2025)

ForestMamba: Sparse Mamba with Geometry-guided Queries for 3D Forest Point Cloud Segmentation
by: Nguyen, Trung Thanh, et al.
Published: (2026)

Q-Adapter: Visual Query Adapter for Extracting Textually-related Features in Video Captioning
by: Chen, Junan, et al.
Published: (2025)

Multi-proposal Collaboration and Multi-task Training for Weakly-supervised Video Moment Retrieval
by: Zhang, Bolin, et al.
Published: (2026)

Investigating Conceptual Blending of a Diffusion Model for Improving Nonword-to-Image Generation
by: Matsuhira, Chihaya, et al.
Published: (2024)

Multi-View Video-Based Learning: Leveraging Weak Labels for Frame-Level Perception
by: John, Vijay, et al.
Published: (2024)

Static and Dynamic Graph Alignment Network for Temporal Video Grounding
by: Hu, Zhanjie, et al.
Published: (2026)

FROSS: Faster-than-Real-Time Online 3D Semantic Scene Graph Generation from RGB-D Images
by: Hou, Hao-Yu, et al.
Published: (2025)

A Gaze-grounded Visual Question Answering Dataset for Clarifying Ambiguous Japanese Questions
by: Inadumi, Shun, et al.
Published: (2024)

Class-agnostic 3D Segmentation by Granularity-Consistent Automatic 2D Mask Tracking
by: Wang, Juan, et al.
Published: (2025)

Action Motifs: Self-Supervised Hierarchical Representation of Human Body Movements
by: Kinoshita, Genki, et al.
Published: (2026)

CBAM-SwinT-BL: Small Rail Surface Defect Detection Method Based on Swin Transformer with Block Level CBAM Enhancement
by: Zhao, Jiayi, et al.
Published: (2024)

EA-Swin: An Embedding-Agnostic Swin Transformer for AI-Generated Video Detection
by: Mai, Hung, et al.
Published: (2026)

REACH: Hand Pose Estimation from Room Corners
by: Nakamura, Shu, et al.
Published: (2026)

SAM-Swin: SAM-Driven Dual-Swin Transformers with Adaptive Lesion Enhancement for Laryngo-Pharyngeal Tumor Detection
by: Wei, Jia, et al.
Published: (2024)

DarSwin: Distortion Aware Radial Swin Transformer
by: Athwale, Akshaya, et al.
Published: (2023)

Classifying Deepfakes Using Swin Transformers
by: Xi, Aprille J., et al.
Published: (2025)

SparseSwin: Swin Transformer with Sparse Transformer Block
by: Pinasthika, Krisna, et al.
Published: (2023)

MV-Swin-T: Mammogram Classification with Multi-view Swin Transformer
by: Sarker, Sushmita, et al.
Published: (2024)

DB SwinT: A Dual-Branch Swin Transformer Network for Road Extraction in Optical Remote Sensing Imagery
by: He, Zongyang, et al.
Published: (2026)

SwinShadow: Shifted Window for Ambiguous Adjacent Shadow Detection
by: Wang, Yonghui, et al.
Published: (2024)

Cross-DINO: Cross the Deep MLP and Transformer for Small Object Detection
by: Cao, Guiping, et al.
Published: (2025)

SwinTextUNet: Integrating CLIP-Based Text Guidance into Swin Transformer U-Nets for Medical Image Segmentation
by: Yeafi, Ashfak, et al.
Published: (2026)

GeoFormer: A Lightweight Swin Transformer for Joint Building Height and Footprint Estimation from Sentinel Imagery
by: Jinzhen, Han, et al.
Published: (2026)

Brain Hematoma Marker Recognition Using Multitask Learning: SwinTransformer and Swin-Unet
by: Hirata, Kodai, et al.
Published: (2025)

SwinIFS: Landmark Guided Swin Transformer For Identity Preserving Face Super Resolution
by: Kausar, Habiba, et al.
Published: (2026)

Flash Window Attention: speedup the attention computation for Swin Transformer
by: Zhang, Zhendong
Published: (2025)

Barlow-Swin: Toward a novel siamese-based segmentation architecture using Swin-Transformers
by: Haftlang, Morteza Kiani, et al.
Published: (2025)

Enhancing Image Authenticity Detection: Swin Transformers and Color Frame Analysis for CGI vs. Real Images
by: Mehta, Preeti, et al.
Published: (2024)

Cross-Layer Feature Pyramid Transformer for Small Object Detection in Aerial Images
by: Du, Zewen, et al.
Published: (2024)

Video Frame Interpolation for Polarization via Swin-Transformer
by: Huang, Feng, et al.
Published: (2024)

PT-DETR: Small Target Detection Based on Partially-Aware Detail Focus
by: Huo, Bingcong, et al.
Published: (2025)

MIC-BEV: Multi-Infrastructure Camera Bird's-Eye-View Transformer with Relation-Aware Fusion for 3D Object Detection
by: Zhang, Yun, et al.
Published: (2025)

When Swin Transformer Meets KANs: An Improved Transformer Architecture for Medical Image Segmentation
by: Sapkota, Nishchal, et al.
Published: (2025)

A Flying Bird Object Detection Method for Surveillance Video
by: Sun, Ziwei, et al.
Published: (2024)