| Main Authors: | Huang, Xian-Hong, Su, Hui-Kai, Sun, Chi-Chia, Hsieh, Jun-Wei |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2511.05474 |
Similar Items
RepSFNet: A Single Fusion Network with Structural Reparameterization for Crowd Counting
by: Achmadiah, Mas Nurul, et al.
Published: (2026)
Fast-COS: A Fast One-Stage Object Detector Based on Reparameterized Attention Vision Transformer for Autonomous Driving
by: Setyawan, Novendra, et al.
Published: (2025)
TinyFormer: Preserving Tiny Objects in YOLO-DETR Hybrid Real-time Detectors
by: Hsieh, Jun-Wei, et al.
Published: (2026)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images
by: Jiang, Xiaozheng, et al.
Published: (2025)
MOSA: Music Motion with Semantic Annotation Dataset for Cross-Modal Music Processing
by: Huang, Yu-Fen, et al.
Published: (2024)
COMO: Cross-Mamba Interaction and Offset-Guided Fusion for Multimodal Object Detection
by: Liu, Chang, et al.
Published: (2024)
AUV-Fusion: Cross-Modal Adversarial Fusion of User Interactions and Visual Perturbations Against VARS
by: Ling, Hai, et al.
Published: (2025)
COXNet: Cross-Layer Fusion with Adaptive Alignment and Scale Integration for RGBT Tiny Object Detection
by: Peng, Peiran, et al.
Published: (2025)
DQ-DETR: DETR with Dynamic Query for Tiny Object Detection
by: Huang, Yi-Xin, et al.
Published: (2024)
A DeNoising FPN With Transformer R-CNN for Tiny Object Detection
by: Liu, Hou-I, et al.
Published: (2024)
SONAR: Semantic-Object Navigation with Aggregated Reasoning through a Cross-Modal Inference Paradigm
by: Wang, Yao, et al.
Published: (2025)
InterFusion: Text-Driven Generation of 3D Human-Object Interaction
by: Dai, Sisi, et al.
Published: (2024)
Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder
by: Dai, Yusheng, et al.
Published: (2023)
Integrating Object Detection Modality into Visual Language Model for Enhanced Autonomous Driving Agent
by: He, Linfeng, et al.
Published: (2024)
Interacting Null Sources in Different Geometries
by: Hsieh, Chia-Li
Published: (2024)
MicroViTv2: Beyond the FLOPS for Edge Energy-Friendly Vision Transformers
by: Setyawan, Novendra, et al.
Published: (2026)
FaceLiVTv2: An Improved Hybrid Architecture for Efficient Mobile Face Recognition
by: Setyawan, Novendra, et al.
Published: (2026)
FaceLiVT: Face Recognition using Linear Vision Transformer with Structural Reparameterization For Mobile Device
by: Setyawan, Novendra, et al.
Published: (2025)
MicroViT: A Vision Transformer with Low Complexity Self Attention for Edge Device
by: Setyawan, Novendra, et al.
Published: (2025)
Contrast-Guided Cross-Modal Distillation for Thermal Object Detection
by: Kim, SiWoo, et al.
Published: (2025)
Energy-Efficient Fast Object Detection on Edge Devices for IoT Systems
by: Achmadiah, Mas Nurul, et al.
Published: (2026)
Dual-Domain Homogeneous Fusion with Cross-Modal Mamba and Progressive Decoder for 3D Object Detection
by: Hu, Xuzhong, et al.
Published: (2025)
Thermal-Det: Language-Guided Cross-Modal Distillation for Open-Vocabulary Thermal Object Detection
by: Ranasinghe, Yasiru, et al.
Published: (2026)
HGSFusion: Radar-Camera Fusion with Hybrid Generation and Synchronization for 3D Object Detection
by: Gu, Zijian, et al.
Published: (2024)
MANTA: A Large-Scale Multi-View and Visual-Text Anomaly Detection Dataset for Tiny Objects
by: Fan, Lei, et al.
Published: (2024)
RCTDistill: Cross-Modal Knowledge Distillation Framework for Radar-Camera 3D Object Detection with Temporal Fusion
by: Bang, Geonho, et al.
Published: (2025)
Visual Decision-Making in Early Childhood Nutrition: Taiwanese Parents' Infant Formula Choices via Eye-Tracking and Hierarchical Decision Modeling
by: Chia-Yen Hsieh
Published: (2026)
High-Precision Transformer-Based Visual Servoing for Humanoid Robots in Aligning Tiny Objects
by: Xue, Jialong, et al.
Published: (2025)
Seg the HAB: Language-Guided Geospatial Algae Bloom Reasoning and Segmentation
by: Hsieh, Patterson, et al.
Published: (2025)
STMI: Segmentation-Guided Token Modulation with Cross-Modal Hypergraph Interaction for Multi-Modal Object Re-Identification
by: Xu, Xingguo, et al.
Published: (2026)
Cross-Modal Bottleneck Fusion For Noise Robust Audio-Visual Speech Recognition
by: Ok, Seaone, et al.
Published: (2026)
Similarity Distance-Based Label Assignment for Tiny Object Detection
by: Shi, Shuohao, et al.
Published: (2024)
Bridging the Scale Gap: Balanced Tiny and General Object Detection in Remote Sensing Imagery
by: Zhao, Zhicheng, et al.
Published: (2025)
UFO-DETR: Frequency-Guided End-to-End Detector for UAV Tiny Objects
by: Chen, Yuankai, et al.
Published: (2026)
ParFormer: A Vision Transformer with Parallel Mixer and Sparse Channel Attention Patch Embedding
by: Setyawan, Novendra, et al.
Published: (2024)
Cross-Modal Purification and Fusion for Small-Object RGB-D Transmission-Line Defect Detection
by: Cui, Jiaming, et al.
Published: (2026)
HiddenObject: Modality-Agnostic Fusion for Multimodal Hidden Object Detection
by: Song, Harris, et al.
Published: (2025)
Cross-modal Offset-guided Dynamic Alignment and Fusion for Weakly Aligned UAV Object Detection
by: Zongzhen, Liu, et al.
Published: (2025)
VIFO: Visual Feature Empowered Multivariate Time Series Forecasting with Cross-Modal Fusion
by: Wang, Yanlong, et al.
Published: (2025)
Monocular Depth Estimation and Segmentation for Transparent Object with Iterative Semantic and Geometric Fusion
by: Liu, Jiangyuan, et al.
Published: (2025)