| Main Authors: | Zhang, Xin, Meng, Dongdong, Li, Sheng |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.05897 |
Similar Items
ShadowMaskFormer: Mask Augmented Patch Embeddings for Shadow Removal
by: Li, Zhuohao, et al.
Published: (2024)
ActFormer: Scalable Collaborative Perception via Active Queries
by: Huang, Suozhi, et al.
Published: (2024)
Mask4Former: Mask Transformer for 4D Panoptic Segmentation
by: Yilmaz, Kadir, et al.
Published: (2023)
Q-Mask: Query-driven Causal Masks for Text Anchoring in OCR-Oriented Vision-Language Models
by: Xu, Longwei, et al.
Published: (2026)
SeaFormer++: Squeeze-enhanced Axial Transformer for Mobile Visual Recognition
by: Wan, Qiang, et al.
Published: (2023)
TopoMaskV3: 3D Mask Head with Dense Offset and Height Predictions for Road Topology Understanding
by: Kalfaoglu, Muhammet Esat, et al.
Published: (2026)
PolaFormer: Polarity-aware Linear Attention for Vision Transformers
by: Meng, Weikang, et al.
Published: (2025)
O2Former: Direction-Aware and Multi-Scale Query Enhancement for SAR Ship Instance Segmentation
by: Gao, F., et al.
Published: (2025)
Semi-Supervised Multi-Modal Medical Image Segmentation for Complex Situations
by: Meng, Dongdong, et al.
Published: (2025)
LightFormer: A lightweight and efficient decoder for remote sensing image segmentation
by: Chen, Sihang, et al.
Published: (2025)
Robust feature knowledge distillation for enhanced performance of lightweight crack segmentation models
by: Chen, Zhaohui, et al.
Published: (2024)
TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision
by: Zhai, Yukun, et al.
Published: (2023)
RayFormer: Improving Query-Based Multi-Camera 3D Object Detection via Ray-Centric Strategies
by: Chu, Xiaomeng, et al.
Published: (2024)
Prior2Former -- Evidential Modeling of Mask Transformers for Assumption-Free Open-World Panoptic Segmentation
by: Schmidt, Sebastian, et al.
Published: (2025)
Vision-guided and Mask-enhanced Adaptive Denoising for Prompt-based Image Editing
by: Wang, Kejie, et al.
Published: (2024)
EchoMask: Speech-Queried Attention-based Mask Modeling for Holistic Co-Speech Motion Generation
by: Zhang, Xiangyue, et al.
Published: (2025)
PEM: Prototype-based Efficient MaskFormer for Image Segmentation
by: Cavagnero, Niccolò, et al.
Published: (2024)
QORT-Former: Query-optimized Real-time Transformer for Understanding Two Hands Manipulating Objects
by: Ismayilzada, Elkhan, et al.
Published: (2025)
CurveFormer++: 3D Lane Detection by Curve Propagation with Temporal Curve Queries and Attention
by: Bai, Yifeng, et al.
Published: (2024)
Textual Query-Driven Mask Transformer for Domain Generalized Segmentation
by: Pak, Byeonghyun, et al.
Published: (2024)
Seg2Any: Open-set Segmentation-Mask-to-Image Generation with Precise Shape and Semantic Control
by: Li, Danfeng, et al.
Published: (2025)
Deep learning based infrared small object segmentation: Challenges and future directions
by: Yang, Zhengeng, et al.
Published: (2025)
Soften the Mask: Adaptive Temporal Soft Mask for Efficient Dynamic Facial Expression Recognition
by: Li, Meng-zhu, et al.
Published: (2025)
Multi-organ segmentation: a progressive exploration of learning paradigms under scarce annotation
by: Li, Shiman, et al.
Published: (2023)
AuthFormer: Adaptive Multimodal biometric authentication transformer for middle-aged and elderly people
by: rui, Yang, et al.
Published: (2024)
CmFNet: Cross-modal Fusion Network for Weakly-supervised Segmentation of Medical Images
by: Meng, Dongdong, et al.
Published: (2025)
CalibFormer: A Transformer-based Automatic LiDAR-Camera Calibration Network
by: Xiao, Yuxuan, et al.
Published: (2023)
Multi-label Sewer Pipe Defect Recognition with Mask Attention Feature Enhancement and Label Correlation Learning
by: Zuo, Xin, et al.
Published: (2024)
SmartEraser: Remove Anything from Images using Masked-Region Guidance
by: Jiang, Longtao, et al.
Published: (2025)
Multi-Masked Querying Network for Robust Emotion Recognition from Incomplete Multi-Modal Physiological Signals
by: Xu, Geng-Xin, et al.
Published: (2025)
CLISC: Bridging CLIP and SAM by enhanced CAM for unsupervised brain tumor segmentation
by: Ma, Xiaochuan, et al.
Published: (2025)
Semi-supervised 2D Human Pose Estimation via Adaptive Keypoint Masking
by: Meng, Kexin, et al.
Published: (2024)
Semantics-enhanced Cross-modal Masked Image Modeling for Vision-Language Pre-training
by: Liu, Haowei, et al.
Published: (2024)
FocusDiT: Masking Queries in Diffusion Transformers for Fine-grained Image Generation
by: Fang, Xueji, et al.
Published: (2026)
Comparing YOLOv8 and Mask R-CNN for instance segmentation in complex orchard environments
by: Sapkota, Ranjan, et al.
Published: (2023)
MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation
by: Rong, Fu, et al.
Published: (2025)
MOSMOS: Multi-organ segmentation facilitated by medical report supervision
by: Tian, Weiwei, et al.
Published: (2024)
Label-efficient multi-organ segmentation with a diffusion model
by: Huang, Yongzhi, et al.
Published: (2024)
ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention
by: He, Chenhang, et al.
Published: (2024)
MaskGaussian: Adaptive 3D Gaussian Representation from Probabilistic Masks
by: Liu, Yifei, et al.
Published: (2024)