| Main Authors: | Gao, Junyu; Zhang, Da; Wang, Qiyu; Zhao, Zhiyuan; Li, Xuelong |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2404.13992 |
Similar Items
FGAseg: Fine-Grained Pixel-Text Alignment for Open-Vocabulary Semantic Segmentation
by: Li, Bingyu, et al.
Published: (2025)
U3M: Unbiased Multiscale Modal Fusion Model for Multimodal Semantic Segmentation
by: Li, Bingyu, et al.
Published: (2024)
StitchFusion: Weaving Any Visual Modalities to Enhance Multimodal Semantic Segmentation
by: Li, Bingyu, et al.
Published: (2024)
MARIS: Marine Open-Vocabulary Instance Segmentation with Geometric Enhancement and Semantic Alignment
by: Li, Bingyu, et al.
Published: (2025)
Exploring Efficient Open-Vocabulary Segmentation in the Remote Sensing
by: Li, Bingyu, et al.
Published: (2025)
Exploring the Underwater World Segmentation without Extra Training
by: Li, Bingyu, et al.
Published: (2025)
An Open-Source Benchmark and Baseline for Multi-temporal Referring Segmentation
by: Li, Bingyu, et al.
Published: (2026)
Prototype-Based Low Altitude UAV Semantic Segmentation
by: Zhang, Da, et al.
Published: (2026)
Towards Realistic Open-Vocabulary Remote Sensing Segmentation: Benchmark and Baseline
by: Li, Bingyu, et al.
Published: (2026)
SVGen: Interpretable Vector Graphics Generation with Large Language Models
by: Wang, Feiyu, et al.
Published: (2025)
UWBench: A Comprehensive Vision-Language Benchmark for Underwater Understanding
by: Zhang, Da, et al.
Published: (2025)
NWPU-MOC: A Benchmark for Fine-grained Multi-category Object Counting in Aerial Images
by: Gao, Junyu, et al.
Published: (2024)
Boosting Quantitive and Spatial Awareness for Zero-Shot Object Counting
by: Zhang, Da, et al.
Published: (2026)
Do MLLMs Really See It: Reinforcing Visual Attention in Multimodal LLMs
by: Ou, Siqu, et al.
Published: (2026)
IntroSVG: Learning from Rendering Feedback for Text-to-SVG Generation via an Introspective Generator-Critic Framework
by: Wang, Feiyu, et al.
Published: (2026)
Real-Time Crowd Counting for Embedded Systems with Lightweight Architecture
by: Zhao, Zhiyuan, et al.
Published: (2025)
Exploring Scale Shift in Crowd Localization under the Context of Domain Generalization
by: Wang, Juncheng, et al.
Published: (2025)
From Captions to Rewards (CAREVL): Leveraging Large Language Model Experts for Enhanced Reward Modeling in Large Vision-Language Models
by: Dai, Muzhi, et al.
Published: (2025)
One-Shot Crowd Counting With Density Guidance For Scene Adaptation
by: Chen, Jiwei, et al.
Published: (2026)
Quantum-inspired Interpretable Deep Learning Architecture for Text Sentiment Analysis
by: Li, Bingyu, et al.
Published: (2024)
Single Domain Generalization for Crowd Counting
by: Peng, Zhuoxuan, et al.
Published: (2024)
ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation
by: Lan, Mengcheng, et al.
Published: (2024)
Embedding Generalized Semantic Knowledge into Few-Shot Remote Sensing Segmentation
by: Jia, Yuyu, et al.
Published: (2024)
Granular Ball Guided Stable Latent Domain Discovery for Domain-General Crowd Counting
by: Chen, Fan, et al.
Published: (2026)
DiffProxy: Multi-View Human Mesh Recovery via Diffusion-Generated Dense Proxies
by: Wang, Renke, et al.
Published: (2026)
Regressor-Segmenter Mutual Prompt Learning for Crowd Counting
by: Guo, Mingyue, et al.
Published: (2023)
Boosting SAM for Cross-Domain Few-Shot Segmentation via Conditional Point Sparsification
by: Nie, Jiahao, et al.
Published: (2026)
Proxy Denoising for Source-Free Domain Adaptation
by: Tang, Song, et al.
Published: (2024)
Frequency Domain Nuances Mining for Visible-Infrared Person Re-identification
by: Zhang, Yukang, et al.
Published: (2024)
Domain Game: Disentangle Anatomical Feature for Single Domain Generalized Segmentation
by: Chen, Hao, et al.
Published: (2024)
AHAP: Reconstructing Arbitrary Humans from Arbitrary Perspectives with Geometric Priors
by: Qiao, Xiaozhen, et al.
Published: (2026)
SamLP: A Customized Segment Anything Model for License Plate Detection
by: Ding, Haoxuan, et al.
Published: (2024)
Referring Video Object Segmentation with Cross-Modality Proxy Queries
by: Sun, Baoli, et al.
Published: (2025)
HuPrior3R: Incorporating Human Priors for Better 3D Dynamic Reconstruction from Monocular Videos
by: Xiong, Weitao, et al.
Published: (2025)
Towards Diverse Binary Segmentation via A Simple yet General Gated Network
by: Zhao, Xiaoqi, et al.
Published: (2023)
M4: Multi-Proxy Multi-Gate Mixture of Experts Network for Multiple Instance Learning in Histopathology Image Analysis
by: Li, Junyu, et al.
Published: (2024)
Ranking-based Adaptive Query Generation for DETRs in Crowded Pedestrian Detection
by: Gao, Feng, et al.
Published: (2023)
HPL-ESS: Hybrid Pseudo-Labeling for Unsupervised Event-based Semantic Segmentation
by: Jing, Linglin, et al.
Published: (2024)
DyCrowd: Towards Dynamic Crowd Reconstruction from a Large-scene Video
by: Wen, Hao, et al.
Published: (2025)
Open-Vocabulary Domain Generalization in Urban-Scene Segmentation
by: Zhao, Dong, et al.
Published: (2026)