| Main Authors: | Shi, Xiaoming, Liu, Zeming, Lei, Yiming, Zhang, Chenkai, Leng, Haitao, Wang, Chuan, Liu, Qingjie, Che, Wanxiang, Liu, Shaoguo, Li, Size, Wang, Yunhong |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.06899 |
Similar Items
SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding
by: Zhang, Chenkai, et al.
Published: (2025)
GODBench: A Benchmark for Multimodal Large Language Models in Video Comment Art
by: Lei, Yiming, et al.
Published: (2025)
ContextQFormer: A New Context Modeling Method for Multi-Turn Multi-Modal Conversations
by: Lei, Yiming, et al.
Published: (2025)
Mix-Ecom: Towards Mixed-Type E-Commerce Dialogues with Complex Domain Rules
by: Zhou, Chenyu, et al.
Published: (2025)
STAMPsy: Towards SpatioTemporal-Aware Mixed-Type Dialogues for Psychological Counseling
by: Wang, Jieyi, et al.
Published: (2024)
Learn More, Forget Less: A Gradient-Aware Data Selection Approach for LLM
by: Liu, Yibai, et al.
Published: (2025)
A Survey on Remote Sensing Foundation Models: From Vision to Multimodality
by: Huang, Ziyue, et al.
Published: (2025)
A Survey on Data Synthesis and Augmentation for Large Language Models
by: Wang, Ke, et al.
Published: (2024)
EntroCut: Entropy-Guided Adaptive Truncation for Efficient Chain-of-Thought Reasoning in Small-scale Large Reasoning Models
by: Yan, Hongxi, et al.
Published: (2026)
HIPTrack: Visual Tracking with Historical Prompts
by: Cai, Wenrui, et al.
Published: (2023)
CtxMIM: Context-Enhanced Masked Image Modeling for Remote Sensing Image Understanding
by: Zhang, Mingming, et al.
Published: (2023)
SPMTrack: Spatio-Temporal Parameter-Efficient Fine-Tuning with Mixture of Experts for Scalable Visual Tracking
by: Cai, Wenrui, et al.
Published: (2025)
HiT: Building Mapping with Hierarchical Transformers
by: Zhang, Mingming, et al.
Published: (2023)
Multilingual Multi-Aspect Explainability Analyses on Machine Reading Comprehension Models
by: Cui, Yiming, et al.
Published: (2021)
2M-NER: Contrastive Learning for Multilingual and Multimodal NER with Language and Modal Fusion
by: Wang, Dongsheng, et al.
Published: (2024)
Semantic Enhanced Few-shot Object Detection
by: Wang, Zheng, et al.
Published: (2024)
Reasoning-Driven Anomaly Detection and Localization with Image-Level Supervision
by: Jin, Yizhou, et al.
Published: (2026)
SkeletonX: Data-Efficient Skeleton-based Action Recognition via Cross-sample Feature Aggregation
by: Zhang, Zongye, et al.
Published: (2025)
AttriPrompt: Dynamic Prompt Composition Learning for CLIP
by: Zhan, Qiqi, et al.
Published: (2025)
Towards Robust and Controllable Text-to-Motion via Masked Autoregressive Diffusion
by: Zhang, Zongye, et al.
Published: (2025)
Incremental Object Detection with CLIP
by: Huang, Ziyue, et al.
Published: (2023)
Lightweight Spatial Embedding for Vision-based 3D Occupancy Prediction
by: Zhang, Jinqing, et al.
Published: (2024)
MutDet: Mutually Optimizing Pre-training for Remote Sensing Object Detection
by: Huang, Ziyue, et al.
Published: (2024)
PACF: Prototype Augmented Compact Features for Improving Domain Adaptive Object Detection
by: Liu, Chenguang, et al.
Published: (2025)
Generic Knowledge Boosted Pre-training For Remote Sensing Images
by: Huang, Ziyue, et al.
Published: (2024)
Beyond Open Vocabulary: Multimodal Prompting for Object Detection in Remote Sensing Images
by: Yang, Shuai, et al.
Published: (2026)
OpenRSD: Towards Open-prompts for Object Detection in Remote Sensing Images
by: Huang, Ziyue, et al.
Published: (2025)
YOLC: You Only Look Clusters for Tiny Object Detection in Aerial Images
by: Liu, Chenguang, et al.
Published: (2024)
De-Simplifying Pseudo Labels to Enhancing Domain Adaptive Object Detection
by: Fu, Zehua, et al.
Published: (2025)
Deepfake Detection via Knowledge Injection
by: Li, Tonghui, et al.
Published: (2025)
LIBERO-X: Robustness Litmus for Vision-Language-Action Models
by: Wang, Guodong, et al.
Published: (2026)
MULTITAT: Benchmarking Multilingual Table-and-Text Question Answering
by: Zhang, Xuanliang, et al.
Published: (2025)
Kwai Keye-VL Technical Report
by: Kwai Keye Team, et al.
Published: (2025)
Context-Enhanced Detector For Building Detection From Remote Sensing Images
by: Huang, Ziyue, et al.
Published: (2023)
KVQ: Kwai Video Quality Assessment for Short-form Videos
by: Lu, Yiting, et al.
Published: (2024)
Medical Dialogue: A Survey of Categories, Methods, Evaluation and Challenges
by: Shi, Xiaoming, et al.
Published: (2024)
GeoBEV: Learning Geometric BEV Representation for Multi-view 3D Object Detection
by: Zhang, Jinqing, et al.
Published: (2024)
ResWorld: Temporal Residual World Model for End-to-End Autonomous Driving
by: Zhang, Jinqing, et al.
Published: (2026)
ActiveDC: Distribution Calibration for Active Finetuning
by: Xu, Wenshuai, et al.
Published: (2023)
Kwai Summary Attention Technical Report
by: Chu, Chenglong, et al.
Published: (2026)