:: Library Catalog

Bibliographic Details
Main Authors:	Liu, Jiazhen; Feng, Mingkuan; Chen, Long
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2512.00395

Similar Items

Segmentation as A Plug-and-Play Capability for Frozen Multimodal LLMs
by: Liu, Jiazhen, et al.
Published: (2025)

Better with Less: Tackling Heterogeneous Multi-Modal Image Joint Pretraining via Conditioned and Degraded Masked Autoencoder
by: Peng, Bowen, et al.
Published: (2026)

Point Transformer V3: Simpler, Faster, Stronger
by: Wu, Xiaoyang, et al.
Published: (2023)

MLLM-based Textual Explanations for Face Comparison
by: Sony, Redwan, et al.
Published: (2026)

Contrastive Masked Autoencoders are Stronger Vision Learners
by: Huang, Zhicheng, et al.
Published: (2022)

Textual Query-Driven Mask Transformer for Domain Generalized Segmentation
by: Pak, Byeonghyun, et al.
Published: (2024)

Stronger is not better: Better Augmentations in Contrastive Learning for Medical Image Segmentation
by: Idris, Azeez, et al.
Published: (2025)

MP-PolarMask: A Faster and Finer Instance Segmentation for Concave Images
by: Wang, Ke-Lei, et al.
Published: (2024)

Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance
by: Lin, Liting, et al.
Published: (2024)

Empowering Small VLMs to Think with Dynamic Memorization and Exploration
by: Liu, Jiazhen, et al.
Published: (2025)

Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT
by: Zhuo, Le, et al.
Published: (2024)

Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation
by: Li, Yongkang, et al.
Published: (2024)

Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation
by: Ye, Zilyu, et al.
Published: (2024)

Instance-aware Image Colorization with Controllable Textual Descriptions and Segmentation Masks
by: An, Yanru, et al.
Published: (2025)

Decoding by Perturbation: Mitigating MLLM Hallucinations via Dynamic Textual Perturbation
by: Jia, Sihang, et al.
Published: (2026)

Striving for Faster and Better: A One-Layer Architecture with Auto Re-parameterization for Low-Light Image Enhancement
by: An, Nan, et al.
Published: (2025)

Medal S: Spatio-Textual Prompt Model for Medical Segmentation
by: Shi, Pengcheng, et al.
Published: (2025)

Rethinking MLLM Itself as a Segmenter with a Single Segmentation Token
by: Zhang, Anqi, et al.
Published: (2026)

Towards Better & Faster Autoregressive Image Generation: From the Perspective of Entropy
by: Ma, Xiaoxiao, et al.
Published: (2025)

Fundus-R1: Training a Fundus-Reading MLLM with Knowledge-Aware Reasoning on Public Data
by: Deng, Yuchuan, et al.
Published: (2026)

Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation
by: Wei, Zhixiang, et al.
Published: (2023)

Faster and Stronger: When ANN-SNN Conversion Meets Parallel Spiking Calculation
by: Hao, Zecheng, et al.
Published: (2024)

MobileVLM V2: Faster and Stronger Baseline for Vision Language Model
by: Chu, Xiangxiang, et al.
Published: (2024)

Augment to Segment: Tackling Pixel-Level Imbalance in Wheat Disease and Pest Segmentation
by: Wei, Tianqi, et al.
Published: (2025)

MaskMed: Decoupled Mask and Class Prediction for Medical Image Segmentation
by: Xie, Bin, et al.
Published: (2025)

GMT: Guided Mask Transformer for Leaf Instance Segmentation
by: Chen, Feng, et al.
Published: (2024)

Better Together: Leveraging Unpaired Multimodal Data for Stronger Unimodal Models
by: Gupta, Sharut, et al.
Published: (2025)

Faster Diffusion Action Segmentation
by: Wang, Shuaibing, et al.
Published: (2024)

Magnifier Prompt: Tackling Multimodal Hallucination via Extremely Simple Instructions
by: Fu, Yuhan, et al.
Published: (2024)

Faster and Better 3D Splatting via Group Training
by: Wang, Chengbo, et al.
Published: (2024)

SimGen: A Diffusion-Based Framework for Simultaneous Surgical Image and Segmentation Mask Generation
by: Bhat, Aditya, et al.
Published: (2025)

Moment and Highlight Detection via MLLM Frame Segmentation
by: Jiwanta, I Putu Andika Bagas, et al.
Published: (2025)

Explore More, Learn Better: Parallel MLLM Embeddings under Mutual Information Minimization
by: Wang, Zhicheng, et al.
Published: (2025)

Decoupled Seg Tokens Make Stronger Reasoning Video Segmenter and Grounder
by: Jisheng, Dang, et al.
Published: (2025)

Towards Energy-Efficiency by Navigating the Trilemma of Energy, Latency, and Accuracy
by: Tian, Boyuan, et al.
Published: (2024)

FlashMesh: Faster and Better Autoregressive Mesh Synthesis via Structured Speculation
by: Shen, Tingrui, et al.
Published: (2025)

Enhanced Generative Data Augmentation for Semantic Segmentation via Stronger Guidance
by: Che, Quang-Huy, et al.
Published: (2024)

Medical Referring Image Segmentation via Next-Token Mask Prediction
by: Chen, Xinyu, et al.
Published: (2025)

ANTS: Adaptive Negative Textual Space Shaping for OOD Detection via Test-Time MLLM Understanding and Reasoning
by: Zhu, Wenjie, et al.
Published: (2025)

Nuanced Emotion Recognition Based on a Segment-based MLLM Framework Leveraging Qwen3-Omni for AH Detection
by: Tang, Liang, et al.
Published: (2026)