| Main Authors: | Chang, Hao; Wang, Zhihui; Wu, Lingxiang; An, Wei; Li, Boyang; Lin, Zaiping; Sheng, Weidong; Wang, Jinqiao |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.19640 |
Similar Items
Enhancing Chain of Thought Prompting in Large Language Models via Reasoning Patterns
by: Zhang, Yufeng, et al.
Published: (2024)
Australia's Wellbeing Framework: Is It Really ‘Measuring What Matters’?
by: Kate Sollis, et al.
Published: (2025)
VFaith: Do Large Multimodal Models Really Reason on Seen Images Rather than Previous Memories?
by: Yu, Jiachen, et al.
Published: (2025)
Enhancing Text-to-SQL Capabilities of Large Language Models via Domain Database Knowledge Injection
by: Ma, Xingyu, et al.
Published: (2024)
Griffon-G: Bridging Vision-Language and Vision-Centric Tasks via Large Multimodal Models
by: Zhan, Yufei, et al.
Published: (2024)
Rethinking Household Food Waste: What Really Matters in Everyday Food Management
by: Lucie Veselá, et al.
Published: (2026)
Multimodal Causal Reasoning Benchmark: Challenging Vision Large Language Models to Discern Causal Links Across Modalities
by: Li, Zhiyuan, et al.
Published: (2024)
What Really Matters in Matrix-Whitening Optimizers?
by: Frans, Kevin, et al.
Published: (2025)
PLUME: Latent Reasoning Based Universal Multimodal Embedding
by: He, Chenwei, et al.
Published: (2026)
PFDM: Parser-Free Virtual Try-on via Diffusion Model
by: Niu, Yunfang, et al.
Published: (2024)
What do Blind and Low-Vision People Really Want from Assistive Smart Devices? Comparison of the Literature with a Focus Study
by: Gamage, Bhanuka, et al.
Published: (2025)
Do LLMs Really Think Step-by-step In Implicit Reasoning?
by: Yu, Yijiong
Published: (2024)
Testing Autonomous Driving Systems -- What Really Matters and What Doesn't
by: Li, Changwen, et al.
Published: (2025)
Can MLLMs Reason Beyond Language? VisReason: A Comprehensive Benchmark for Vision-Centric Reasoning
by: Guo, Longteng, et al.
Published: (2026)
Visible-Thermal Tiny Object Detection: A Benchmark Dataset and Baselines
by: Ying, Xinyi, et al.
Published: (2024)
Focus on What Matters: Enhancing Medical Vision-Language Models with Automatic Attention Alignment Tuning
by: Chang, Aofei, et al.
Published: (2025)
Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought
by: Man, Yunze, et al.
Published: (2025)
Coordinated Beamforming for RIS-Empowered ISAC Systems over Secure Low-Altitude Networks
by: Wang, Chunjie, et al.
Published: (2025)
RRCANet: Recurrent Reusable-Convolution Attention Network for Infrared Small Target Detection
by: Liu, Yongxian, et al.
Published: (2025)
AnyDesign: Versatile Area Fashion Editing via Mask-Free Diffusion
by: Niu, Yunfang, et al.
Published: (2024)
LINK: Adaptive Modality Interaction for Audio-Visual Video Parsing
by: Wang, Langyu, et al.
Published: (2024)
CoNav: Collaborative Cross-Modal Reasoning for Embodied Navigation
by: Hao, Haihong, et al.
Published: (2025)
Shared Sky, Shared Spectrum: Coordinated Satellite-5G Networks for Low-Altitude Economy
by: Wang, Yanmin, et al.
Published: (2026)
What Really is Commonsense Knowledge?
by: Do, Quyet V., et al.
Published: (2024)
MathSight: A Benchmark Exploring Have Vision-Language Models Really Seen in University-Level Mathematical Reasoning?
by: Wang, Yuandong, et al.
Published: (2025)
Efficient Coordination Among Chinese Provinces in Managing Supply and Demand for Staple Crops
by: Yifei Wang, et al.
Published: (2024)
Vision-Centric Activation and Coordination for Multimodal Large Language Models
by: Wang, Yunnan, et al.
Published: (2025)
Towards the Vision-Sound-Language-Action Paradigm: The HEAR Framework for Sound-Centric Manipulation
by: Nie, Chang, et al.
Published: (2026)
Mapping Degeneration Meets Label Evolution: Learning Infrared Small Target Detection with Single Point Supervision
by: Ying, Xinyi, et al.
Published: (2023)
What Really Matters for Robust Multi-Sensor HD Map Construction?
by: Hao, Xiaoshuai, et al.
Published: (2025)
What Really Matters for Learning-based LiDAR-Camera Calibration
by: Huang, Shujuan, et al.
Published: (2025)
Earnings Quality and ESG Performance in Energy and Utilities: What Really Matters?
by: Antonios Persakis, et al.
Published: (2025)
Does the Question Really Matter? Training-Free Data Selection for Vision-Language SFT
by: Sun, Peng, et al.
Published: (2026)
Data-Centric AI Governance: Addressing the Limitations of Model-Focused Policies
by: Gupta, Ritwik, et al.
Published: (2024)
The Things That Really Matter
Published: (2022)
Topology-Aware Coordination for Multi-Functional Low-Altitude Wireless Networks
by: He, Jiajun, et al.
Published: (2026)
Robust Reasoning via Dynamic Token Selection for Distribution-Aligned Self-Distillation
by: Zhang, Ruiqi, et al.
Published: (2026)
Focusing on What Matters: Object-Agent-centric Tokenization for Vision Language Action models
by: Bendikas, Rokas, et al.
Published: (2025)
PixCLIP: Achieving Fine-grained Visual Language Understanding via Any-granularity Pixel-Text Alignment Learning
by: Xiao, Yicheng, et al.
Published: (2025)
Agentic AI for Low-Altitude Semantic Wireless Networks: An Energy Efficient Design
by: Zhao, Zhouxiang, et al.
Published: (2025)