| Main Authors: | Xia, Shiyu, Xiong, Junyu, Dong, Haoyu, Zhao, Jianbo, Tian, Yuzhang, Zhou, Mengyu, He, Yeye, Han, Shi, Zhang, Dongmei |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2405.16234 |
Similar Items
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models
by: Dong, Haoyu, et al.
Published: (2024)
TableLoRA: Low-rank Adaptation on Table Structure Understanding for Large Language Models
by: He, Xinyi, et al.
Published: (2025)
Table-LLM-Specialist: Language Model Specialists for Tables using Iterative Generator-Validator Fine-tuning
by: Xing, Junjie, et al.
Published: (2024)
MMTU: A Massive Multi-Task Table Understanding and Reasoning Benchmark
by: Xing, Junjie, et al.
Published: (2025)
SheetBrain: A Neuro-Symbolic Agent for Accurate Reasoning over Complex and Large Spreadsheets
by: Wang, Ziwei, et al.
Published: (2025)
Jupiter: Enhancing LLM Data Analysis Capabilities via Notebook and Inference-Time Value-Guided Search
by: Li, Shuocheng, et al.
Published: (2025)
Table Meets LLM: Can Large Language Models Understand Structured Table Data? A Benchmark and Empirical Study
by: Sui, Yuan, et al.
Published: (2023)
TablePilot: Recommending Human-Preferred Tabular Data Analysis with Large Language Models
by: Yi, Deyin, et al.
Published: (2025)
Auto-Formula: Recommend Formulas in Spreadsheets using Contrastive Learning for Table Representations
by: Chen, Sibei, et al.
Published: (2024)
Causality-guided Prompt Learning for Vision-language Models via Visual Granulation
by: Gao, Mengyu, et al.
Published: (2025)
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
by: Zhao, Shanshan, et al.
Published: (2025)
GeoWorld-VLM: Geometry from World Models for Vision-Language Models
by: Gu, Renjie, et al.
Published: (2026)
UWBench: A Comprehensive Vision-Language Benchmark for Underwater Understanding
by: Zhang, Da, et al.
Published: (2025)
LongVLM: Efficient Long Video Understanding via Large Language Models
by: Weng, Yuetian, et al.
Published: (2024)
Understanding Degradation with Vision Language Model
by: Lan, Guanzhou, et al.
Published: (2026)
Forging Vision Foundation Models for Autonomous Driving: Challenges, Methodologies, and Opportunities
by: Yan, Xu, et al.
Published: (2024)
LaCoVL-FER: Landmark-Guided Contrastive Learning Network with Vision-Language Enhancement for Facial Expression Recognition
by: Wang, Jiaxin, et al.
Published: (2026)
MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models
by: Hong, Wenyi, et al.
Published: (2025)
Structural Graph Probing of Vision-Language Models
by: He, Haoyu, et al.
Published: (2026)
VLA-Thinker: Boosting Vision-Language-Action Models through Thinking-with-Image Reasoning
by: Wang, Chaoyang, et al.
Published: (2026)
SuperRL: Reinforcement Learning with Supervision to Boost Language Model Reasoning
by: Liu, Yihao, et al.
Published: (2025)
SheetDesigner: MLLM-Powered Spreadsheet Layout Generation with Rule-Based and Vision-Based Reflection
by: Chen, Qin, et al.
Published: (2025)
ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models
by: Liu, Hongbo, et al.
Published: (2025)
Vision-Based Anti Unmanned Aerial Technology: Opportunities and Challenges
by: Ding, Guanghai, et al.
Published: (2025)
Large Vision-Language Models Get Lost in Attention
by: Xi, Gongli, et al.
Published: (2026)
SpatialBot: Precise Spatial Understanding with Vision Language Models
by: Cai, Wenxiao, et al.
Published: (2024)
Mema: Memory-Augmented Adapter for Enhanced Vision-Language Understanding
by: Liu, Ying, et al.
Published: (2026)
Hierarchical Question-Answering for Driving Scene Understanding Using Vision-Language Models
by: Mohamud, Safaa Abdullahi Moallim, et al.
Published: (2025)
Spatio-Temporal Foundation Models: Vision, Challenges, and Opportunities
by: Goodge, Adam, et al.
Published: (2025)
Supplementing Missing Visions via Dialog for Scene Graph Generations
by: Zhao, Zhenghao, et al.
Published: (2022)
PromptIntern: Saving Inference Costs by Internalizing Recurrent Prompt during Large Language Model Fine-tuning
by: Zou, Jiaru, et al.
Published: (2024)
EventSTU: Event-Guided Efficient Spatio-Temporal Understanding for Video Large Language Models
by: Xu, Wenhao, et al.
Published: (2025)
Large Language Models and Foundation Models in Smart Agriculture: Basics, Opportunities, and Challenges
by: Li, Jiajia, et al.
Published: (2023)
EVLM: An Efficient Vision-Language Model for Visual Understanding
by: Chen, Kaibing, et al.
Published: (2024)
Image Gradient-Aided Photometric Stereo Network
by: Wang, Kaixuan, et al.
Published: (2024)
Vision-Language Models Do Not Understand Negation
by: Alhamoud, Kumail, et al.
Published: (2025)
DocR1: Evidence Page-Guided GRPO for Multi-Page Document Understanding
by: Xiong, Junyu, et al.
Published: (2025)
EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture
by: He, Xin, et al.
Published: (2025)
ViSA-Enhanced Aerial VLN: A Visual-Spatial Reasoning Enhanced Framework for Aerial Vision-Language Navigation
by: Tong, Haoyu, et al.
Published: (2026)
Analyzing Fine-Grained Alignment and Enhancing Vision Understanding in Multimodal Language Models
by: Jiang, Jiachen, et al.
Published: (2025)