| Main Authors: | Wang, Zihe, Wang, Yihuan, Cui, Haiyang Yu. Zhiyong, Liao, Xiaojian, Wang, Chengcheng, Tian, Yonglin, Tong, Yongxin |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.16495 |
Similar Items
TPLLM: A Traffic Prediction Framework Based on Pretrained Large Language Models
by: Ren, Yilong, et al.
Published: (2024)
MindVL: Towards Efficient and Effective Training of Multimodal Large Language Models on Ascend NPUs
by: Chen, Feilong, et al.
Published: (2025)
TextHawk: Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models
by: Yu, Ya-Qi, et al.
Published: (2024)
Through the Theory of Mind's Eye: Reading Minds with Multimodal Video Large Language Models
by: Chen, Zhawnen, et al.
Published: (2024)
A Step Toward Federated Pretraining of Multimodal Large Language Models
by: Xiong, Baochen, et al.
Published: (2026)
MLLM-CL: Continual Learning for Multimodal Large Language Models
by: Zhao, Hongbo, et al.
Published: (2025)
Intelli-Planner: Towards Customized Urban Planning via Large Language Model Empowered Reinforcement Learning
by: Yong, Xixian, et al.
Published: (2026)
CCI4.0: A Bilingual Pretraining Dataset for Enhancing Reasoning in Large Language Models
by: Liu, Guang, et al.
Published: (2025)
KoMA: Knowledge-driven Multi-agent Framework for Autonomous Driving with Large Language Models
by: Jiang, Kemou, et al.
Published: (2024)
EchoFake: A Replay-Aware Dataset for Practical Speech Deepfake Detection
by: Zhang, Tong, et al.
Published: (2025)
OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving
by: Wang, Lening, et al.
Published: (2024)
Game-TARS: Pretrained Foundation Models for Scalable Generalist Multimodal Game Agents
by: Wang, Zihao, et al.
Published: (2025)
Can Large Language Models Express Uncertainty Like Human?
by: Tao, Linwei, et al.
Published: (2025)
StreetviewLLM: Extracting Geographic Information Using a Chain-of-Thought Multimodal Large Language Model
by: Li, Zongrong, et al.
Published: (2024)
AccidentGPT: Accident Analysis and Prevention from V2X Environmental Perception with Multi-modal Large Model
by: Wang, Lening, et al.
Published: (2023)
Multimodal Generative Retrieval Model with Staged Pretraining for Food Delivery on Meituan
by: Chen, Boyu, et al.
Published: (2026)
From Black Boxes to Transparent Minds: Evaluating and Enhancing the Theory of Mind in Multimodal Large Language Models
by: Li, Xinyang, et al.
Published: (2025)
Abstraction Generation for Generalized Planning with Pretrained Large Language Models
by: Cui, Zhenhe, et al.
Published: (2026)
PediaMind-R1: A Temperament-Aware Language Model for Personalized Early Childhood Care Reasoning via Cognitive Modeling and Preference Alignment
by: Zhang, Zihe, et al.
Published: (2025)
Malla: Demystifying Real-world Large Language Model Integrated Malicious Services
by: Lin, Zilong, et al.
Published: (2024)
Mind over Space: Can Multimodal Large Language Models Mentally Navigate?
by: Zhu, Qihui, et al.
Published: (2026)
Mind Scramble: Unveiling Large Language Model Psychology Via Typoglycemia
by: Yu, Miao, et al.
Published: (2024)
Discrete Diffusion in Large Language and Multimodal Models: A Survey
by: Yu, Runpeng, et al.
Published: (2025)
Harnessing Diversity for Important Data Selection in Pretraining Large Language Models
by: Zhang, Chi, et al.
Published: (2024)
MASS: Mathematical Data Selection via Skill Graphs for Pretraining Large Language Models
by: Li, Jiazheng, et al.
Published: (2025)
Parrot Mind: Towards Explaining the Complex Task Reasoning of Pretrained Large Language Models with Template-Content Structure
by: Yang, Haotong, et al.
Published: (2023)
Rec-GPT4V: Multimodal Recommendation with Large Vision-Language Models
by: Liu, Yuqing, et al.
Published: (2024)
JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse
by: Li, Muyao, et al.
Published: (2025)
Towards Agentic Recommender Systems in the Era of Multimodal Large Language Models
by: Huang, Chengkai, et al.
Published: (2025)
FinGPT-HPC: Efficient Pretraining and Finetuning Large Language Models for Financial Applications with High-Performance Computing
by: Liu, Xiao-Yang, et al.
Published: (2024)
Large Language Models in Operations Research: Methods, Applications, and Challenges
by: Wang, Yang, et al.
Published: (2025)
BackdoorMBTI: A Backdoor Learning Multimodal Benchmark Tool Kit for Backdoor Defense Evaluation
by: Yu, Haiyang, et al.
Published: (2024)
Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents
by: Wang, Zihao, et al.
Published: (2023)
MultiMind: Enhancing Werewolf Agents with Multimodal Reasoning and Theory of Mind
by: Zhang, Zheng, et al.
Published: (2025)
Evaluating Large Language Models on Multimodal Chemistry Olympiad Exams
by: Cui, Yiming, et al.
Published: (2025)
LMFusion: Adapting Pretrained Language Models for Multimodal Generation
by: Shi, Weijia, et al.
Published: (2024)
ConfTuner: Training Large Language Models to Express Their Confidence Verbally
by: Li, Yibo, et al.
Published: (2025)
Pretraining Large Language Models with NVFP4
by: NVIDIA, et al.
Published: (2025)
Probing the Robustness of Vision-Language Pretrained Models: A Multimodal Adversarial Attack Approach
by: Guan, Jiwei, et al.
Published: (2024)
A Tutorial on the Pretrain-Finetune Paradigm for Natural Language Processing
by: Wang, Yu, et al.
Published: (2024)