| Main Authors: | Wu, Hongxuan, Zhang, Yukun, Zhou, Xueqing |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.15580 |
Similar Items
Continuous-Time Attention: PDE-Guided Mechanisms for Long-Sequence Transformers
by: Zhang, Yukun, et al.
Published: (2025)
Where to Add PDE Diffusion in Transformers
by: Zhang, Yukun, et al.
Published: (2025)
Understanding Transformer Architecture through Continuous Dynamics: A Partial Differential Equation Perspective
by: Zhang, Yukun, et al.
Published: (2024)
Information-Theoretic Greedy Layer-wise Training for Traffic Sign Recognition
by: Lyu, Shuyan, et al.
Published: (2025)
How Instruction and Reasoning Data shape Post-Training: Data Quality through the Lens of Layer-wise Gradients
by: Li, Ming, et al.
Published: (2025)
Exploring Layer-wise Information Effectiveness for Post-Training Quantization in Small Language Models
by: Xiao, He, et al.
Published: (2025)
Disentangling Recall and Reasoning in Transformer Models through Layer-wise Attention and Activation Analysis
by: Fartale, Harshwardhan, et al.
Published: (2025)
From Perception to Cognition: A Survey of Vision-Language Interactive Reasoning in Multimodal Large Language Models
by: Zhou, Chenyue, et al.
Published: (2025)
Hierarchical Alignment: Surgical Fine-Tuning via Functional Layer Specialization in Large Language Models
by: Zhang, Yukun, et al.
Published: (2025)
A Layer-wise Analysis of Supervised Fine-Tuning
by: Zhao, Qinghua, et al.
Published: (2026)
Fake News Detection and Manipulation Reasoning via Large Vision-Language Models
by: Jin, Ruihan, et al.
Published: (2024)
Read and Think: An Efficient Step-wise Multimodal Language Model for Document Understanding and Reasoning
by: Zhang, Jinxu
Published: (2024)
Layer-wise Regularized Dropout for Neural Language Models
by: Ni, Shiwen, et al.
Published: (2024)
Information-Theoretic Graph Fusion with Vision-Language-Action Model for Policy Reasoning and Dual Robotic Control
by: Li, Shunlei, et al.
Published: (2025)
ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models
by: Wu, Yanan, et al.
Published: (2024)
SDAR-VL: Stable and Efficient Block-wise Diffusion for Vision-Language Understanding
by: Cheng, Shuang, et al.
Published: (2025)
Addition in Four Movements: Mapping Layer-wise Information Trajectories in LLMs
by: Yan, Yao
Published: (2025)
SlimGPT: Layer-wise Structured Pruning for Large Language Models
by: Ling, Gui, et al.
Published: (2024)
Multi-Scale Manifold Alignment for Interpreting Large Language Models: A Unified Information-Geometric Framework
by: Zhang, Yukun, et al.
Published: (2025)
Layer-wise Positional Bias in Short-Context Language Modeling
by: Rahimi, Maryam, et al.
Published: (2026)
CoViPAL: Layer-wise Contextualized Visual Token Pruning for Large Vision-Language Models
by: Tang, Zicong, et al.
Published: (2025)
A Novel Multimodal RUL Framework for Remaining Useful Life Estimation with Layer-wise Explanations
by: Razzaq, Waleed, et al.
Published: (2025)
Information-Theoretic Constraints for Continual Vision-Language-Action Alignment
by: Zhao, Libang, et al.
Published: (2026)
Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games
by: Piedrahita, David Guzman, et al.
Published: (2025)
Improve Decoding Factuality by Token-wise Cross Layer Entropy of Large Language Models
by: Wu, Jialiang, et al.
Published: (2025)
A Language-Signal-Vision Multimodal Framework for Multitask Cardiac Analysis
by: Zhang, Yuting, et al.
Published: (2025)
LegalReasoner: Step-wised Verification-Correction for Legal Judgment Reasoning
by: Shi, Weijie, et al.
Published: (2025)
A Medical Multimodal Diagnostic Framework Integrating Vision-Language Models and Logic Tree Reasoning
by: Zang, Zelin, et al.
Published: (2025)
RAG-R1: Incentivizing the Search and Reasoning Capabilities of LLMs through Multi-query Parallelism
by: Tan, Zhiwen, et al.
Published: (2025)
A Measure-Theoretic Analysis of Reasoning: Structural Generalization and Approximation Limits
by: Zhang, Yuyang, et al.
Published: (2026)
Jailbreaks on Vision Language Model via Multimodal Reasoning
by: Noheria, Aarush, et al.
Published: (2026)
Think or Not? Exploring Thinking Efficiency in Large Reasoning Models via an Information-Theoretic Lens
by: Yong, Xixian, et al.
Published: (2025)
LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management
by: Xiong, Yi, et al.
Published: (2024)
Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data
by: Liu, Xiao, et al.
Published: (2024)
Enhanced Multimodal Hate Video Detection via Channel-wise and Modality-wise Fusion
by: Zhang, Yinghui, et al.
Published: (2025)
Skip-It? Theoretical Conditions for Layer Skipping in Vision-Language Models
by: Hartman, Max, et al.
Published: (2025)
DeepVIS: Bridging Natural Language and Data Visualization Through Step-wise Reasoning
by: Shuai, Zhihao, et al.
Published: (2025)
Integrating Chemistry Knowledge in Large Language Models via Prompt Engineering
by: Liu, Hongxuan, et al.
Published: (2024)
Rule Encoding and Compliance in Large Language Models: An Information-Theoretic Analysis
by: Diederich, Joachim
Published: (2025)
ArchSIBench: Benchmarking the Architectural Spatial Intelligence of Vision-Language Models
by: Shen, Qirui, et al.
Published: (2026)