| Main Authors: | Xiong, Jing, Chen, Qiujiang, Ye, Fanghua, Wan, Zhongwei, Zheng, Chuanyang, Zhao, Chenyang, Shen, Hui, Li, Hanbo, Tao, Chaofan, Tan, Haochen, Bai, Haoli, Shang, Lifeng, Kong, Lingpeng, Wong, Ngai |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.15148 |
Similar Items
ParallelComp: Parallel Long-Context Compressor for Length Extrapolation
by: Xiong, Jing, et al.
Published: (2025)
OVD: On-policy Verbal Distillation
by: Xiong, Jing, et al.
Published: (2026)
UNComp: Can Matrix Entropy Uncover Sparsity? -- A Compressor Design from an Uncertainty-Aware Perspective
by: Xiong, Jing, et al.
Published: (2024)
UncertaintyRAG: Span-Level Uncertainty Enhanced Long-Context Modeling for Retrieval-Augmented Generation
by: Li, Zixuan, et al.
Published: (2024)
MMFormalizer: Multimodal Autoformalization in the Wild
by: Xiong, Jing, et al.
Published: (2026)
CodeComp: Structural KV Cache Compression for Agentic Coding
by: Chen, Qiujiang, et al.
Published: (2026)
SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving
by: Xu, Wendong, et al.
Published: (2025)
Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone
by: Ye, Jiacheng, et al.
Published: (2025)
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies
by: Tao, Chaofan, et al.
Published: (2024)
DoPE: Denoising Rotary Position Embedding
by: Xiong, Jing, et al.
Published: (2025)
GRPO-VPS: Enhancing Group Relative Policy Optimization with Verifiable Process Supervision for Effective Reasoning
by: Wang, Jingyi, et al.
Published: (2026)
REAgent: Requirement-Driven LLM Agents for Software Issue Resolution
by: Kuang, Shiqi, et al.
Published: (2026)
LongEmotion: Measuring Emotional Intelligence of Large Language Models in Long-Context Interaction
by: Liu, Weichu, et al.
Published: (2025)
MMSearch-Plus: Benchmarking Provenance-Aware Search for Multimodal Browsing Agents
by: Tao, Xijia, et al.
Published: (2025)
From Verifiable Dot to Reward Chain: Harnessing Verifiable Reference-based Rewards for Reinforcement Learning of Open-ended Generation
by: Jiang, Yuxin, et al.
Published: (2026)
Enhancing Test-Time Scaling of Large Language Models with Hierarchical Retrieval-Augmented MCTS
by: Dou, Alex ZH, et al.
Published: (2025)
Autoregressive Models in Vision: A Survey
by: Xiong, Jing, et al.
Published: (2024)
Scaling Reasoning without Attention
by: Zhao, Xueliang, et al.
Published: (2025)
CktFormalizer: Autoformalization of Natural Language into Circuit Representations
by: Xiong, Jing, et al.
Published: (2026)
DSDR: Dual-Scale Diversity Regularization for Exploration in LLM Reasoning
by: Wan, Zhongwei, et al.
Published: (2026)
Gradually Excavating External Knowledge for Implicit Complex Question Answering
by: Liu, Chang, et al.
Published: (2026)
MEIT: Multimodal Electrocardiogram Instruction Tuning on Large Language Models for Report Generation
by: Wan, Zhongwei, et al.
Published: (2024)
Faster and Better LLMs via Latency-Aware Test-Time Scaling
by: Wang, Zili, et al.
Published: (2025)
Rethinking Kullback-Leibler Divergence in Knowledge Distillation for Large Language Models
by: Wu, Taiqiang, et al.
Published: (2024)
SWE-Lego: Pushing the Limits of Supervised Fine-tuning for Software Issue Resolving
by: Tao, Chaofan, et al.
Published: (2026)
The Synergy Dilemma of Long-CoT SFT and RL: Investigating Post-Training Techniques for Reasoning VLMs
by: Chen, Jierun, et al.
Published: (2025)
Multi-GPU MBE(3)-OSV-MP2 for Performant Large-Scale ab initio Calculations
by: Liang, Qiujiang, et al.
Published: (2026)
PEFT-as-an-Attack! Jailbreaking Language Models during Federated Parameter-Efficient Fine-Tuning
by: Li, Shenghui, et al.
Published: (2024)
DeepDiver: Adaptive Search Intensity Scaling via Open-Web Reinforcement Learning
by: Shi, Wenxuan, et al.
Published: (2025)
AnchorTP: Resilient LLM Inference with State-Preserving Elastic Tensor Parallelism
by: Xu, Wendong, et al.
Published: (2025)
Entropy Centroids as Intrinsic Rewards for Test-Time Scaling
by: Zhao, Wenshuo, et al.
Published: (2026)
Any-to-any Speaker Attribute Perturbation for Asynchronous Voice Anonymization
by: Chen, Liping, et al.
Published: (2025)
G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks
by: Wan, Zhongwei, et al.
Published: (2022)
PromptCoT 2.0: Scaling Prompt Synthesis for Large Language Model Reasoning
by: Zhao, Xueliang, et al.
Published: (2025)
Reasoning Does Not Necessarily Improve Role-Playing Ability
by: Feng, Xiachong, et al.
Published: (2025)
Linguistic Frameworks Go Toe-to-Toe at Neuro-Symbolic Language Modeling
by: Prange, Jakob, et al.
Published: (2021)
The Linear Attention Resurrection in Vision Transformer
by: Zheng, Chuanyang
Published: (2025)
iFormer: Integrating ConvNet and Transformer for Mobile Application
by: Zheng, Chuanyang
Published: (2025)
Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement
by: Liu, Che, et al.
Published: (2024)
PhyX: Does Your Model Have the "Wits" for Physical Reasoning?
by: Shen, Hui, et al.
Published: (2025)