| Main Authors: | An, Chenxin, Zhang, Jun, Zhong, Ming, Li, Lei, Gong, Shansan, Luo, Yao, Xu, Jingjing, Kong, Lingpeng |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2410.18745 |
Similar Items
Training-Free Long-Context Scaling of Large Language Models
by: An, Chenxin, et al.
Published: (2024)
GIRAFFE: Design Choices for Extending the Context Length of Visual Language Models
by: Li, Mukai, et al.
Published: (2024)
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation
by: Gong, Shansan, et al.
Published: (2025)
Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning
by: Ye, Jiacheng, et al.
Published: (2024)
Scaling Diffusion Language Models via Adaptation from Autoregressive Models
by: Gong, Shansan, et al.
Published: (2024)
BBA: Bi-Modal Behavioral Alignment for Reasoning with Large Vision-Language Models
by: Zhao, Xueliang, et al.
Published: (2024)
Temporal Reasoning Transfer from Text to Video
by: Li, Lei, et al.
Published: (2024)
Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone
by: Ye, Jiacheng, et al.
Published: (2025)
Reasoning Does Not Necessarily Improve Role-Playing Ability
by: Feng, Xiachong, et al.
Published: (2025)
Dream-Coder 7B: An Open Diffusion Language Model for Code
by: Xie, Zhihui, et al.
Published: (2025)
DreamOn: Diffusion Language Models For Code Infilling Beyond Fixed-size Canvas
by: Wu, Zirui, et al.
Published: (2026)
ParallelComp: Parallel Long-Context Compressor for Length Extrapolation
by: Xiong, Jing, et al.
Published: (2025)
Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models
by: Ye, Jiacheng, et al.
Published: (2024)
Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis
by: Zhu, Wenhao, et al.
Published: (2023)
Intrinsic Entropy of Context Length Scaling in LLMs
by: Shi, Jingzhe, et al.
Published: (2025)
GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers
by: Li, Qintong, et al.
Published: (2024)
Haste Makes Waste: Evaluating Planning Abilities of LLMs for Efficient and Feasible Multitasking with Time Constraints Between Actions
by: Wu, Zirui, et al.
Published: (2025)
Understanding the Role of LLMs in Multimodal Evaluation Benchmarks
by: Jiang, Botian, et al.
Published: (2024)
Effective Length Extrapolation via Dimension-Wise Positional Embeddings Manipulation
by: Lu, Yi, et al.
Published: (2025)
Long-Short Alignment for Effective Long-Context Modeling in LLMs
by: Du, Tianqi, et al.
Published: (2025)
Teaching Language Models to Critique via Reinforcement Learning
by: Xie, Zhihui, et al.
Published: (2025)
A Reparameterized Discrete Diffusion Model for Text Generation
by: Zheng, Lin, et al.
Published: (2023)
Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?
by: Kim, Jeonghye, et al.
Published: (2026)
PromptCoT 2.0: Scaling Prompt Synthesis for Large Language Model Reasoning
by: Zhao, Xueliang, et al.
Published: (2025)
Exploring the Reliability of Large Language Models as Customized Evaluators for Diverse NLP Tasks
by: Li, Qintong, et al.
Published: (2023)
Why Does New Knowledge Create Messy Ripple Effects in LLMs?
by: Qin, Jiaxin, et al.
Published: (2024)
Understanding the RoPE Extensions of Long-Context LLMs: An Attention Perspective
by: Zhong, Meizhi, et al.
Published: (2024)
Linguistic Frameworks Go Toe-to-Toe at Neuro-Symbolic Language Modeling
by: Prange, Jakob, et al.
Published: (2021)
Length Controlled Generation for Black-box LLMs
by: Gu, Yuxuan, et al.
Published: (2024)
Activation-aware Probe-Query: Effective Key-Value Retrieval for Long-Context LLMs Inference
by: Xiao, Qingfa, et al.
Published: (2025)
VL-RewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models
by: Li, Lei, et al.
Published: (2024)
Jailbreaking as a Reward Misspecification Problem
by: Xie, Zhihui, et al.
Published: (2024)
Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models
by: Li, Lei, et al.
Published: (2024)
Does RAG Really Perform Bad For Long-Context Processing?
by: Luo, Kun, et al.
Published: (2025)
FACTTRACK: Time-Aware World State Tracking in Story Outlines
by: Lyu, Zhiheng, et al.
Published: (2024)
Self-Infilling Code Generation
by: Zheng, Lin, et al.
Published: (2023)
Scaling Reasoning without Attention
by: Zhao, Xueliang, et al.
Published: (2025)
How Well Do LLMs Handle Cantonese? Benchmarking Cantonese Capabilities of Large Language Models
by: Jiang, Jiyue, et al.
Published: (2024)
Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration
by: Li, Qintong, et al.
Published: (2024)
Where Does Long-Context Supervision Actually Go? Effective-Context Exposure Balancing
by: Zhu, Jinchang, et al.
Published: (2026)