| Main Authors: | Guo, Yiduo, Fu, Jie, Zhang, Huishuai, Zhao, Dongyan, Shen, Yikang |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2406.14833 |
Similar Items
Synthetic Data RL: Task Definition Is All You Need
by: Guo, Yiduo, et al.
Published: (2025)
Evidence-Enhanced Triplet Generation Framework for Hallucination Alleviation in Generative Question Answering
by: Du, Haowei, et al.
Published: (2024)
PPTC-R benchmark: Towards Evaluating the Robustness of Large Language Models for PowerPoint Task Completion
by: Zhang, Zekai, et al.
Published: (2024)
Latent Preference Coding: Aligning Large Language Models via Discrete Latent Codes
by: Gong, Zhuocheng, et al.
Published: (2025)
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training
by: Du, Wenyu, et al.
Published: (2024)
ReasVQA: Advancing VideoQA with Imperfect Reasoning Process
by: Liang, Jianxin, et al.
Published: (2025)
ReMamba: Equip Mamba with Effective Long-Sequence Modeling
by: Yuan, Danlong, et al.
Published: (2024)
VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interaction Format
by: Wang, Yueqian, et al.
Published: (2024)
Beyond Isolated Facts: Synthesizing Narrative and Grounded Supervision for VideoQA
by: Liang, Jianxin, et al.
Published: (2025)
MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning
by: Wang, Yueqian, et al.
Published: (2025)
AIDBench: A benchmark for evaluating the authorship identification capability of large language models
by: Wen, Zichen, et al.
Published: (2024)
Shorten After You're Right: Lazy Length Penalties for Reasoning RL
by: Yuan, Danlong, et al.
Published: (2025)
Mixture-of-Modules: Reinventing Transformers as Dynamic Assemblies of Modules
by: Gong, Zhuocheng, et al.
Published: (2024)
Structured Code Representations Enable Data-Efficient Adaptation of Code Language Models
by: Agarwal, Mayank, et al.
Published: (2024)
xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token
by: Cheng, Xin, et al.
Published: (2024)
Bridging the Fairness Gap: Enhancing Pre-trained Models with LLM-Generated Sentences
by: Yu, Liu, et al.
Published: (2025)
Towards Effective and Efficient Continual Pre-training of Large Language Models
by: Chen, Jie, et al.
Published: (2024)
De-Anonymization at Scale via Tournament-Style Attribution
by: Zhang, Lirui, et al.
Published: (2026)
Efficient Continual Pre-training of LLMs for Low-resource Languages
by: Nag, Arijit, et al.
Published: (2024)
Scaling Agents via Continual Pre-training
by: Su, Liangcai, et al.
Published: (2025)
Efficient Continual Pre-training for Building Domain Specific Large Language Models
by: Xie, Yong, et al.
Published: (2023)
GeoBuildBench: A Benchmark for Interactive and Executable Geometry Construction from Natural Language
by: Kim, Jinwoong, et al.
Published: (2026)
The Gaps between Pre-train and Downstream Settings in Bias Evaluation and Debiasing
by: Kaneko, Masahiro, et al.
Published: (2024)
Can Continual Pre-training Bridge the Performance Gap between General-purpose and Specialized Language Models in the Medical Domain?
by: Doll, Niclas, et al.
Published: (2026)
Revealing the Learning Dynamics of Long-Context Continual Pre-training
by: Liang, Yupu, et al.
Published: (2026)
Octo-planner: On-device Language Model for Planner-Action Agents
by: Chen, Wei, et al.
Published: (2024)
Internal and External Knowledge Interactive Refinement Framework for Knowledge-Intensive Question Answering
by: Du, Haowei, et al.
Published: (2024)
In-Context Learning with Reinforcement Learning for Incomplete Utterance Rewriting
by: Du, Haowei, et al.
Published: (2024)
Do LLMs "Feel"? Emotion Circuits Discovery and Control
by: Wang, Chenxi, et al.
Published: (2025)
Projective Methods for Mitigating Gender Bias in Pre-trained Language Models
by: Dawkins, Hillary, et al.
Published: (2024)
Text to Band Gap: Pre-trained Language Models as Encoders for Semiconductor Band Gap Prediction
by: Yeh, Ying-Ting, et al.
Published: (2025)
Bag of Lies: Robustness in Continuous Pre-training BERT
by: Gevers, Ine, et al.
Published: (2024)
JetMoE: Reaching Llama2 Performance with 0.1M Dollars
by: Shen, Yikang, et al.
Published: (2024)
Unveiling the Deficiencies of Pre-trained Text-and-Layout Models in Real-world Visually-rich Document Information Extraction
by: Zhang, Chong, et al.
Published: (2024)
The Devil is in the Neurons: Interpreting and Mitigating Social Biases in Pre-trained Language Models
by: Liu, Yan, et al.
Published: (2024)
Evaluating Discourse Cohesion in Pre-trained Language Models
by: He, Jie, et al.
Published: (2025)
Synthesize-on-Graph: Knowledgeable Synthetic Data Generation for Continue Pre-training of Large Language Models
by: Ma, Shengjie, et al.
Published: (2025)
Improving Continual Pre-training Through Seamless Data Packing
by: Yin, Ruicheng, et al.
Published: (2025)
Gated Linear Attention Transformers with Hardware-Efficient Training
by: Yang, Songlin, et al.
Published: (2023)
SongSage: A Large Musical Language Model with Lyric Generative Pre-training
by: Guo, Jiani, et al.
Published: (2026)