Library Catalog

Bibliographic Details
Main Authors:	Guo, Yiduo; Fu, Jie; Zhang, Huishuai; Zhao, Dongyan; Shen, Yikang
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2406.14833

Similar Items

Synthetic Data RL: Task Definition Is All You Need
by: Guo, Yiduo, et al.
Published: (2025)

Evidence-Enhanced Triplet Generation Framework for Hallucination Alleviation in Generative Question Answering
by: Du, Haowei, et al.
Published: (2024)

PPTC-R benchmark: Towards Evaluating the Robustness of Large Language Models for PowerPoint Task Completion
by: Zhang, Zekai, et al.
Published: (2024)

Latent Preference Coding: Aligning Large Language Models via Discrete Latent Codes
by: Gong, Zhuocheng, et al.
Published: (2025)

Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training
by: Du, Wenyu, et al.
Published: (2024)

ReasVQA: Advancing VideoQA with Imperfect Reasoning Process
by: Liang, Jianxin, et al.
Published: (2025)

ReMamba: Equip Mamba with Effective Long-Sequence Modeling
by: Yuan, Danlong, et al.
Published: (2024)

VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interaction Format
by: Wang, Yueqian, et al.
Published: (2024)

Beyond Isolated Facts: Synthesizing Narrative and Grounded Supervision for VideoQA
by: Liang, Jianxin, et al.
Published: (2025)

MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning
by: Wang, Yueqian, et al.
Published: (2025)

AIDBench: A benchmark for evaluating the authorship identification capability of large language models
by: Wen, Zichen, et al.
Published: (2024)

Shorten After You're Right: Lazy Length Penalties for Reasoning RL
by: Yuan, Danlong, et al.
Published: (2025)

Mixture-of-Modules: Reinventing Transformers as Dynamic Assemblies of Modules
by: Gong, Zhuocheng, et al.
Published: (2024)

Structured Code Representations Enable Data-Efficient Adaptation of Code Language Models
by: Agarwal, Mayank, et al.
Published: (2024)

xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token
by: Cheng, Xin, et al.
Published: (2024)

Bridging the Fairness Gap: Enhancing Pre-trained Models with LLM-Generated Sentences
by: Yu, Liu, et al.
Published: (2025)

Towards Effective and Efficient Continual Pre-training of Large Language Models
by: Chen, Jie, et al.
Published: (2024)

De-Anonymization at Scale via Tournament-Style Attribution
by: Zhang, Lirui, et al.
Published: (2026)

Efficient Continual Pre-training of LLMs for Low-resource Languages
by: Nag, Arijit, et al.
Published: (2024)

Scaling Agents via Continual Pre-training
by: Su, Liangcai, et al.
Published: (2025)

Efficient Continual Pre-training for Building Domain Specific Large Language Models
by: Xie, Yong, et al.
Published: (2023)

GeoBuildBench: A Benchmark for Interactive and Executable Geometry Construction from Natural Language
by: Kim, Jinwoong, et al.
Published: (2026)

The Gaps between Pre-train and Downstream Settings in Bias Evaluation and Debiasing
by: Kaneko, Masahiro, et al.
Published: (2024)

Can Continual Pre-training Bridge the Performance Gap between General-purpose and Specialized Language Models in the Medical Domain?
by: Doll, Niclas, et al.
Published: (2026)

Revealing the Learning Dynamics of Long-Context Continual Pre-training
by: Liang, Yupu, et al.
Published: (2026)

Octo-planner: On-device Language Model for Planner-Action Agents
by: Chen, Wei, et al.
Published: (2024)

Internal and External Knowledge Interactive Refinement Framework for Knowledge-Intensive Question Answering
by: Du, Haowei, et al.
Published: (2024)

In-Context Learning with Reinforcement Learning for Incomplete Utterance Rewriting
by: Du, Haowei, et al.
Published: (2024)

Do LLMs "Feel"? Emotion Circuits Discovery and Control
by: Wang, Chenxi, et al.
Published: (2025)

Projective Methods for Mitigating Gender Bias in Pre-trained Language Models
by: Dawkins, Hillary, et al.
Published: (2024)

Text to Band Gap: Pre-trained Language Models as Encoders for Semiconductor Band Gap Prediction
by: Yeh, Ying-Ting, et al.
Published: (2025)

Bag of Lies: Robustness in Continuous Pre-training BERT
by: Gevers, Ine, et al.
Published: (2024)

JetMoE: Reaching Llama2 Performance with 0.1M Dollars
by: Shen, Yikang, et al.
Published: (2024)

Unveiling the Deficiencies of Pre-trained Text-and-Layout Models in Real-world Visually-rich Document Information Extraction
by: Zhang, Chong, et al.
Published: (2024)

The Devil is in the Neurons: Interpreting and Mitigating Social Biases in Pre-trained Language Models
by: Liu, Yan, et al.
Published: (2024)

Evaluating Discourse Cohesion in Pre-trained Language Models
by: He, Jie, et al.
Published: (2025)

Synthesize-on-Graph: Knowledgeable Synthetic Data Generation for Continue Pre-training of Large Language Models
by: Ma, Shengjie, et al.
Published: (2025)

Improving Continual Pre-training Through Seamless Data Packing
by: Yin, Ruicheng, et al.
Published: (2025)

Gated Linear Attention Transformers with Hardware-Efficient Training
by: Yang, Songlin, et al.
Published: (2023)

SongSage: A Large Musical Language Model with Lyric Generative Pre-training
by: Guo, Jiani, et al.
Published: (2026)