| Main Authors: | Feng, Dawei, Zhang, Yihai, Xu, Zhixuan |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2405.09857 |
Similar Items
Small Vocabularies, Big Gains: Pretraining and Tokenization in Time Series Models
by: Roger, Alexis, et al.
Published: (2025)
Token Signature: Predicting Chain-of-Thought Gains with Token Decoding Feature in Large Language Models
by: Liu, Peijie, et al.
Published: (2025)
Reflection Pretraining Enables Token-Level Self-Correction in Biological Sequence Models
by: Zhang, Xiang, et al.
Published: (2025)
Adaptive Computation Pruning for the Forgetting Transformer
by: Lin, Zhixuan, et al.
Published: (2025)
IKnow: Instruction-Knowledge-Aware Continual Pretraining for Effective Domain Adaptation
by: Zhang, Tianyi, et al.
Published: (2025)
ixi-GEN: Efficient Industrial sLLMs through Domain Adaptive Continual Pretraining
by: Kim, Seonwu, et al.
Published: (2025)
Incorporating Domain Knowledge into Materials Tokenization
by: Oh, Yerim, et al.
Published: (2025)
FAMMA: A Benchmark for Financial Domain Multilingual Multimodal Question Answering
by: Xue, Siqiao, et al.
Published: (2024)
Optimizing Pretraining Data Mixtures with LLM-Estimated Utility
by: Held, William, et al.
Published: (2025)
Cornerstones or Stumbling Blocks? Deciphering the Rock Tokens in On-Policy Distillation
by: Jiang, Yuxuan, et al.
Published: (2026)
LLM-Oriented Token-Adaptive Knowledge Distillation
by: Xie, Xurong, et al.
Published: (2025)
Explainable Token-level Noise Filtering for LLM Fine-tuning Datasets
by: Yang, Yuchen, et al.
Published: (2026)
Next Token Knowledge Tracing: Exploiting Pretrained LLM Representations to Decode Student Behaviour
by: Norris, Max, et al.
Published: (2025)
Better RAG using Relevant Information Gain
by: Pickett, Marc, et al.
Published: (2024)
WRAP++: Web discoveRy Amplified Pretraining
by: Zhou, Jiang, et al.
Published: (2026)
Token-level Direct Preference Optimization
by: Zeng, Yongcheng, et al.
Published: (2024)
FASTopic: Pretrained Transformer is a Fast, Adaptive, Stable, and Transferable Topic Model
by: Wu, Xiaobao, et al.
Published: (2024)
Exploring the Benefits of Domain-Pretraining of Generative Large Language Models for Chemistry
by: Acharya, Anurag, et al.
Published: (2024)
Unleashing Diverse Thinking Modes in LLMs through Multi-Agent Collaboration
by: He, Zhixuan, et al.
Published: (2025)
The Tokenization Bottleneck: How Vocabulary Extension Improves Chemistry Representation Learning in Pretrained Language Models
by: Kalamkar, Prathamesh, et al.
Published: (2025)
Information Gain-Guided Causal Intervention for Autonomous Debiasing Large Language Models
by: Sun, Zhouhao, et al.
Published: (2025)
Adaptive Token Biaser: Knowledge Editing via Biasing Key Entities
by: Bi, Baolong, et al.
Published: (2024)
SelfBudgeter: Adaptive Token Allocation for Efficient LLM Reasoning
by: Li, Zheng, et al.
Published: (2025)
MathPile: A Billion-Token-Scale Pretraining Corpus for Math
by: Wang, Zengzhi, et al.
Published: (2023)
Generating Pretraining Tokens from Organic Data for Data-Bound Scaling
by: Yu, Zichun, et al.
Published: (2026)
Backdoor Token Unlearning: Exposing and Defending Backdoors in Pretrained Language Models
by: Jiang, Peihai, et al.
Published: (2025)
AlignDistil: Token-Level Language Model Alignment as Adaptive Policy Distillation
by: Zhang, Songming, et al.
Published: (2025)
DLLMQuant: Quantizing Diffusion-based Large Language Models
by: Xu, Chen, et al.
Published: (2025)
APLe: Token-Wise Adaptive for Multi-Modal Prompt Learning
by: Cao, Guiming, et al.
Published: (2024)
Less is More for RAG: Information Gain Pruning for Generator-Aligned Reranking and Evidence Selection
by: Song, Zhipeng, et al.
Published: (2026)
KnowledgeGain: Evaluating and Optimizing Science News Generation for Reader Learning
by: Soós, Dominik, et al.
Published: (2026)
TokenRatio: Principled Token-Level Preference Optimization via Ratio Matching
by: Nguyen, Truong, et al.
Published: (2026)
KL3M Tokenizers: A Family of Domain-Specific and Character-Level Tokenizers for Legal, Financial, and Preprocessing Applications
by: Bommarito, Michael J, et al.
Published: (2025)
MambaQuant: Quantizing the Mamba Family with Variance Aligned Rotation Methods
by: Xu, Zukang, et al.
Published: (2025)
Alignment-Enhanced Decoding: Defending via Token-Level Adaptive Refining of Probability Distributions
by: Liu, Quan, et al.
Published: (2024)
Training LLMs Beyond Next Token Prediction -- Filling the Mutual Information Gap
by: Yang, Chun-Hao, et al.
Published: (2025)
Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn Search Agents
by: Wang, Guoqing, et al.
Published: (2025)
DPO Meets PPO: Reinforced Token Optimization for RLHF
by: Zhong, Han, et al.
Published: (2024)
CodePMP: Scalable Preference Model Pretraining for Large Language Model Reasoning
by: Yu, Huimu, et al.
Published: (2024)
Adaptive Token Boundaries: Integrating Human Chunking Mechanisms into Multimodal LLMs
by: Yu, Dongxing
Published: (2025)