| Main Authors: | Goddard, Charles; Neto, Fernando Fernandes |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.06607 |
Similar Items
Domain Adaptation of Llama3-70B-Instruct through Continual Pre-Training and Model Merging: A Comprehensive Evaluation
by: Siriwardhana, Shamane, et al.
Published: (2024)
OrthoRank: Token Selection via Sink Token Orthogonality for Efficient LLM inference
by: Shin, Seungjun, et al.
Published: (2025)
Mid-Think: Training-Free Intermediate-Budget Reasoning via Token-Level Triggers
by: Yang, Wang, et al.
Published: (2026)
Reparameterized LLM Training via Orthogonal Equivalence Transformation
by: Qiu, Zeju, et al.
Published: (2025)
Influential Language Data Selection via Gradient Trajectory Pursuit
by: Deng, Zhiwei, et al.
Published: (2024)
Training-Trajectory-Aware Token Selection
by: Shen, Zhanming, et al.
Published: (2026)
POET-X: Memory-efficient LLM Training by Scaling Orthogonal Transformation
by: Qiu, Zeju, et al.
Published: (2026)
T-FREE: Subword Tokenizer-Free Generative LLMs via Sparse Representations for Memory-Efficient Embeddings
by: Deiseroth, Björn, et al.
Published: (2024)
Lossless Token Sequence Compression via Meta-Tokens
by: Harvill, John, et al.
Published: (2025)
Identifying Intervenable and Interpretable Features via Orthogonality Regularization
by: Miller, Moritz, et al.
Published: (2026)
Training-Free Exponential Context Extension via Cascading KV Cache
by: Willette, Jeffrey, et al.
Published: (2024)
NOVER: Incentive Training for Language Models via Verifier-Free Reinforcement Learning
by: Liu, Wei, et al.
Published: (2025)
Training Large Language Models To Reason In Parallel With Global Forking Tokens
by: Jia, Sheng, et al.
Published: (2025)
Think before you speak: Training Language Models With Pause Tokens
by: Goyal, Sachin, et al.
Published: (2023)
Parameter Efficient Quasi-Orthogonal Fine-Tuning via Givens Rotation
by: Ma, Xinyu, et al.
Published: (2024)
MatryoshkaKV: Adaptive KV Compression via Trainable Orthogonal Projection
by: Lin, Bokai, et al.
Published: (2024)
Arcee's MergeKit: A Toolkit for Merging Large Language Models
by: Goddard, Charles, et al.
Published: (2024)
Cold-Start Personalization via Training-Free Priors from Structured World Models
by: Bose, Avinandan, et al.
Published: (2026)
ManifoldKV: Training-Free KV Cache Compression via Euclidean Outlier Detection
by: Datta, Debajyoti, et al.
Published: (2026)
Beyond Next Token Prediction: Patch-Level Training for Large Language Models
by: Shao, Chenze, et al.
Published: (2024)
Merging in a Bottle: Differentiable Adaptive Merging (DAM) and the Path from Averaging to Automation
by: Gauthier-Caron, Thomas, et al.
Published: (2024)
A General and Efficient Training for Transformer via Token Expansion
by: Huang, Wenxuan, et al.
Published: (2024)
OptRot: Mitigating Weight Outliers via Data-Free Rotations for Post-Training Quantization
by: Gadhikar, Advait, et al.
Published: (2025)
Adapting Language Models via Token Translation
by: Feng, Zhili, et al.
Published: (2024)
Disentangling Task Conflicts in Multi-Task LoRA via Orthogonal Gradient Projection
by: Yang, Ziyu, et al.
Published: (2026)
The Geometries of Truth Are Orthogonal Across Tasks
by: Azizian, Waiss, et al.
Published: (2025)
Orthogonal Finetuning for Direct Preference Optimization
by: Yang, Chenxu, et al.
Published: (2024)
CYCLE-INSTRUCT: Fully Seed-Free Instruction Tuning via Dual Self-Training and Cycle Consistency
by: Shen, Zhanming, et al.
Published: (2025)
TokenButler: Token Importance is Predictable
by: Akhauri, Yash, et al.
Published: (2025)
Token-Level LLM Collaboration via FusionRoute
by: Xiong, Nuoya, et al.
Published: (2026)
When Answers Stray from Questions: Hallucination Detection via Question-Answer Orthogonal Decomposition
by: Yao, Siyang, et al.
Published: (2026)
Think Clearly: Improving Reasoning via Redundant Token Pruning
by: Choi, Daewon, et al.
Published: (2025)
Evaluation of Large Language Models via Coupled Token Generation
by: Benz, Nina Corvelo, et al.
Published: (2025)
Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge
by: Tang, Yao, et al.
Published: (2026)
TokenSelect: Efficient Long-Context Inference and Length Extrapolation for LLMs via Dynamic Token-Level KV Cache Selection
by: Wu, Wei, et al.
Published: (2024)
Emergent Representations of Program Semantics in Language Models Trained on Programs
by: Jin, Charles, et al.
Published: (2023)
SOMP: Scalable Gradient Inversion for Large Language Models via Subspace-Guided Orthogonal Matching Pursuit
by: Li, Yibo, et al.
Published: (2026)
Adversarial Tokenization
by: Geh, Renato Lui, et al.
Published: (2025)
Training-Free Bayesianization for Low-Rank Adapters of Large Language Models
by: Shi, Haizhou, et al.
Published: (2024)
Selective Preference Optimization via Token-Level Reward Function Estimation
by: Yang, Kailai, et al.
Published: (2024)