| Main Authors: | Dietz, Florian; Klakow, Dietrich |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2501.00684 |
Similar Items
Block-Operations: Using Modular Routing to Improve Compositional Generalization
by: Dietz, Florian, et al.
Published: (2024)
Comgra: A Tool for Analyzing and Debugging Neural Networks
by: Dietz, Florian, et al.
Published: (2024)
Decoding Partial Differential Equations: Cross-Modal Adaptation of Decoder-only Models to PDEs
by: García-de-Herreros, Paloma, et al.
Published: (2025)
Accept or Deny? Evaluating LLM Fairness and Performance in Loan Approval across Table-to-Text Serialization Approaches
by: Azime, Israel Abebe, et al.
Published: (2025)
Multi-Task GRPO: Reliable LLM Reasoning Across Tasks
by: Ramesh, Shyam Sundhar, et al.
Published: (2026)
From Task Solving to Robust Real-World Adaptation in LLM Agents
by: Pezeshkpour, Pouya, et al.
Published: (2026)
Language Models Do Hard Arithmetic Tasks Easily and Hardly Do Easy Arithmetic Tasks
by: Gambardella, Andrew, et al.
Published: (2024)
Transformers for molecular property prediction: Domain adaptation efficiently improves performance
by: Sultan, Afnan, et al.
Published: (2025)
Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic
by: He, Yifei, et al.
Published: (2024)
Layer-Aware Task Arithmetic: Disentangling Task-Specific and Instruction-Following Knowledge
by: Chen, Yan-Lun, et al.
Published: (2025)
Fast KVzip: Efficient and Accurate LLM Inference with Gated KV Eviction
by: Kim, Jang-Hyun, et al.
Published: (2026)
What explains the success of cross-modal fine-tuning with ORCA?
by: García-de-Herreros, Paloma, et al.
Published: (2024)
Explicitly Encoding Structural Symmetry is Key to Length Generalization in Arithmetic Tasks
by: Sabbaghi, Mahdi, et al.
Published: (2024)
Reliability-Aware Adaptive Self-Consistency for Efficient Sampling in LLM Reasoning
by: Kim, Junseok, et al.
Published: (2026)
Investigating Task Arithmetic for Zero-Shot Information Retrieval
by: Braga, Marco, et al.
Published: (2025)
Integrating Arithmetic Learning Improves Mathematical Reasoning in Smaller Models
by: Gangwar, Neeraj, et al.
Published: (2025)
OccamLLM: Fast and Exact Language Model Arithmetic in a Single Step
by: Dugan, Owen, et al.
Published: (2024)
Position Coupling: Improving Length Generalization of Arithmetic Transformers Using Task Structure
by: Cho, Hanseul, et al.
Published: (2024)
Solving the Inverse Alignment Problem for Efficient RLHF
by: Krishna, Shambhavi, et al.
Published: (2024)
Gated Linear Attention Transformers with Hardware-Efficient Training
by: Yang, Songlin, et al.
Published: (2023)
Exploring the Effectiveness and Consistency of Task Selection in Intermediate-Task Transfer Learning
by: Lin, Pin-Jie, et al.
Published: (2024)
Utilizing Multimodal Data for Edge Case Robust Call-sign Recognition and Understanding
by: Blatt, Alexander, et al.
Published: (2024)
Efficient Annotator Reliability Assessment with EffiARA
by: Cook, Owen, et al.
Published: (2025)
Steering Language Models with Weight Arithmetic
by: Fierro, Constanza, et al.
Published: (2025)
Language Models are Symbolic Learners in Arithmetic
by: Deng, Chunyuan, et al.
Published: (2024)
PARSE: LLM Driven Schema Optimization for Reliable Entity Extraction
by: Shrimal, Anubhav, et al.
Published: (2025)
From Parameters to Data: A Task-Parameter-Guided Fine-Tuning Pipeline for Efficient LLM Alignment
by: Chen, Hao, et al.
Published: (2026)
Anticipate & Act : Integrating LLMs and Classical Planning for Efficient Task Execution in Household Environments
by: Arora, Raghav, et al.
Published: (2025)
Learn-to-learn on Arbitrary Textual Conditioning: A Hypernetwork-Driven Meta-Gated LLM
by: Ji, Luo, et al.
Published: (2026)
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
by: De, Soham, et al.
Published: (2024)
TELL-TALE: Task Efficient LLMs with Task Aware Layer Elimination
by: Naim, Omar, et al.
Published: (2025)
When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoning
by: Singhi, Nishad, et al.
Published: (2025)
Disentangling Language Roles in Multilingual LLM Task Execution
by: Zhan, Qishi, et al.
Published: (2026)
Max It or Miss It: Benchmarking LLM On Solving Extremal Problems
by: Gao, Binxin, et al.
Published: (2025)
Gated Associative Memory: A Parallel O(N) Architecture for Efficient Sequence Modeling
by: Acharya, Rishiraj
Published: (2025)
PGF-Net: A Progressive Gated-Fusion Framework for Efficient Multimodal Sentiment Analysis
by: Wen, Bin, et al.
Published: (2025)
DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization
by: She, Shuaijie, et al.
Published: (2025)
$\textbf{AGT$^{AO}$}$: Robust and Stabilized LLM Unlearning via Adversarial Gating Training with Adaptive Orthogonality
by: Li, Pengyu, et al.
Published: (2026)
From Interpolation to Extrapolation: Complete Length Generalization for Arithmetic Transformers
by: Duan, Shaoxiong, et al.
Published: (2023)
Unraveling Arithmetic in Large Language Models: The Role of Algebraic Structures
by: Chang, Fu-Chieh, et al.
Published: (2024)