| Main Authors: | Thakkar, Megh, Fournier, Quentin, Riemer, Matthew, Chen, Pin-Yu, Zouaq, Amal, Das, Payel, Chandar, Sarath |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2411.06824 |
Similar Items
A Deep Dive into the Trade-Offs of Parameter-Efficient Preference Alignment Techniques
by: Thakkar, Megh, et al.
Published: (2024)
LLMs Can't Play Hangman: On the Necessity of a Private Working Memory for Language Agents
by: Baldelli, Davide, et al.
Published: (2026)
Probabilistic Calibration Is a Trainable Capability in Language Models
by: Baldelli, Davide, et al.
Published: (2026)
Fundamental Safety-Capability Trade-offs in Fine-tuning Large Language Models
by: Chen, Pin-Yu, et al.
Published: (2025)
Exploring Quantization for Efficient Pre-Training of Transformer Language Models
by: Chitsaz, Kamran, et al.
Published: (2024)
Too Big to Fool: Resisting Deception in Language Models
by: Samsami, Mohammad Reza, et al.
Published: (2024)
Ontology-Constrained Generation of Domain-Specific Clinical Summaries
by: Mehenni, Gaya, et al.
Published: (2024)
CADmium: Fine-Tuning Code Language Models for Text-Driven Sequential CAD Design
by: Govindarajan, Prashant, et al.
Published: (2025)
CoPeP: Benchmarking Continual Pretraining for Protein Language Models
by: Patil, Darshan, et al.
Published: (2026)
Manifold Metric: A Loss Landscape Approach for Predicting Model Performance
by: Malviya, Pranshu, et al.
Published: (2024)
Dialectics of Alignment: Harnessing Unsafe Knowledge for Dynamic Safety Routing
by: Hashemzadeh, Maryam, et al.
Published: (2026)
NovoMolGen: Rethinking Molecular Language Model Pretraining
by: Chitsaz, Kamran, et al.
Published: (2025)
NeoBERT: A Next-Generation BERT
by: Breton, Lola Le, et al.
Published: (2025)
Revisiting Replay and Gradient Alignment for Continual Pre-Training of Large Language Models
by: Abbes, Istabrak, et al.
Published: (2025)
Small Encoders Can Rival Large Decoders in Detecting Groundedness
by: Abbes, Istabrak, et al.
Published: (2025)
SEAL: Safety-enhanced Aligned LLM Fine-tuning via Bilevel Data Selection
by: Shen, Han, et al.
Published: (2024)
What is the Best Process Model Representation? A Comparative Analysis for Process Modeling with Large Language Models
by: Brissard, Alexis, et al.
Published: (2025)
Towards Practical Tool Usage for Continually Learning LLMs
by: Huang, Jerry, et al.
Published: (2024)
Reducing Hallucinations in Language Model-based SPARQL Query Generation Using Post-Generation Memory Retrieval
by: Sharma, Aditya, et al.
Published: (2025)
FRASE: Structured Representations for Generalizable SPARQL Query Generation
by: Diallo, Papa Abdou Karim Karou, et al.
Published: (2025)
Enhancing Frame Detection with Retrieval Augmented Generation
by: Diallo, Papa Abdou Karim Karou, et al.
Published: (2025)
Faithfulness Measurable Masked Language Models
by: Madsen, Andreas, et al.
Published: (2023)
Are self-explanations from Large Language Models faithful?
by: Madsen, Andreas, et al.
Published: (2024)
The Unintended Trade-off of AI Alignment: Balancing Hallucination Mitigation and Safety in LLMs
by: Mahmoud, Omar, et al.
Published: (2025)
Patching LLM Like Software: A Lightweight Method for Improving Safety Policy in Large Language Models
by: Arif, Huzaifa, et al.
Published: (2025)
Position: Theory of Mind Benchmarks are Broken for Large Language Models
by: Riemer, Matthew, et al.
Published: (2024)
A Comprehensive Evaluation of Neural SPARQL Query Generation from Natural Language Questions
by: Diallo, Papa Abdou Karim Karou, et al.
Published: (2023)
Reconstruction or Semantics? What Makes a Latent Space Useful for Robotic World Models
by: Nilaksh, et al.
Published: (2026)
NeuroFaith: Evaluating LLM Self-Explanation Faithfulness via Internal Representation Alignment
by: Bhan, Milan, et al.
Published: (2025)
WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks
by: Boisvert, Léo, et al.
Published: (2024)
Neural Coherence: Find higher performance to out-of-distribution tasks from few samples
by: Guiroy, Simon, et al.
Published: (2025)
Effect of Document Packing on the Latent Multi-Hop Reasoning Capabilities of Large Language Models
by: Prato, Gabriele, et al.
Published: (2025)
The Expressive Limits of Diagonal SSMs for State-Tracking
by: Shakerinava, Mehran, et al.
Published: (2026)
Intelligent Switching for Reset-Free RL
by: Patil, Darshan, et al.
Published: (2024)
Context-Aware Assistant Selection for Improved Inference Acceleration with Large Language Models
by: Huang, Jerry, et al.
Published: (2024)
Interpretability Needs a New Paradigm
by: Madsen, Andreas, et al.
Published: (2024)
Why Don't Prompt-Based Fairness Metrics Correlate?
by: Zayed, Abdelrahman, et al.
Published: (2024)
Should We Attend More or Less? Modulating Attention for Fairness
by: Zayed, Abdelrahman, et al.
Published: (2023)
Lookbehind-SAM: k steps back, 1 step forward
by: Mordido, Gonçalo, et al.
Published: (2023)
PEEL the Layers and Find Yourself: Revisiting Inference-time Data Leakage for Residual Neural Networks
by: Arif, Huzaifa, et al.
Published: (2025)