| Main Authors: | Koshorek, Omri, Granot, Niv, Alloni, Aviv, Admati, Shahar, Hendel, Roee, Weiss, Ido, Arazi, Alan, Cohen, Shay-Nitzan, Belinkov, Yonatan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.08505 |
Similar Items
Differentiable Faithfulness Alignment for Cross-Model Circuit Transfer
by: Shao, Shun, et al.
Published: (2026)
Backward Lens: Projecting Language Model Gradients into the Vocabulary Space
by: Katz, Shahar, et al.
Published: (2024)
Old Habits Die Hard: How Conversational History Geometrically Traps LLMs
by: Simhi, Adi, et al.
Published: (2026)
ContraSim -- Analyzing Neural Representations Based on Contrastive Learning
by: Rahamim, Adir, et al.
Published: (2023)
Narrowing the Knowledge Evaluation Gap: Open-Domain Question Answering with Multi-Granularity Answers
by: Yona, Gal, et al.
Published: (2024)
Unsupervised Translation of Emergent Communication
by: Levy, Ido, et al.
Published: (2025)
Answer, Assemble, Ace: Understanding How LMs Answer Multiple Choice Questions
by: Wiegreffe, Sarah, et al.
Published: (2024)
Back Attention: Understanding and Enhancing Multi-Hop Reasoning in Large Language Models
by: Yu, Zeping, et al.
Published: (2025)
Concept-Best-Matching: Evaluating Compositionality in Emergent Communication
by: Carmeli, Boaz, et al.
Published: (2024)
Improving LLM Reliability with RAG in Religious Question-Answering: MufassirQAS
by: Alan, Ahmet Yusuf, et al.
Published: (2024)
Are formal and functional linguistic mechanisms dissociated in language models?
by: Hanna, Michael, et al.
Published: (2025)
REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space
by: Ashuach, Tomer, et al.
Published: (2024)
SAEs Are Good for Steering -- If You Select the Right Features
by: Arad, Dana, et al.
Published: (2025)
Leveraging Prototypical Representations for Mitigating Social Bias without Demographic Information
by: Iskander, Shadi, et al.
Published: (2024)
Planted in Pretraining, Swayed by Finetuning: A Case Study on the Origins of Cognitive Biases in LLMs
by: Itzhak, Itay, et al.
Published: (2025)
Have Faith in Faithfulness: Going Beyond Circuit Overlap When Finding Model Mechanisms
by: Hanna, Michael, et al.
Published: (2024)
Padding Tone: A Mechanistic Analysis of Padding Tokens in T2I Models
by: Toker, Michael, et al.
Published: (2025)
QA-Noun: Representing Nominal Semantics via Natural Language Question-Answer Pairs
by: Tseytlin, Maria, et al.
Published: (2025)
From RAG to Agentic RAG for Faithful Islamic Question Answering
by: Bhatia, Gagan, et al.
Published: (2026)
DEPTH: Discourse Education through Pre-Training Hierarchically
by: Bamberger, Zachary, et al.
Published: (2024)
ReFACT: Updating Text-to-Image Models by Editing the Text Encoder
by: Arad, Dana, et al.
Published: (2023)
MultiCube-RAG for Multi-hop Question Answering
by: Shi, Jimeng, et al.
Published: (2026)
Fast Forwarding Low-Rank Training
by: Rahamim, Adir, et al.
Published: (2024)
Distinguishing Ignorance from Error in LLM Hallucinations
by: Simhi, Adi, et al.
Published: (2024)
Constructing Benchmarks and Interventions for Combating Hallucinations in LLMs
by: Simhi, Adi, et al.
Published: (2024)
Measuring Chain of Thought Faithfulness by Unlearning Reasoning Steps
by: Tutek, Martin, et al.
Published: (2025)
Consensus or Conflict? Fine-Grained Evaluation of Conflicting Answers in Question-Answering
by: Nachshoni, Eviatar, et al.
Published: (2025)
SemRAG: Semantic Knowledge-Augmented RAG for Improved Question-Answering
by: Zhong, Kezhen, et al.
Published: (2025)
From Feelings to Metrics: Understanding and Formalizing How Users Vibe-Test LLMs
by: Itzhak, Itay, et al.
Published: (2026)
Augmenting Question Answering with A Hybrid RAG Approach
by: Yang, Tianyi, et al.
Published: (2026)
MedCoT-RAG: Causal Chain-of-Thought RAG for Medical Question Answering
by: Wang, Ziyu, et al.
Published: (2025)
Accelerating the Global Aggregation of Local Explanations
by: Mor, Alon, et al.
Published: (2023)
Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics
by: Nikankin, Yaniv, et al.
Published: (2024)
Same Task, Different Circuits: Disentangling Modality-Specific Mechanisms in VLMs
by: Nikankin, Yaniv, et al.
Published: (2025)
Chronological Passage Assembling in RAG framework for Temporal Question Answering
by: Kim, Byeongjeong, et al.
Published: (2025)
OCC-RAG: Optimal Cognitive Core for Faithful Question Answering
by: Savkin, Maksim, et al.
Published: (2026)
Silent Tokens, Loud Effects: Padding in LLMs
by: Himelstein, Rom, et al.
Published: (2025)
Reading Between the Timelines: RAG for Answering Diachronic Questions
by: Lau, Kwun Hang, et al.
Published: (2025)
TabSTAR: A Tabular Foundation Model for Tabular Data with Text Fields
by: Arazi, Alan, et al.
Published: (2025)
Context-aware Prompt Tuning: Advancing In-Context Learning with Adversarial Methods
by: Blau, Tsachi, et al.
Published: (2024)