| Main Authors: | Levy, Ido, Paradise, Orr, Carmeli, Boaz, Meir, Ron, Goldwasser, Shafi, Belinkov, Yonatan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.07552 |
Similar Items
Investigating the Development of Task-Oriented Communication in Vision-Language Models
by: Carmeli, Boaz, et al.
Published: (2026)
Concept-Best-Matching: Evaluating Compositionality in Emergent Communication
by: Carmeli, Boaz, et al.
Published: (2024)
CtD: Composition through Decomposition in Emergent Communication
by: Carmeli, Boaz, et al.
Published: (2026)
Semantics and Spatiality of Emergent Communication
by: Zion, Rotem Ben, et al.
Published: (2024)
Will it Merge? On The Causes of Model Mergeability
by: Rahamim, Adir, et al.
Published: (2026)
SAEs Are Good for Steering -- If You Select the Right Features
by: Arad, Dana, et al.
Published: (2025)
Planted in Pretraining, Swayed by Finetuning: A Case Study on the Origins of Cognitive Biases in LLMs
by: Itzhak, Itay, et al.
Published: (2025)
Models That Prove Their Own Correctness
by: Amit, Noga, et al.
Published: (2024)
Backward Lens: Projecting Language Model Gradients into the Vocabulary Space
by: Katz, Shahar, et al.
Published: (2024)
From Feelings to Metrics: Understanding and Formalizing How Users Vibe-Test LLMs
by: Itzhak, Itay, et al.
Published: (2026)
Answer, Assemble, Ace: Understanding How LMs Answer Multiple Choice Questions
by: Wiegreffe, Sarah, et al.
Published: (2024)
DeLeaker: Dynamic Inference-Time Reweighting For Semantic Leakage Mitigation in Text-to-Image Models
by: Ventura, Mor, et al.
Published: (2025)
On Non-interactive Evaluation of Animal Communication Translators
by: Paradise, Orr, et al.
Published: (2025)
Learning Randomized Reductions
by: Erata, Ferhat, et al.
Published: (2024)
LLM-Human Pipeline for Cultural Context Grounding of Conversations
by: Pujari, Rajkumar, et al.
Published: (2024)
Instructed to Bias: Instruction-Tuned Language Models Exhibit Emergent Cognitive Bias
by: Itzhak, Itay, et al.
Published: (2023)
Old Habits Die Hard: How Conversational History Geometrically Traps LLMs
by: Simhi, Adi, et al.
Published: (2026)
ParallelPARC: A Scalable Pipeline for Generating Natural-Language Analogies
by: Sultan, Oren, et al.
Published: (2024)
BlackboxNLP-2025 MIB Shared Task: Improving Circuit Faithfulness via Better Edge Selection
by: Nikankin, Yaniv, et al.
Published: (2025)
Emergent Communication Pretraining for Few-Shot Machine Translation
by: Li, Yaoyiran, et al.
Published: (2020)
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
by: Marks, Samuel, et al.
Published: (2024)
Genie: Achieving Human Parity in Content-Grounded Datasets Generation
by: Yehudai, Asaf, et al.
Published: (2024)
Findings of the BlackboxNLP 2025 Shared Task: Localizing Circuits and Causal Variables in Language Models
by: Arad, Dana, et al.
Published: (2025)
CoLa: Learning to Interactively Collaborate with Large Language Models
by: Sharma, Abhishek, et al.
Published: (2025)
Splits! Flexible Sociocultural Linguistic Investigation at Scale
by: Caplan, Eylon, et al.
Published: (2025)
Confidence Regulation Neurons in Language Models
by: Stolfo, Alessandro, et al.
Published: (2024)
EmoGist: Efficient In-Context Learning for Visual Emotion Understanding
by: Seoh, Ronald, et al.
Published: (2025)
Post-hoc Study of Climate Microtargeting on Social Media Ads with LLMs: Thematic Insights and Fairness Evaluation
by: Islam, Tunazzina, et al.
Published: (2024)
Generating Benchmarks for Factuality Evaluation of Language Models
by: Muhlgay, Dor, et al.
Published: (2023)
A Cryptographic Perspective on Mitigation vs. Detection in Machine Learning
by: Gluch, Greg, et al.
Published: (2025)
Unsupervised Representation Learning - an Invariant Risk Minimization Perspective
by: Norman, Yotam, et al.
Published: (2025)
Can LLMs Assist Annotators in Identifying Morality Frames? -- Case Study on Vaccination Debate on Social Media
by: Islam, Tunazzina, et al.
Published: (2025)
Discovering Latent Themes in Social Media Messaging: A Machine-in-the-Loop Approach Integrating LLMs
by: Islam, Tunazzina, et al.
Published: (2024)
Uncovering Latent Arguments in Social Media Messaging by Employing LLMs-in-the-Loop Strategy
by: Islam, Tunazzina, et al.
Published: (2024)
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations
by: Orgad, Hadas, et al.
Published: (2024)
Large Language Models Generate Harmful Content Using a Distinct, Unified Mechanism
by: Orgad, Hadas, et al.
Published: (2026)
VideoAgent: Long-form Video Understanding with Large Language Model as Agent
by: Wang, Xiaohan, et al.
Published: (2024)
ContraSim -- Analyzing Neural Representations Based on Contrastive Learning
by: Rahamim, Adir, et al.
Published: (2023)
Bridging the Gap in Bangla Healthcare: Machine Learning Based Disease Prediction Using a Symptoms-Disease Dataset
by: Zannat, Rowzatul, et al.
Published: (2026)
Truth is Universal: Robust Detection of Lies in LLMs
by: Bürger, Lennart, et al.
Published: (2024)