| Main Authors: | Ein-Dor, Liat, Toledo-Ronen, Orith, Spector, Artem, Gretz, Shai, Dankin, Lena, Halfon, Alon, Katz, Yoav, Slonim, Noam |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2408.04560 |
Similar Items
Stay Tuned: An Empirical Study of the Impact of Hyperparameters on LLM Tuning in Real-World Applications
by: Halfon, Alon, et al.
Published: (2024)
Masked by Consensus: Disentangling Privileged Knowledge in LLM Correctness
by: Ashuach, Tomer, et al.
Published: (2026)
Multi-Domain Explainability of Preferences
by: Calderon, Nitay, et al.
Published: (2025)
Efficient Benchmarking of Language Models
by: Perlitz, Yotam, et al.
Published: (2023)
WildIFEval: Instruction Following in the Wild
by: Lior, Gili, et al.
Published: (2025)
Fine-Grained Detection of Context-Grounded Hallucinations Using LLMs
by: Peisakhovsky, Yehonatan, et al.
Published: (2025)
Label-Efficient Model Selection for Text Generation
by: Ashury-Tahan, Shir, et al.
Published: (2024)
Debatable Intelligence: Benchmarking LLM Judges via Debate Speech Evaluation
by: Sternlicht, Noy, et al.
Published: (2025)
Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models
by: Levy, Mosh, et al.
Published: (2024)
Knowledge Navigator: LLM-guided Browsing Framework for Exploratory Search in Scientific Literature
by: Katz, Uri, et al.
Published: (2024)
Integrating Large Language Models and Reinforcement Learning for Non-Linear Reasoning
by: Alon, Yoav, et al.
Published: (2024)
Transformers for Program Termination
by: Alon, Yoav, et al.
Published: (2026)
Artificial Expert Intelligence through PAC-reasoning
by: Shalev-Shwartz, Shai, et al.
Published: (2024)
NER Retriever: Zero-Shot Named Entity Retrieval with Type-Aware Embeddings
by: Shachar, Or, et al.
Published: (2025)
PromptSuite: A Task-Agnostic Framework for Multi-Prompt Generation
by: Habba, Eliya, et al.
Published: (2025)
Fundamental Limitations of Alignment in Large Language Models
by: Wolf, Yotam, et al.
Published: (2023)
The Branch Not Taken: Predicting Branching in Online Conversations
by: Meital, Shai, et al.
Published: (2024)
InterrogateLLM: Zero-Resource Hallucination Detection in LLM-Generated Answers
by: Yehuda, Yakir, et al.
Published: (2024)
Enhancing Depression Detection via Question-wise Modality Fusion
by: Mandal, Aishik, et al.
Published: (2025)
Talk Less, Interact Better: Evaluating In-context Conversational Adaptation in Multimodal LLMs
by: Hua, Yilun, et al.
Published: (2024)
Conversation Routines: A Prompt Engineering Framework for Task-Oriented Dialog Systems
by: Robino, Giorgio
Published: (2025)
A Proof-Producing Compiler for Blockchain Applications
by: Avigad, Jeremy, et al.
Published: (2025)
Generating Benchmarks for Factuality Evaluation of Language Models
by: Muhlgay, Dor, et al.
Published: (2023)
Tradeoffs Between Alignment and Helpfulness in Language Models with Steering Methods
by: Wolf, Yotam, et al.
Published: (2024)
Compared to What? Baselines and Metrics for Counterfactual Prompting
by: Yang, Zihao, et al.
Published: (2026)
Teaching Models to Improve on Tape
by: Bezalel, Liat, et al.
Published: (2024)
Beyond Memorization: Distinguishing between Reductive and Epistemic Reasoning in LLMs using Classic Logic Puzzles
by: Gabay, Adi, et al.
Published: (2026)
Temporal reasoning for timeline summarisation in social media
by: Song, Jiayu, et al.
Published: (2024)
LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations
by: Gottesman, Daniela, et al.
Published: (2025)
Tailored Emotional LLM-Supporter: Enhancing Cultural Sensitivity
by: Liu, Chen Cecilia, et al.
Published: (2025)
"What's my model inside of?": Exploring the role of environments for grounded natural language understanding
by: Tamari, Ronen
Published: (2024)
Zonkey: A Hierarchical Diffusion Language Model with Differentiable Tokenization and Probabilistic Attention
by: Rozental, Alon
Published: (2026)
Reverse Prompt Engineering
by: Li, Hanqing, et al.
Published: (2024)
FormulaOne: Measuring the Depth of Algorithmic Reasoning Beyond Competitive Programming
by: Beniamini, Gal, et al.
Published: (2025)
Prompt Engineering a Prompt Engineer
by: Ye, Qinyuan, et al.
Published: (2023)
Extracting Accurate Materials Data from Research Papers with Conversational Language Models and Prompt Engineering
by: Polak, Maciej P., et al.
Published: (2023)
Jamba: A Hybrid Transformer-Mamba Language Model
by: Lieber, Opher, et al.
Published: (2024)
Isoperimetric Inequalities Made Simpler
by: Eldan, Ronen, et al.
Published: (2022)
The Geometry of Prompting: Unveiling Distinct Mechanisms of Task Adaptation in Language Models
by: Kirsanov, Artem, et al.
Published: (2025)
Prompt Engineering: How Prompt Vocabulary affects Domain Knowledge
by: Schreiter, Dimitri
Published: (2025)