:: Library Catalog

Bibliographic Details
Main Authors:	Jacovi, Alon; Bitton, Yonatan; Bohnet, Bernd; Herzig, Jonathan; Honovich, Or; Tseng, Michael; Collins, Michael; Aharoni, Roee; Geva, Mor
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2402.00559

Similar Items

Narrowing the Knowledge Evaluation Gap: Open-Domain Question Answering with Multi-Granularity Answers
by: Yona, Gal, et al.
Published: (2024)

Can Large Language Models Faithfully Express Their Intrinsic Uncertainty in Words?
by: Yona, Gal, et al.
Published: (2024)

Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs
by: Gekhman, Zorik, et al.
Published: (2026)

Keep Guessing? When Considering Inference Scaling, Mind the Baselines
by: Yona, Gal, et al.
Published: (2024)

DRAGged into Conflicts: Detecting and Addressing Conflicting Sources in Search-Augmented LLMs
by: Cattan, Arie, et al.
Published: (2025)

CoverBench: A Challenging Benchmark for Complex Claim Verification
by: Jacovi, Alon, et al.
Published: (2024)

Marketing the Librarian: The Weakest Link in the Chain.
by: Kies, Cosette
Published: (1989)

Weakest Link in the Chain: Security Vulnerabilities in Advanced Reasoning Models
by: Krishna, Arjun, et al.
Published: (2025)

FinChain: A Symbolic Benchmark for Verifiable Chain-of-Thought Financial Reasoning
by: Xie, Zhuohan, et al.
Published: (2025)

DoubleDipper: Improving Long-Context LLMs via Context Recycling
by: Cattan, Arie, et al.
Published: (2024)

Verifying Chain-of-Thought Reasoning via Its Computational Graph
by: Zhao, Zheng, et al.
Published: (2025)

On Learning Verifiers and Implications to Chain-of-Thought Reasoning
by: Balcan, Maria-Florina, et al.
Published: (2025)

Measuring Chain of Thought Faithfulness by Unlearning Reasoning Steps
by: Tutek, Martin, et al.
Published: (2025)

Evaluating Chain-of-Thought Reasoning through Reusability and Verifiability
by: Aggarwal, Shashank, et al.
Published: (2026)

Backward Lens: Projecting Language Model Gradients into the Vocabulary Space
by: Katz, Shahar, et al.
Published: (2024)

Accelerating the Global Aggregation of Local Explanations
by: Mor, Alon, et al.
Published: (2023)

MM-Verify: Enhancing Multimodal Reasoning with Chain-of-Thought Verification
by: Sun, Linzhuang, et al.
Published: (2025)

NL-Eye: Abductive NLI for Images
by: Ventura, Mor, et al.
Published: (2024)

Universal Jailbreak Suffixes Are Strong Attention Hijackers
by: Ben-Tov, Matan, et al.
Published: (2025)

Multilingual Instruction Tuning With Just a Pinch of Multilinguality
by: Shaham, Uri, et al.
Published: (2024)

mFACE: Multilingual Summarization with Factual Consistency Evaluation
by: Aharoni, Roee, et al.
Published: (2022)

Representation Surgery: Theory and Practice of Affine Steering
by: Singh, Shashwat, et al.
Published: (2024)

EgoCoT-Bench: Benchmarking Grounded and Verifiable Operation-Centric Chain of Thought Reasoning for MLLMs
by: Dai, Yang, et al.
Published: (2026)

Typed Chain-of-Thought: A Curry-Howard Framework for Verifying LLM Reasoning
by: Perrier, Elija
Published: (2025)

Latent Reasoning with Supervised Thinking States
by: Amos, Ido, et al.
Published: (2026)

Editor's Choice: Evaluating Abstract Intent in Image Editing through Atomic Entity Analysis
by: Ventura, Mor, et al.
Published: (2026)

TACT: Advancing Complex Aggregative Reasoning with Information Extraction Tools
by: Caciularu, Avi, et al.
Published: (2024)

Compositional Chain-of-Thought Prompting for Large Multimodal Models
by: Mitra, Chancharik, et al.
Published: (2023)

How Well Can Reasoning Models Identify and Recover from Unhelpful Thoughts?
by: Yang, Sohee, et al.
Published: (2025)

EditInspector: A Benchmark for Evaluation of Text-Guided Image Edits
by: Yosef, Ron, et al.
Published: (2025)

Honest Students from Untrusted Teachers: Learning an Interpretable Question-Answering Pipeline from a Pretrained Language Model
by: Eisenstein, Jacob, et al.
Published: (2022)

CoRGI: Verified Chain-of-Thought Reasoning with Post-hoc Visual Grounding
by: Yi, Shixin, et al.
Published: (2025)

Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?
by: Gekhman, Zorik, et al.
Published: (2024)

Estimating Knowledge in Large Language Models Without Generating a Single Token
by: Gottesman, Daniela, et al.
Published: (2024)

Inferring Functionality of Attention Heads from their Parameters
by: Elhelo, Amit, et al.
Published: (2024)

The Weakest Link: Library Catalogs.
by: Young, Terrence E., Jr.
Published: (2002)

Generating Verifiable Chain of Thoughts from Execution-Traces
by: Thakur, Shailja, et al.
Published: (2025)

Fractured Chain-of-Thought Reasoning
by: Liao, Baohao, et al.
Published: (2025)

GeoChain: Multimodal Chain-of-Thought for Geographic Reasoning
by: Yerramilli, Sahiti, et al.
Published: (2025)

LongCoT: Benchmarking Long-Horizon Chain-of-Thought Reasoning
by: Motwani, Sumeet Ramesh, et al.
Published: (2026)