| Main Authors: | Kahana, Adar, Mathew, Jaya Susan, Bleik, Said, Reynolds, Jeremy, Elisha, Oren |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.01065 |
Similar Items
Emotions Where Art Thou: Understanding and Characterizing the Emotional Latent Space of Large Language Models
by: Reichman, Benjamin, et al.
Published: (2025)
Automatic Question & Answer Generation Using Generative Large Language Model (LLM)
by: Ehsan, Md. Alvee, et al.
Published: (2025)
Multilingual Medical Reasoning for Question Answering with Large Language Models
by: Ferrazzi, Pietro, et al.
Published: (2025)
Evaluating Monolingual and Multilingual Large Language Models for Greek Question Answering: The DemosQA Benchmark
by: Mastrokostas, Charalampos, et al.
Published: (2026)
Enhancing Large Language Model Performance To Answer Questions and Extract Information More Accurately
by: Zhang, Liang, et al.
Published: (2024)
From Answers to Questions: EQGBench for Evaluating LLMs' Educational Question Generation
by: Zhou, Chengliang, et al.
Published: (2025)
Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs
by: Ovadia, Oded, et al.
Published: (2023)
Explicit Diversity Conditions for Effective Question Answer Generation with Large Language Models
by: Yadav, Vikas, et al.
Published: (2024)
TimelineKGQA: A Comprehensive Question-Answer Pair Generator for Temporal Knowledge Graphs
by: Sun, Qiang, et al.
Published: (2025)
Multilingual Non-Factoid Question Answering with Answer Paragraph Selection
by: Mishra, Ritwik, et al.
Published: (2024)
The Challenge of Achieving Attributability in Multilingual Table-to-Text Generation with Question-Answer Blueprints
by: Haussmann, Aden
Published: (2025)
Can Large Language Models Make the Grade? An Empirical Study Evaluating LLMs Ability to Mark Short Answer Questions in K-12 Education
by: Henkel, Owen, et al.
Published: (2024)
Finding Answers in Thought Matters: Revisiting Evaluation on Large Language Models with Reasoning
by: Jo, Hwiyeol, et al.
Published: (2025)
Multiple-Choice Question Generation Using Large Language Models: Methodology and Educator Insights
by: Biancini, Giorgio, et al.
Published: (2025)
Is It Good Data for Multilingual Instruction Tuning or Just Bad Multilingual Evaluation for Large Language Models?
by: Chen, Pinzhen, et al.
Published: (2024)
Task-Centric Acceleration of Small-Language Models
by: Tsur, Dor, et al.
Published: (2026)
MEQA: A Meta-Evaluation Framework for Question & Answer LLM Benchmarks
by: Veuthey, Jaime Raldua, et al.
Published: (2025)
QGen Studio: An Adaptive Question-Answer Generation, Training and Evaluation Platform
by: Moses, Movina, et al.
Published: (2025)
Tourism Question Answer System in Indian Language using Domain-Adapted Foundation Models
by: Gatla, Praveen, et al.
Published: (2025)
DataAgent: Evaluating Large Language Models' Ability to Answer Zero-Shot, Natural Language Queries
by: Mishra, Manit, et al.
Published: (2024)
Evaluating the Elementary Multilingual Capabilities of Large Language Models with MultiQ
by: Holtermann, Carolin, et al.
Published: (2024)
Follow-Up Questions Improve Documents Generated by Large Language Models
by: Tix, Bernadette J
Published: (2024)
An Empirical Evaluation of Large Language Models on Consumer Health Questions
by: Abrar, Moaiz, et al.
Published: (2024)
Multilingual State Space Models for Structured Question Answering in Indic Languages
by: Vats, Arpita, et al.
Published: (2025)
How do you know that? Teaching Generative Language Models to Reference Answers to Biomedical Questions
by: Bašaragin, Bojana, et al.
Published: (2024)
Answer, Assemble, Ace: Understanding How LMs Answer Multiple Choice Questions
by: Wiegreffe, Sarah, et al.
Published: (2024)
Evaluating Large Language Models for Detecting Antisemitism
by: Patel, Jay, et al.
Published: (2025)
Facts Do Care About Your Language: Assessing Answer Quality of Multilingual LLMs
by: Kansal, Yuval, et al.
Published: (2025)
Not Your Typical Sycophant: The Elusive Nature of Sycophancy in Large Language Models
by: Natan, Shahar Ben, et al.
Published: (2026)
Beyond Questions: Evaluating What Large Language Models (Actually) Know
by: Giordano, Luca, et al.
Published: (2026)
SAS-Bench: A Fine-Grained Benchmark for Evaluating Short Answer Scoring with Large Language Models
by: Lai, Peichao, et al.
Published: (2025)
POLYCHARTQA: Benchmarking Large Vision-Language Models with Multilingual Chart Question Answering
by: Xu, Yichen, et al.
Published: (2025)
No Answer Needed: Predicting LLM Answer Accuracy from Question-Only Linear Probes
by: Cencerrado, Iván Vicente Moreno, et al.
Published: (2025)
The Roles of English in Evaluating Multilingual Language Models
by: Poelman, Wessel, et al.
Published: (2024)
Evaluating the Limits of Large Language Models in Multilingual Legal Reasoning
by: Ioannou, Antreas, et al.
Published: (2025)
Multilingual Collaborative Defense for Large Language Models
by: Li, Hongliang, et al.
Published: (2025)
When Answers Stray from Questions: Hallucination Detection via Question-Answer Orthogonal Decomposition
by: Yao, Siyang, et al.
Published: (2026)
MoZIP: A Multilingual Benchmark to Evaluate Large Language Models in Intellectual Property
by: Ni, Shiwen, et al.
Published: (2024)
The Heap: A Contamination-Free Multilingual Code Dataset for Evaluating Large Language Models
by: Katzy, Jonathan, et al.
Published: (2025)
Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models
by: Chen, Yuyan, et al.
Published: (2024)