| Main Authors: | Pungitore, Sarah; Yadav, Shashank; Subbian, Vignesh |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2503.19265 |
Similar Items
Lightweight Language Models are Prone to Reasoning Errors for Complex Computational Phenotyping Tasks
by: Pungitore, Sarah, et al.
Published: (2025)
SHREC: A Framework for Advancing Next-Generation Computational Phenotyping with Large Language Models
by: Pungitore, Sarah, et al.
Published: (2025)
Failure Modes of Time Series Interpretability Algorithms for Critical Care Applications and Potential Solutions
by: Yadav, Shashank, et al.
Published: (2025)
FairLogue: Evaluating Intersectional Fairness across Clinical Machine Learning Use Cases using the All of Us Research Program
by: Souligne, Nick, et al.
Published: (2026)
Evaluating Deep Unlearning in Large Language Models
by: Wu, Ruihan, et al.
Published: (2024)
A Hybrid Framework with Large Language Models for Rare Disease Phenotyping
by: Wu, Jinge, et al.
Published: (2024)
MalAlgoQA: Pedagogical Evaluation of Counterfactual Reasoning in Large Language Models and Implications for AI in Education
by: Liu, Naiming, et al.
Published: (2024)
Pedagogical Alignment of Large Language Models
by: Sonkar, Shashank, et al.
Published: (2024)
Classifying Unreliable Narrators with Large Language Models
by: Brei, Anneliese, et al.
Published: (2025)
FairLogue: A Toolkit for Intersectional Fairness Analysis in Clinical Machine Learning Models
by: Souligne, Nick, et al.
Published: (2026)
INTERACT: Enabling Interactive, Question-Driven Learning in Large Language Models
by: Kendapadi, Aum, et al.
Published: (2024)
Iterative Learning of Computable Phenotypes for Treatment Resistant Hypertension using Large Language Models
by: Aldeia, Guilherme Seidyo Imai, et al.
Published: (2025)
GT2Vec: Large Language Models as Multi-Modal Encoders for Text and Graph-Structured Data
by: Lin, Jiacheng, et al.
Published: (2024)
An Interpretable Ensemble of Graph and Language Models for Improving Search Relevance in E-Commerce
by: Choudhary, Nurendra, et al.
Published: (2024)
Identifying and Extracting Rare Disease Phenotypes with Large Language Models
by: Shyr, Cathy, et al.
Published: (2023)
GP-GPT: Large Language Model for Gene-Phenotype Mapping
by: Lyu, Yanjun, et al.
Published: (2024)
SocialGaze: Improving the Integration of Human Social Norms in Large Language Models
by: Vijjini, Anvesh Rao, et al.
Published: (2024)
The Point of No Return: Counterfactual Localization of Deceptive Commitment in Language-Model Reasoning
by: Merrill, Scott, et al.
Published: (2026)
Discovery of Generalizable TBI Phenotypes Using Multivariate Time-Series Clustering
by: Ghaderi, Hamid, et al.
Published: (2024)
CLEAR-3K: Assessing Causal Explanatory Capabilities in Language Models
by: Liu, Naiming, et al.
Published: (2025)
GFLean: An Autoformalisation Framework for Lean via GF
by: Pathak, Shashank
Published: (2024)
Evaluating Large Language Models on Computer Science University Exams in Data Structures
by: Gabay, Edan, et al.
Published: (2026)
CAPE: Context-Aware Personality Evaluation Framework for Large Language Models
by: Sandhan, Jivnesh, et al.
Published: (2025)
Evaluating Large Language Models on Rare Disease Diagnosis: A Case Study using House M.D
by: Gupta, Arsh, et al.
Published: (2025)
Are Large Language Model-based Evaluators the Solution to Scaling Up Multilingual Evaluation?
by: Hada, Rishav, et al.
Published: (2023)
MalruleLib: Large-Scale Executable Misconception Reasoning with Step Traces for Modeling Student Thinking in Mathematics
by: Chen, Xinghe, et al.
Published: (2026)
Taxonomy-based CheckList for Large Language Model Evaluation
by: Zhang, Damin
Published: (2023)
Can Foundational Large Language Models Assist with Conducting Pharmaceuticals Manufacturing Investigations?
by: Salami, Hossein, et al.
Published: (2024)
Bengali Text Classification: An Evaluation of Large Language Model Approaches
by: Hoque, Md Mahmudul, et al.
Published: (2026)
Memory-based Language Models: An Efficient, Explainable, and Eco-friendly Approach to Large Language Modeling
by: Bosch, Antal van den, et al.
Published: (2025)
Computational Reasoning of Large Language Models
by: Wu, Haitao, et al.
Published: (2025)
BAS: A Decision-Theoretic Approach to Evaluating Large Language Model Confidence
by: Wu, Sean, et al.
Published: (2026)
GraphArena: Evaluating and Exploring Large Language Models on Graph Computation
by: Tang, Jianheng, et al.
Published: (2024)
MHSafeEval: Role-Aware Interaction-Level Evaluation of Mental Health Safety in Large Language Models
by: Lee, Suhyun, et al.
Published: (2026)
Eka-Eval: An Evaluation Framework for Low-Resource Multilingual Large Language Models
by: Sinha, Samridhi Raj, et al.
Published: (2025)
LNE-Blocking: An Efficient Framework for Contamination Mitigation Evaluation on Large Language Models
by: Hou, Ruijie, et al.
Published: (2025)
UFO: a Unified and Flexible Framework for Evaluating Factuality of Large Language Models
by: Huang, Zhaoheng, et al.
Published: (2024)
FEEL: A Framework for Evaluating Emotional Support Capability with Large Language Models
by: Zhang, Huaiwen, et al.
Published: (2024)
Leveraging Weakly Annotated Data for Hate Speech Detection in Code-Mixed Hinglish: A Feasibility-Driven Transfer Learning Approach with Large Language Models
by: Yadav, Sargam, et al.
Published: (2024)
RECKON: Large-scale Reference-based Efficient Knowledge Evaluation for Large Language Model
by: Zhang, Lin, et al.
Published: (2025)