Library Catalog

Bibliographic Details
Main Authors:	Gan, Yidong; Rybinski, Maciej; Hachey, Ben; Kummerfeld, Jonathan K.
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2412.18043

Similar Items

Simple and Effective Baselines for Code Summarisation Evaluation
by: Robinson, Jade, et al.
Published: (2025)

An Empirical Analysis of Static Analysis Methods for Detection and Mitigation of Code Library Hallucinations
by: Miranda-Pena, Clarissa, et al.
Published: (2026)

MultiADE: A Multi-domain Benchmark for Adverse Drug Event Extraction
by: Dai, Xiang, et al.
Published: (2024)

Using a Human-AI Teaming Approach to Create and Curate Scientific Datasets with the SCILIRE System
by: Bölücü, Necva, et al.
Published: (2026)

Your Students Don't Use LLMs Like You Wish They Did
by: Kobler, Sebastian, et al.
Published: (2026)

Personalized Help for Optimizing Low-Skilled Users' Strategy
by: Gu, Feng, et al.
Published: (2024)

SQLucid: Grounding Natural Language Database Queries with Interactive Explanations
by: Tian, Yuan, et al.
Published: (2024)

Aligning Stuttered-Speech Research with End-User Needs: Scoping Review, Survey, and Guidelines
by: Toyin, Hawau Olamide, et al.
Published: (2026)

Enabling Doctor-Centric Medical AI with LLMs through Workflow-Aligned Tasks and Benchmarks
by: Xie, Wenya, et al.
Published: (2025)

A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity
by: Lee, Andrew, et al.
Published: (2024)

Should I Trust You? Detecting Deception in Negotiations using Counterfactual RL
by: Wongkamjan, Wichayaporn, et al.
Published: (2025)

Do Text-to-Vis Benchmarks Test Real Use of Visualisations?
by: Nguyen, Hy, et al.
Published: (2024)

Care from a Cognitive Perspective
by: Hachey, Alyse C.
Published: (2012)

Code Review Without Borders: Evaluating Synthetic vs. Real Data for Review Recommendation
by: Cohen, Yogev, et al.
Published: (2025)

Enhancing LLM Medical Coding with Structured External Knowledge
by: Gan, Yidong, et al.
Published: (2026)

Exploring Self-Identified Counseling Expertise in Online Support Forums
by: Lahnala, Allison, et al.
Published: (2021)

Interactive Text-to-SQL Generation via Editable Step-by-Step Explanations
by: Tian, Yuan, et al.
Published: (2023)

Deep Literature Survey Automation with an Iterative Workflow
by: Zhang, Hongbo, et al.
Published: (2025)

AI-Resilient Interfaces
by: Glassman, Elena L., et al.
Published: (2024)

Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows
by: Xu, Wanghan, et al.
Published: (2025)

A Critical Evaluation of AI Feedback for Aligning Large Language Models
by: Sharma, Archit, et al.
Published: (2024)

More Victories, Less Cooperation: Assessing Cicero's Diplomacy Play
by: Wongkamjan, Wichayaporn, et al.
Published: (2024)

A Critical Review of the Need for Knowledge-Centric Evaluation of Quranic Recitation
by: Al-Kharusi, Mohammed Hilal, et al.
Published: (2025)

TrustDataFilter: Leveraging Trusted Knowledge Base Data for More Effective Filtering of Unknown Information
by: Zhang, Jinghong, et al.
Published: (2025)

FlowCompile: An Optimizing Compiler for Structured LLM Workflows
by: Li, Junyan, et al.
Published: (2026)

Aligning Language Models with Clinical Expertise: DPO for Heart Failure Nursing Documentation in Critical Care
by: Fan, Junyi, et al.
Published: (2025)

Aligning Netlist to Source Code using SynAlign
by: Garg, Sakshi, et al.
Published: (2025)

Coding-Free and Privacy-Preserving Agentic Framework for Data-Driven Clinical Research
by: Kim, Taehun, et al.
Published: (2026)

AI Coding Agents Need Better Compiler Remarks
by: Deo, Akash, et al.
Published: (2026)

Causal Graph Discovery with Retrieval-Augmented Generation based Large Language Models
by: Zhang, Yuzhe, et al.
Published: (2024)

DualAlign: Generating Clinically Grounded Synthetic Data
by: Li, Rumeng, et al.
Published: (2025)

Chord Embeddings: Analyzing What They Capture and Their Role for Next Chord Prediction and Artist Attribute Prediction
by: Lahnala, Allison, et al.
Published: (2021)

Reasoning on Multiple Needles In A Haystack
by: Wang, Yidong
Published: (2025)

Turning Conversations into Workflows: A Framework to Extract and Evaluate Dialog Workflows for Service AI Agents
by: Choubey, Prafulla Kumar, et al.
Published: (2025)

Extracting Accurate Materials Data from Research Papers with Conversational Language Models and Prompt Engineering
by: Polak, Maciej P., et al.
Published: (2023)

Better Aligned with Survey Respondents or Training Data? Unveiling Political Leanings of LLMs on U.S. Supreme Court Cases
by: Xu, Shanshan, et al.
Published: (2025)

Transformer-Based Extraction of Statutory Definitions from the U.S. Code
by: Hosabettu, Arpana, et al.
Published: (2025)

NodeSynth: Socially Aligned Synthetic Data for AI Evaluation
by: Rashid, Qazi Mamunur, et al.
Published: (2026)

DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI
by: Liang, Hao, et al.
Published: (2025)

Evaluating and Aligning CodeLLMs on Human Preference
by: Yang, Jian, et al.
Published: (2024)