| Main Authors: | Gan, Yidong, Rybinski, Maciej, Hachey, Ben, Kummerfeld, Jonathan K. |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2412.18043 |
Similar Items
Simple and Effective Baselines for Code Summarisation Evaluation
by: Robinson, Jade, et al.
Published: (2025)
An Empirical Analysis of Static Analysis Methods for Detection and Mitigation of Code Library Hallucinations
by: Miranda-Pena, Clarissa, et al.
Published: (2026)
MultiADE: A Multi-domain Benchmark for Adverse Drug Event Extraction
by: Dai, Xiang, et al.
Published: (2024)
Using a Human-AI Teaming Approach to Create and Curate Scientific Datasets with the SCILIRE System
by: Bölücü, Necva, et al.
Published: (2026)
Your Students Don't Use LLMs Like You Wish They Did
by: Kobler, Sebastian, et al.
Published: (2026)
Personalized Help for Optimizing Low-Skilled Users' Strategy
by: Gu, Feng, et al.
Published: (2024)
SQLucid: Grounding Natural Language Database Queries with Interactive Explanations
by: Tian, Yuan, et al.
Published: (2024)
Aligning Stuttered-Speech Research with End-User Needs: Scoping Review, Survey, and Guidelines
by: Toyin, Hawau Olamide, et al.
Published: (2026)
Enabling Doctor-Centric Medical AI with LLMs through Workflow-Aligned Tasks and Benchmarks
by: Xie, Wenya, et al.
Published: (2025)
A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity
by: Lee, Andrew, et al.
Published: (2024)
Should I Trust You? Detecting Deception in Negotiations using Counterfactual RL
by: Wongkamjan, Wichayaporn, et al.
Published: (2025)
Do Text-to-Vis Benchmarks Test Real Use of Visualisations?
by: Nguyen, Hy, et al.
Published: (2024)
Care from a Cognitive Perspective
by: Hachey, Alyse C.
Published: (2012)
Code Review Without Borders: Evaluating Synthetic vs. Real Data for Review Recommendation
by: Cohen, Yogev, et al.
Published: (2025)
Enhancing LLM Medical Coding with Structured External Knowledge
by: Gan, Yidong, et al.
Published: (2026)
Exploring Self-Identified Counseling Expertise in Online Support Forums
by: Lahnala, Allison, et al.
Published: (2021)
Interactive Text-to-SQL Generation via Editable Step-by-Step Explanations
by: Tian, Yuan, et al.
Published: (2023)
Deep Literature Survey Automation with an Iterative Workflow
by: Zhang, Hongbo, et al.
Published: (2025)
AI-Resilient Interfaces
by: Glassman, Elena L., et al.
Published: (2024)
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows
by: Xu, Wanghan, et al.
Published: (2025)
A Critical Evaluation of AI Feedback for Aligning Large Language Models
by: Sharma, Archit, et al.
Published: (2024)
More Victories, Less Cooperation: Assessing Cicero's Diplomacy Play
by: Wongkamjan, Wichayaporn, et al.
Published: (2024)
A Critical Review of the Need for Knowledge-Centric Evaluation of Quranic Recitation
by: Al-Kharusi, Mohammed Hilal, et al.
Published: (2025)
TrustDataFilter: Leveraging Trusted Knowledge Base Data for More Effective Filtering of Unknown Information
by: Zhang, Jinghong, et al.
Published: (2025)
FlowCompile: An Optimizing Compiler for Structured LLM Workflows
by: Li, Junyan, et al.
Published: (2026)
Aligning Language Models with Clinical Expertise: DPO for Heart Failure Nursing Documentation in Critical Care
by: Fan, Junyi, et al.
Published: (2025)
Aligning Netlist to Source Code using SynAlign
by: Garg, Sakshi, et al.
Published: (2025)
Coding-Free and Privacy-Preserving Agentic Framework for Data-Driven Clinical Research
by: Kim, Taehun, et al.
Published: (2026)
AI Coding Agents Need Better Compiler Remarks
by: Deo, Akash, et al.
Published: (2026)
Causal Graph Discovery with Retrieval-Augmented Generation based Large Language Models
by: Zhang, Yuzhe, et al.
Published: (2024)
DualAlign: Generating Clinically Grounded Synthetic Data
by: Li, Rumeng, et al.
Published: (2025)
Chord Embeddings: Analyzing What They Capture and Their Role for Next Chord Prediction and Artist Attribute Prediction
by: Lahnala, Allison, et al.
Published: (2021)
Reasoning on Multiple Needles In A Haystack
by: Wang, Yidong
Published: (2025)
Turning Conversations into Workflows: A Framework to Extract and Evaluate Dialog Workflows for Service AI Agents
by: Choubey, Prafulla Kumar, et al.
Published: (2025)
Extracting Accurate Materials Data from Research Papers with Conversational Language Models and Prompt Engineering
by: Polak, Maciej P., et al.
Published: (2023)
Better Aligned with Survey Respondents or Training Data? Unveiling Political Leanings of LLMs on U.S. Supreme Court Cases
by: Xu, Shanshan, et al.
Published: (2025)
Transformer-Based Extraction of Statutory Definitions from the U.S. Code
by: Hosabettu, Arpana, et al.
Published: (2025)
NodeSynth: Socially Aligned Synthetic Data for AI Evaluation
by: Rashid, Qazi Mamunur, et al.
Published: (2026)
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI
by: Liang, Hao, et al.
Published: (2025)
Evaluating and Aligning CodeLLMs on Human Preference
by: Yang, Jian, et al.
Published: (2024)