Bibliographic Details
Main Authors:	Veerbeek, Joris; Diakopoulos, Nicholas
Format:	Preprint
Published:	2024
Subjects:	Artificial Intelligence; Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2409.07286

Similar Items

Simulating Policy Impacts: Developing a Generative Scenario Writing Method to Evaluate the Perceived Effects of Regulation
by: Barnett, Julia, et al.
Published: (2024)

Evaluating the Capabilities of LLMs for Supporting Anticipatory Impact Assessment
by: Allaham, Mowafak, et al.
Published: (2024)

Clinically Grounded Agent-based Report Evaluation: An Interpretable Metric for Radiology Report Generation
by: Dua, Radhika, et al.
Published: (2025)

Clarify, Abstain or Answer? Strategising in Conversation with Belief-Augmented Generation
by: Baan, Joris, et al.
Published: (2026)

Generating Individual Travel Diaries Using Large Language Models Informed by Census and Land-Use Data
by: Amin, Sepehr Golrokh, et al.
Published: (2025)

Agent Instructs Large Language Models to be General Zero-Shot Reasoners
by: Crispino, Nicholas, et al.
Published: (2023)

DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback
by: Khan, Zaid, et al.
Published: (2024)

Eliciting Language Model Behaviors with Investigator Agents
by: Li, Xiang Lisa, et al.
Published: (2025)

CALICO: Conversational Agent Localization via Synthetic Data Generation
by: Rosenbaum, Andy, et al.
Published: (2024)

Matrix: Peer-to-Peer Multi-Agent Synthetic Data Generation Framework
by: Wang, Dong, et al.
Published: (2025)

Towards Leveraging News Media to Support Impact Assessment of AI Technologies
by: Allaham, Mowafak, et al.
Published: (2024)

Not Wrong, But Untrue: LLM Overconfidence in Document-Based Queries
by: Hagar, Nick, et al.
Published: (2025)

SERPENT-VLM : Self-Refining Radiology Report Generation Using Vision Language Models
by: Kapadnis, Manav Nitin, et al.
Published: (2024)

Encoding Agent Trajectories as Representations with Sequence Transformers
by: Tsiligkaridis, Athanasios, et al.
Published: (2024)

Investigating Data Contamination for Pre-training Language Models
by: Jiang, Minhao, et al.
Published: (2024)

AgentRM: Enhancing Agent Generalization with Reward Modeling
by: Xia, Yu, et al.
Published: (2025)

DataSciBench: An LLM Agent Benchmark for Data Science
by: Zhang, Dan, et al.
Published: (2025)

Creating emoji lexica from unsupervised sentiment analysis of their descriptions
by: Fernández-Gavilanes, Milagros, et al.
Published: (2024)

APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay
by: Prabhakar, Akshara, et al.
Published: (2025)

CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets
by: Yuan, Lifan, et al.
Published: (2023)

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
by: Zhang, Jianguo, et al.
Published: (2024)

On Creating an English-Thai Code-switched Machine Translation in Medical Domain
by: Pengpun, Parinthapat, et al.
Published: (2024)

AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories
by: Lù, Xing Han, et al.
Published: (2025)

Training a Generally Curious Agent
by: Tajwar, Fahim, et al.
Published: (2025)

Socially Aware Synthetic Data Generation for Suicidal Ideation Detection Using Large Language Models
by: Ghanadian, Hamideh, et al.
Published: (2024)

AgentKernelArena: Generalization-Aware Benchmarking of GPU Kernel Optimization Agents
by: Younesian, Sharareh, et al.
Published: (2026)

KwaiAgents: Generalized Information-seeking Agent System with Large Language Models
by: Pan, Haojie, et al.
Published: (2023)

ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery
by: Chen, Ziru, et al.
Published: (2024)

BPO: Staying Close to the Behavior LLM Creates Better Online LLM Alignment
by: Xu, Wenda, et al.
Published: (2024)

Retrieval-augmented GUI Agents with Generative Guidelines
by: Xu, Ran, et al.
Published: (2025)

AgentInstruct: Toward Generative Teaching with Agentic Flows
by: Mitra, Arindam, et al.
Published: (2024)

DigiData: Training and Evaluating General-Purpose Mobile Control Agents
by: Sun, Yuxuan, et al.
Published: (2025)

Towards Automated Safety Requirements Derivation Using Agent-based RAG
by: Balu, Balahari Vignesh, et al.
Published: (2025)

Tucano: Advancing Neural Text Generation for Portuguese
by: Corrêa, Nicholas Kluge, et al.
Published: (2024)

SafeArena: Evaluating the Safety of Autonomous Web Agents
by: Tur, Ada Defne, et al.
Published: (2025)

Laying the Foundation First? Investigating the Generalization from Atomic Skills to Complex Reasoning Tasks
by: Huang, Yuncheng, et al.
Published: (2024)

Investigating Bias: A Multilingual Pipeline for Generating, Solving, and Evaluating Math Problems with LLMs
by: Mahran, Mariam, et al.
Published: (2025)

Fine-Tuning LLMs for Report Summarization: Analysis on Supervised and Unsupervised Data
by: Rallapalli, Swati, et al.
Published: (2025)

Anomaly Detection of Tabular Data Using LLMs
by: Li, Aodong, et al.
Published: (2024)

AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation
by: Hu, Mengkang, et al.
Published: (2024)