| Main Authors: | Veerbeek, Joris, Diakopoulos, Nicholas |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2409.07286 |
Similar Items
Simulating Policy Impacts: Developing a Generative Scenario Writing Method to Evaluate the Perceived Effects of Regulation
by: Barnett, Julia, et al.
Published: (2024)
Evaluating the Capabilities of LLMs for Supporting Anticipatory Impact Assessment
by: Allaham, Mowafak, et al.
Published: (2024)
Clinically Grounded Agent-based Report Evaluation: An Interpretable Metric for Radiology Report Generation
by: Dua, Radhika, et al.
Published: (2025)
Clarify, Abstain or Answer? Strategising in Conversation with Belief-Augmented Generation
by: Baan, Joris, et al.
Published: (2026)
Generating Individual Travel Diaries Using Large Language Models Informed by Census and Land-Use Data
by: Amin, Sepehr Golrokh, et al.
Published: (2025)
Agent Instructs Large Language Models to be General Zero-Shot Reasoners
by: Crispino, Nicholas, et al.
Published: (2023)
DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback
by: Khan, Zaid, et al.
Published: (2024)
Eliciting Language Model Behaviors with Investigator Agents
by: Li, Xiang Lisa, et al.
Published: (2025)
CALICO: Conversational Agent Localization via Synthetic Data Generation
by: Rosenbaum, Andy, et al.
Published: (2024)
Matrix: Peer-to-Peer Multi-Agent Synthetic Data Generation Framework
by: Wang, Dong, et al.
Published: (2025)
Towards Leveraging News Media to Support Impact Assessment of AI Technologies
by: Allaham, Mowafak, et al.
Published: (2024)
Not Wrong, But Untrue: LLM Overconfidence in Document-Based Queries
by: Hagar, Nick, et al.
Published: (2025)
SERPENT-VLM : Self-Refining Radiology Report Generation Using Vision Language Models
by: Kapadnis, Manav Nitin, et al.
Published: (2024)
Encoding Agent Trajectories as Representations with Sequence Transformers
by: Tsiligkaridis, Athanasios, et al.
Published: (2024)
Investigating Data Contamination for Pre-training Language Models
by: Jiang, Minhao, et al.
Published: (2024)
AgentRM: Enhancing Agent Generalization with Reward Modeling
by: Xia, Yu, et al.
Published: (2025)
DataSciBench: An LLM Agent Benchmark for Data Science
by: Zhang, Dan, et al.
Published: (2025)
Creating emoji lexica from unsupervised sentiment analysis of their descriptions
by: Fernández-Gavilanes, Milagros, et al.
Published: (2024)
APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay
by: Prabhakar, Akshara, et al.
Published: (2025)
CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets
by: Yuan, Lifan, et al.
Published: (2023)
AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning
by: Zhang, Jianguo, et al.
Published: (2024)
On Creating an English-Thai Code-switched Machine Translation in Medical Domain
by: Pengpun, Parinthapat, et al.
Published: (2024)
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories
by: Lù, Xing Han, et al.
Published: (2025)
Training a Generally Curious Agent
by: Tajwar, Fahim, et al.
Published: (2025)
Socially Aware Synthetic Data Generation for Suicidal Ideation Detection Using Large Language Models
by: Ghanadian, Hamideh, et al.
Published: (2024)
AgentKernelArena: Generalization-Aware Benchmarking of GPU Kernel Optimization Agents
by: Younesian, Sharareh, et al.
Published: (2026)
KwaiAgents: Generalized Information-seeking Agent System with Large Language Models
by: Pan, Haojie, et al.
Published: (2023)
ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery
by: Chen, Ziru, et al.
Published: (2024)
BPO: Staying Close to the Behavior LLM Creates Better Online LLM Alignment
by: Xu, Wenda, et al.
Published: (2024)
Retrieval-augmented GUI Agents with Generative Guidelines
by: Xu, Ran, et al.
Published: (2025)
AgentInstruct: Toward Generative Teaching with Agentic Flows
by: Mitra, Arindam, et al.
Published: (2024)
DigiData: Training and Evaluating General-Purpose Mobile Control Agents
by: Sun, Yuxuan, et al.
Published: (2025)
Towards Automated Safety Requirements Derivation Using Agent-based RAG
by: Balu, Balahari Vignesh, et al.
Published: (2025)
Tucano: Advancing Neural Text Generation for Portuguese
by: Corrêa, Nicholas Kluge, et al.
Published: (2024)
SafeArena: Evaluating the Safety of Autonomous Web Agents
by: Tur, Ada Defne, et al.
Published: (2025)
Laying the Foundation First? Investigating the Generalization from Atomic Skills to Complex Reasoning Tasks
by: Huang, Yuncheng, et al.
Published: (2024)
Investigating Bias: A Multilingual Pipeline for Generating, Solving, and Evaluating Math Problems with LLMs
by: Mahran, Mariam, et al.
Published: (2025)
Fine-Tuning LLMs for Report Summarization: Analysis on Supervised and Unsupervised Data
by: Rallapalli, Swati, et al.
Published: (2025)
Anomaly Detection of Tabular Data Using LLMs
by: Li, Aodong, et al.
Published: (2024)
AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation
by: Hu, Mengkang, et al.
Published: (2024)