:: Library Catalog

Bibliographic Details
Main Authors:	Castro-Gonzalez, Leonardo; Chung, Yi-Ling; Kirk, Hannah Rose; Francis, John; Williams, Angus R.; Johansson, Pica; Bright, Jonathan
Format:	Preprint
Published:	2024
Subjects:	Computation and Language I.2.7; J.4
Online Access:	https://arxiv.org/abs/2401.12295

Similar Items

FathomGPT: A Natural Language Interface for Interactively Exploring Ocean Science Data
by: Khanal, Nabin, et al.
Published: (2024)

Disclosure By Design: Identity Transparency as a Behavioural Property of Conversational AI Models
by: Gausen, Anna, et al.
Published: (2026)

Talk is Cheap, Communication is Hard: Dynamic Grounding Failures and Repair in Multi-Agent Negotiation
by: Yao, Yiheng, et al.
Published: (2026)

Socially Responsible Data for Large Multilingual Language Models
by: Smart, Andrew, et al.
Published: (2024)

Only What's Necessary: Pareto Optimal Data Minimization for Privacy Preserving Video Anomaly Detection
by: Aslam, Nazia, et al.
Published: (2026)

CSSDH: An Ontology for Social Determinants of Health to Operational Continuity of Care Data Interoperability
by: Das, Subhashis, et al.
Published: (2024)

Synthia: Scalable Grounded Persona Generation from Social Media Data
by: Rahimzadeh, Vahid, et al.
Published: (2025)

Applying Cognitive Design Patterns to General LLM Agents
by: Wray, Robert E., et al.
Published: (2025)

A Domain-Agnostic Neurosymbolic Approach for Big Social Data Analysis: Evaluating Mental Health Sentiment on Social Media during COVID-19
by: Khandelwal, Vedant, et al.
Published: (2024)

Improving the OOD Performance of Closed-Source LLMs on NLI Through Strategic Data Selection
by: Stacey, Joe, et al.
Published: (2025)

CSSDM Ontology to Enable Continuity of Care Data Interoperability
by: Das, Subhashis, et al.
Published: (2025)

Transformers Boost the Performance of Decision Trees on Tabular Data across Sample Sizes
by: Jayawardhana, Mayuka, et al.
Published: (2025)

Transparent but Powerful: Explainability, Accuracy, and Generalizability in ADHD Detection from Social Media Data
by: Wiechmann, D., et al.
Published: (2024)

Survey Transfer Learning: Recycling Data with Silicon Responses
by: Amini, Ali
Published: (2025)

Data Analytics for Improving Energy Efficiency in Short Sea Shipping
by: Abuella, Mohamed, et al.
Published: (2024)

Adversarial DPO: Harnessing Harmful Data for Reducing Toxicity with Minimal Impact on Coherence and Evasiveness in Dialogue Agents
by: Kim, San, et al.
Published: (2024)

Random Heterogeneous Neurochaos Learning Architecture for Data Classification
by: S, Remya Ajai A, et al.
Published: (2024)

Mixup Model Merge: Enhancing Model Merging Performance through Randomized Linear Interpolation
by: Zhou, Yue, et al.
Published: (2025)

ChatGPT4PCG 2 Competition: Prompt Engineering for Science Birds Level Generation
by: Taveekitworachai, Pittawat, et al.
Published: (2024)

Eliciting Problem Specifications via Large Language Models
by: Wray, Robert E., et al.
Published: (2024)

BenCSSmark: Making the Social Sciences Count in LLM Research
by: Chatelain, Arnault, et al.
Published: (2026)

XferBench: a Data-Driven Benchmark for Emergent Language
by: Boldt, Brendon, et al.
Published: (2024)

Comparing the Performance of LLMs in RAG-based Question-Answering: A Case Study in Computer Science Literature
by: Dayarathne, Ranul, et al.
Published: (2025)

SQLord: A Robust Enterprise Text-to-SQL Solution via Reverse Data Generation and Workflow Decomposition
by: Cheng, Song, et al.
Published: (2025)

SemanticAgent: A Semantics-Aware Framework for Text-to-SQL Data Synthesis
by: Gao, Qiang, et al.
Published: (2026)

Let Your Graph Do the Talking: Encoding Structured Data for LLMs
by: Perozzi, Bryan, et al.
Published: (2024)

TRiMS: Real-Time Tracking of Minimal Sufficient Length for Efficient Reasoning via RL
by: Bian, Tingcheng, et al.
Published: (2026)

KSHSeek: Data-Driven Approaches to Mitigating and Detecting Knowledge-Shortcut Hallucinations in Generative Models
by: Liu, Zhongxin, et al.
Published: (2025)

Game-RL: Synthesizing Multimodal Verifiable Game Data to Boost VLMs' General Reasoning
by: Tong, Jingqi, et al.
Published: (2025)

propella-1: Multi-Property Document Annotation for LLM Data Curation at Scale
by: Idahl, Maximilian, et al.
Published: (2026)

Improving Retrieval-Augmented Neural Machine Translation with Monolingual Data
by: Bouthors, Maxime, et al.
Published: (2025)

Synthetic Voice Data for Automatic Speech Recognition in African Languages
by: DeRenzi, Brian, et al.
Published: (2025)

Curating Grounded Synthetic Data with Global Perspectives for Equitable AI
by: Törnquist, Elin, et al.
Published: (2024)

PuzzleClone: A DSL-Powered Framework for Synthesizing Verifiable Data
by: Xiong, Kai, et al.
Published: (2025)

EvilGenie: A Reward Hacking Benchmark
by: Gabor, Jonathan, et al.
Published: (2025)

"The Data Says Otherwise"-Towards Automated Fact-checking and Communication of Data Claims
by: Fu, Yu, et al.
Published: (2024)

Listwise Direct Preference Optimization with Multi-Dimensional Preference Mixing
by: Sun, Yuhui, et al.
Published: (2025)

Select or Project? Evaluating Lower-dimensional Vectors for LLM Training Data Explanations
by: Hinterleitner, Lukas, et al.
Published: (2026)

SLAP: Stratified Loss-based Pruning for On-Policy Data-Efficient Instruction Tuning
by: Zou, Run, et al.
Published: (2026)

Vocabulary Transfer for Biomedical Texts: Add Tokens if You Can Not Add Data
by: Singh, Priyanka, et al.
Published: (2022)