:: Library Catalog

Bibliographic Details
Main Authors:	Pichlmair, Martin; Raj, Riddhi; Putney, Charlene
Format:	Preprint
Published:	2024
Subjects:	Artificial Intelligence; Computation and Language; 68T42 (Primary), 68T50 (Secondary); I.2.7; J.5
Online Access:	https://arxiv.org/abs/2408.11574

Similar Items

Increasing the Difficulty of Automatically Generated Questions via Reinforcement Learning with Synthetic Preference
by: Thorne, William, et al.
Published: (2024)

Project Synapse: A Hierarchical Multi-Agent Framework with Hybrid Memory for Autonomous Resolution of Last-Mile Delivery Disruptions
by: Yadav, Arin Gopalan, et al.
Published: (2026)

A Survey on Collaborating Small and Large Language Models for Performance, Cost-effectiveness, Cloud-edge Privacy, and Trustworthiness
by: Wang, Fali, et al.
Published: (2025)

Do LLMs have a Gender (Entropy) Bias?
by: Prabhune, Sonal, et al.
Published: (2025)

A Multi-Agent Framework for Medical AI: Leveraging Fine-Tuned GPT, LLaMA, and DeepSeek R1 for Evidence-Based and Bias-Aware Clinical Query Processing
by: Nourmohammadi, Naeimeh, et al.
Published: (2026)

Data and AI governance: Promoting equity, ethics, and fairness in large language models
by: Abhishek, Alok, et al.
Published: (2025)

Do Reasoning Models Enhance Embedding Models?
by: Chan, Wun Yu, et al.
Published: (2026)

SHARP: Social Harm Analysis via Risk Profiles for Measuring Inequities in Large Language Models
by: Abhishek, Alok, et al.
Published: (2026)

BEATS: Bias Evaluation and Assessment Test Suite for Large Language Models
by: Abhishek, Alok, et al.
Published: (2025)

Reasoning Promotes Robustness in Theory of Mind Tasks
by: de Haan, Ian B., et al.
Published: (2026)

Make Literature-Based Discovery Great Again through Reproducible Pipelines
by: Cestnik, Bojan, et al.
Published: (2025)

Semantic Retention and Extreme Compression in LLMs: Can We Have Both?
by: Laborde, Stanislas, et al.
Published: (2025)

Symphonym: Universal Phonetic Embeddings for Cross-Script Name Matching
by: Gadd, Stephen
Published: (2026)

Communicative Agents for Slideshow Storytelling Video Generation based on LLMs
by: Fan, Jingxing, et al.
Published: (2025)

Supporting software engineering tasks with agentic AI: Demonstration on document retrieval and test scenario generation
by: Kica, Marian, et al.
Published: (2026)

Tool Receipts, Not Zero-Knowledge Proofs: Practical Hallucination Detection for AI Agents
by: Basu, Abhinaba
Published: (2026)

Atyaephyra at SemEval-2025 Task 4: Low-Rank Negative Preference Optimization
by: Bronec, Jan, et al.
Published: (2025)

A Comprehensive Survey of Small Language Models in the Era of Large Language Models: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness
by: Wang, Fali, et al.
Published: (2024)

Large Language Models are Inconsistent and Biased Evaluators
by: Stureborg, Rickard, et al.
Published: (2024)

Uncovering Uncertainty in Transformer Inference
by: Brothers, Greyson, et al.
Published: (2024)

InhibiDistilbert: Knowledge Distillation for a ReLU and Addition-based Transformer
by: Zhang, Tony, et al.
Published: (2025)

Recent Advances and Future Directions in Literature-Based Discovery
by: Kastrin, Andrej, et al.
Published: (2025)

What is Wrong with Language Models that Can Not Tell a Story?
by: Yamshchikov, Ivan P., et al.
Published: (2022)

AURA: Agent for Understanding, Reasoning, and Automated Tool Use in Voice-Driven Tasks
by: Maben, Leander Melroy, et al.
Published: (2025)

It's 2025 -- Narrative Learning is the new baseline to beat for explainable machine learning
by: Baker, Gregory D.
Published: (2025)

Semi-automated extraction of research topics and trends from NCI funding in radiological sciences from 2000-2020
by: Nguyen, Mark, et al.
Published: (2023)

Quantum NLP models on Natural Language Inference
by: Sun, Ling, et al.
Published: (2025)

A Collaborative Content Moderation Framework for Toxicity Detection based on Conformalized Estimates of Annotation Disagreement
by: Villate-Castillo, Guillermo, et al.
Published: (2024)

From Black Box to Glass Box: Cross-Model ASR Disagreement to Prioritize Review in Ambient AI Scribe Documentation
by: Karbalaie, Abdolamir, et al.
Published: (2026)

NeuroState-Bench: A Human-Calibrated Benchmark for Commitment Integrity in LLM Agent Profiles
by: Jia, Xiao
Published: (2026)

Approaches to Semantic Textual Similarity in Slovak Language: From Algorithms to Transformers
by: Radosky, Lukas, et al.
Published: (2026)

MORQA: Benchmarking Evaluation Metrics for Medical Open-Ended Question Answering
by: Yim, Wen-wai, et al.
Published: (2025)

DanceHA: A Multi-Agent Framework for Document-Level Aspect-Based Sentiment Analysis
by: Wang, Lei, et al.
Published: (2026)

Tailoring Vaccine Messaging with Common-Ground Opinions
by: Stureborg, Rickard, et al.
Published: (2024)

Generative AI Models: Opportunities and Risks for Industry and Authorities
by: Alt, Tobias, et al.
Published: (2024)

Judgment2vec: Apply Graph Analytics to Searching and Recommendation of Similar Judgments
by: Shao, Hsuan-Lei
Published: (2024)

Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing
by: Wang, Huanqian, et al.
Published: (2024)

CogniLoad: A Synthetic Natural Language Reasoning Benchmark With Tunable Length, Intrinsic Difficulty, and Distractor Density
by: Kaiser, Daniel, et al.
Published: (2025)

RACAS: Controlling Diverse Robots With a Single Agentic System
by: Ashley, Dylan R., et al.
Published: (2026)

A transfer learning approach for automatic conflicts detection in software requirement sentence pairs based on dual encoders
by: Wang, Yizheng, et al.
Published: (2025)