| Main Authors: | Pichlmair, Martin; Raj, Riddhi; Putney, Charlene |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2408.11574 |
Similar Items
Increasing the Difficulty of Automatically Generated Questions via Reinforcement Learning with Synthetic Preference
by: Thorne, William, et al.
Published: (2024)
Project Synapse: A Hierarchical Multi-Agent Framework with Hybrid Memory for Autonomous Resolution of Last-Mile Delivery Disruptions
by: Yadav, Arin Gopalan, et al.
Published: (2026)
A Survey on Collaborating Small and Large Language Models for Performance, Cost-effectiveness, Cloud-edge Privacy, and Trustworthiness
by: Wang, Fali, et al.
Published: (2025)
Do LLMs have a Gender (Entropy) Bias?
by: Prabhune, Sonal, et al.
Published: (2025)
A Multi-Agent Framework for Medical AI: Leveraging Fine-Tuned GPT, LLaMA, and DeepSeek R1 for Evidence-Based and Bias-Aware Clinical Query Processing
by: Nourmohammadi, Naeimeh, et al.
Published: (2026)
Data and AI governance: Promoting equity, ethics, and fairness in large language models
by: Abhishek, Alok, et al.
Published: (2025)
Do Reasoning Models Enhance Embedding Models?
by: Chan, Wun Yu, et al.
Published: (2026)
SHARP: Social Harm Analysis via Risk Profiles for Measuring Inequities in Large Language Models
by: Abhishek, Alok, et al.
Published: (2026)
BEATS: Bias Evaluation and Assessment Test Suite for Large Language Models
by: Abhishek, Alok, et al.
Published: (2025)
Reasoning Promotes Robustness in Theory of Mind Tasks
by: de Haan, Ian B., et al.
Published: (2026)
Make Literature-Based Discovery Great Again through Reproducible Pipelines
by: Cestnik, Bojan, et al.
Published: (2025)
Semantic Retention and Extreme Compression in LLMs: Can We Have Both?
by: Laborde, Stanislas, et al.
Published: (2025)
Symphonym: Universal Phonetic Embeddings for Cross-Script Name Matching
by: Gadd, Stephen
Published: (2026)
Communicative Agents for Slideshow Storytelling Video Generation based on LLMs
by: Fan, Jingxing, et al.
Published: (2025)
Supporting software engineering tasks with agentic AI: Demonstration on document retrieval and test scenario generation
by: Kica, Marian, et al.
Published: (2026)
Tool Receipts, Not Zero-Knowledge Proofs: Practical Hallucination Detection for AI Agents
by: Basu, Abhinaba
Published: (2026)
Atyaephyra at SemEval-2025 Task 4: Low-Rank Negative Preference Optimization
by: Bronec, Jan, et al.
Published: (2025)
A Comprehensive Survey of Small Language Models in the Era of Large Language Models: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness
by: Wang, Fali, et al.
Published: (2024)
Large Language Models are Inconsistent and Biased Evaluators
by: Stureborg, Rickard, et al.
Published: (2024)
Uncovering Uncertainty in Transformer Inference
by: Brothers, Greyson, et al.
Published: (2024)
InhibiDistilbert: Knowledge Distillation for a ReLU and Addition-based Transformer
by: Zhang, Tony, et al.
Published: (2025)
Recent Advances and Future Directions in Literature-Based Discovery
by: Kastrin, Andrej, et al.
Published: (2025)
What is Wrong with Language Models that Can Not Tell a Story?
by: Yamshchikov, Ivan P., et al.
Published: (2022)
AURA: Agent for Understanding, Reasoning, and Automated Tool Use in Voice-Driven Tasks
by: Maben, Leander Melroy, et al.
Published: (2025)
It's 2025 -- Narrative Learning is the new baseline to beat for explainable machine learning
by: Baker, Gregory D.
Published: (2025)
Semi-automated extraction of research topics and trends from NCI funding in radiological sciences from 2000-2020
by: Nguyen, Mark, et al.
Published: (2023)
Quantum NLP models on Natural Language Inference
by: Sun, Ling, et al.
Published: (2025)
A Collaborative Content Moderation Framework for Toxicity Detection based on Conformalized Estimates of Annotation Disagreement
by: Villate-Castillo, Guillermo, et al.
Published: (2024)
From Black Box to Glass Box: Cross-Model ASR Disagreement to Prioritize Review in Ambient AI Scribe Documentation
by: Karbalaie, Abdolamir, et al.
Published: (2026)
NeuroState-Bench: A Human-Calibrated Benchmark for Commitment Integrity in LLM Agent Profiles
by: Jia, Xiao
Published: (2026)
Approaches to Semantic Textual Similarity in Slovak Language: From Algorithms to Transformers
by: Radosky, Lukas, et al.
Published: (2026)
MORQA: Benchmarking Evaluation Metrics for Medical Open-Ended Question Answering
by: Yim, Wen-wai, et al.
Published: (2025)
DanceHA: A Multi-Agent Framework for Document-Level Aspect-Based Sentiment Analysis
by: Wang, Lei, et al.
Published: (2026)
Tailoring Vaccine Messaging with Common-Ground Opinions
by: Stureborg, Rickard, et al.
Published: (2024)
Generative AI Models: Opportunities and Risks for Industry and Authorities
by: Alt, Tobias, et al.
Published: (2024)
Judgment2vec: Apply Graph Analytics to Searching and Recommendation of Similar Judgments
by: Shao, Hsuan-Lei
Published: (2024)
Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing
by: Wang, Huanqian, et al.
Published: (2024)
CogniLoad: A Synthetic Natural Language Reasoning Benchmark With Tunable Length, Intrinsic Difficulty, and Distractor Density
by: Kaiser, Daniel, et al.
Published: (2025)
RACAS: Controlling Diverse Robots With a Single Agentic System
by: Ashley, Dylan R., et al.
Published: (2026)
A transfer learning approach for automatic conflicts detection in software requirement sentence pairs based on dual encoders
by: Wang, Yizheng, et al.
Published: (2025)