:: Library Catalog

Bibliographic Details
Main Authors:	Gao, Joshua; Pham, Quoc Huy; Varghese, Subin; Saurav, Silwal; Hoskere, Vedhus
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2511.04502

Similar Items

View-Invariant Pixelwise Anomaly Detection in Multi-object Scenes with Adaptive View Synthesis
by: Varghese, Subin, et al.
Published: (2024)

BridgeEQA: Virtual Embodied Agents for Real Bridge Inspections
by: Varghese, Subin, et al.
Published: (2025)

ViewDelta: Scaling Scene Change Detection through Text-Conditioning
by: Varghese, Subin, et al.
Published: (2024)

Multiclass Post-Earthquake Building Assessment Integrating High-Resolution Optical and SAR Satellite Imagery, Ground Motion, and Soil Data with Transformers
by: Singh, Deepank, et al.
Published: (2024)

RAVine: Reality-Aligned Evaluation for Agentic Search
by: Xu, Yilong, et al.
Published: (2025)

Domain-Specific Data Generation Framework for RAG Adaptation
by: Tian, Chris Xing, et al.
Published: (2025)

RAFT: Adapting Language Model to Domain Specific RAG
by: Zhang, Tianjun, et al.
Published: (2024)

GraphRAG-Bench: Challenging Domain-Specific Reasoning for Evaluating Graph Retrieval-Augmented Generation
by: Xiao, Yilin, et al.
Published: (2025)

Transparent Reference-free Automated Evaluation of Open-Ended User Survey Responses
by: An, Subin, et al.
Published: (2025)

Towards Efficient Large Language Models for Scientific Text: A Review
by: To, Huy Quoc, et al.
Published: (2024)

Vision-Based Adaptive Robotics for Autonomous Surface Crack Repair
by: Genova, Joshua, et al.
Published: (2024)

From RAG to Agentic RAG for Faithful Islamic Question Answering
by: Bhatia, Gagan, et al.
Published: (2026)

Agentic Adversarial QA for Improving Domain-Specific LLMs
by: Grari, Vincent, et al.
Published: (2026)

Chain-of-Rank: Enhancing Large Language Models for Domain-Specific RAG in Edge Device
by: Lee, Juntae, et al.
Published: (2025)

OASES: Outcome-Aligned Search-Evaluation Co-Training for Agentic Search
by: Zhang, Erhan, et al.
Published: (2026)

Towards AI Evaluation in Domain-Specific RAG Systems: The AgriHubi Case Study
by: Hasan, Md. Toufique, et al.
Published: (2026)

Excite, Attend and Segment (EASe): Domain-Agnostic Fine-Grained Mask Discovery with Feature Calibration and Self-Supervised Upsampling
by: Singh, Deepank, et al.
Published: (2026)

Automated Benchmark Generation from Domain Guidelines Informed by Bloom's Taxonomy
by: Chen, Si, et al.
Published: (2026)

LLMs Judge Themselves: A Game-Theoretic Framework for Human-Aligned Evaluation
by: Yang, Gao, et al.
Published: (2025)

Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs
by: Li, Yangning, et al.
Published: (2025)

Agentic CLEAR: Automating Multi-Level Evaluation of LLM Agents
by: Yehudai, Asaf, et al.
Published: (2026)

Evaluating ChatGPT on Nuclear Domain-Specific Data
by: Anwar, Muhammad, et al.
Published: (2024)

Verus-SpecGym: An Agentic Environment for Evaluating Specification Autoformalization
by: Agarwal, Anmol, et al.
Published: (2026)

DO-RAG: A Domain-Specific QA Framework Using Knowledge Graph-Enhanced Retrieval-Augmented Generation
by: Opoku, David Osei, et al.
Published: (2025)

Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG
by: Singh, Aditi, et al.
Published: (2025)

CORAL: Adaptive Retrieval Loop for Culturally-Aligned Multilingual RAG
by: Lee, Nayeon, et al.
Published: (2026)

JADE: Bridging the Strategic-Operational Gap in Dynamic Agentic RAG
by: Chen, Yiqun, et al.
Published: (2026)

FormalAlign: Automated Alignment Evaluation for Autoformalization
by: Lu, Jianqiao, et al.
Published: (2024)

LalaEval: A Holistic Human Evaluation Framework for Domain-Specific Large Language Models
by: Sun, Chongyan, et al.
Published: (2024)

Instance Segmentation of Reinforced Concrete Bridges with Synthetic Point Clouds
by: Rahman, Asad Ur, et al.
Published: (2024)

ASTRID -- An Automated and Scalable TRIaD for the Evaluation of RAG-based Clinical Question Answering Systems
by: Chowdhury, Mohita, et al.
Published: (2025)

Efficient and Transferable Agentic Knowledge Graph RAG via Reinforcement Learning
by: Lin, Junhong, et al.
Published: (2025)

Reinforcement Learning for Optimizing RAG for Domain Chatbots
by: Kulkarni, Mandar, et al.
Published: (2024)

RAG-QA Arena: Evaluating Domain Robustness for Long-form Retrieval Augmented Question Answering
by: Han, Rujun, et al.
Published: (2024)

Anomaly Detection in Human Language via Meta-Learning: A Few-Shot Approach
by: Singla, Saurav, et al.
Published: (2025)

From Guidelines to Guarantees: A Graph-Based Evaluation Harness for Domain-Specific Evaluation of LLMs
by: Lundin, Jessica M., et al.
Published: (2025)

Toward Subtrait-Level Model Explainability in Automated Writing Evaluation
by: Andrade-Lotero, Alejandro, et al.
Published: (2025)

Evaluating Causal Explanation in Medical Reports with LLM-Based and Human-Aligned Metrics
by: Cho, Yousang, et al.
Published: (2025)

SLMEval: Entropy-Based Calibration for Human-Aligned Evaluation of Large Language Models
by: Daynauth, Roland, et al.
Published: (2025)

BoRP: Bootstrapped Regression Probing for Scalable and Human-Aligned LLM Evaluation
by: Sun, Peng, et al.
Published: (2026)