| Main Authors: | Gao, Joshua; Pham, Quoc Huy; Varghese, Subin; Saurav, Silwal; Hoskere, Vedhus |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.04502 |
Similar Items
View-Invariant Pixelwise Anomaly Detection in Multi-object Scenes with Adaptive View Synthesis
by: Varghese, Subin, et al.
Published: (2024)
BridgeEQA: Virtual Embodied Agents for Real Bridge Inspections
by: Varghese, Subin, et al.
Published: (2025)
ViewDelta: Scaling Scene Change Detection through Text-Conditioning
by: Varghese, Subin, et al.
Published: (2024)
Multiclass Post-Earthquake Building Assessment Integrating High-Resolution Optical and SAR Satellite Imagery, Ground Motion, and Soil Data with Transformers
by: Singh, Deepank, et al.
Published: (2024)
RAVine: Reality-Aligned Evaluation for Agentic Search
by: Xu, Yilong, et al.
Published: (2025)
Domain-Specific Data Generation Framework for RAG Adaptation
by: Tian, Chris Xing, et al.
Published: (2025)
RAFT: Adapting Language Model to Domain Specific RAG
by: Zhang, Tianjun, et al.
Published: (2024)
GraphRAG-Bench: Challenging Domain-Specific Reasoning for Evaluating Graph Retrieval-Augmented Generation
by: Xiao, Yilin, et al.
Published: (2025)
Transparent Reference-free Automated Evaluation of Open-Ended User Survey Responses
by: An, Subin, et al.
Published: (2025)
Towards Efficient Large Language Models for Scientific Text: A Review
by: To, Huy Quoc, et al.
Published: (2024)
Vision-Based Adaptive Robotics for Autonomous Surface Crack Repair
by: Genova, Joshua, et al.
Published: (2024)
From RAG to Agentic RAG for Faithful Islamic Question Answering
by: Bhatia, Gagan, et al.
Published: (2026)
Agentic Adversarial QA for Improving Domain-Specific LLMs
by: Grari, Vincent, et al.
Published: (2026)
Chain-of-Rank: Enhancing Large Language Models for Domain-Specific RAG in Edge Device
by: Lee, Juntae, et al.
Published: (2025)
OASES: Outcome-Aligned Search-Evaluation Co-Training for Agentic Search
by: Zhang, Erhan, et al.
Published: (2026)
Towards AI Evaluation in Domain-Specific RAG Systems: The AgriHubi Case Study
by: Hasan, Md. Toufique, et al.
Published: (2026)
Excite, Attend and Segment (EASe): Domain-Agnostic Fine-Grained Mask Discovery with Feature Calibration and Self-Supervised Upsampling
by: Singh, Deepank, et al.
Published: (2026)
Automated Benchmark Generation from Domain Guidelines Informed by Bloom's Taxonomy
by: Chen, Si, et al.
Published: (2026)
LLMs Judge Themselves: A Game-Theoretic Framework for Human-Aligned Evaluation
by: Yang, Gao, et al.
Published: (2025)
Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs
by: Li, Yangning, et al.
Published: (2025)
Agentic CLEAR: Automating Multi-Level Evaluation of LLM Agents
by: Yehudai, Asaf, et al.
Published: (2026)
Evaluating ChatGPT on Nuclear Domain-Specific Data
by: Anwar, Muhammad, et al.
Published: (2024)
Verus-SpecGym: An Agentic Environment for Evaluating Specification Autoformalization
by: Agarwal, Anmol, et al.
Published: (2026)
DO-RAG: A Domain-Specific QA Framework Using Knowledge Graph-Enhanced Retrieval-Augmented Generation
by: Opoku, David Osei, et al.
Published: (2025)
Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG
by: Singh, Aditi, et al.
Published: (2025)
CORAL: Adaptive Retrieval Loop for Culturally-Aligned Multilingual RAG
by: Lee, Nayeon, et al.
Published: (2026)
JADE: Bridging the Strategic-Operational Gap in Dynamic Agentic RAG
by: Chen, Yiqun, et al.
Published: (2026)
FormalAlign: Automated Alignment Evaluation for Autoformalization
by: Lu, Jianqiao, et al.
Published: (2024)
LalaEval: A Holistic Human Evaluation Framework for Domain-Specific Large Language Models
by: Sun, Chongyan, et al.
Published: (2024)
Instance Segmentation of Reinforced Concrete Bridges with Synthetic Point Clouds
by: Rahman, Asad Ur, et al.
Published: (2024)
ASTRID -- An Automated and Scalable TRIaD for the Evaluation of RAG-based Clinical Question Answering Systems
by: Chowdhury, Mohita, et al.
Published: (2025)
Efficient and Transferable Agentic Knowledge Graph RAG via Reinforcement Learning
by: Lin, Junhong, et al.
Published: (2025)
Reinforcement Learning for Optimizing RAG for Domain Chatbots
by: Kulkarni, Mandar, et al.
Published: (2024)
RAG-QA Arena: Evaluating Domain Robustness for Long-form Retrieval Augmented Question Answering
by: Han, Rujun, et al.
Published: (2024)
Anomaly Detection in Human Language via Meta-Learning: A Few-Shot Approach
by: Singla, Saurav, et al.
Published: (2025)
From Guidelines to Guarantees: A Graph-Based Evaluation Harness for Domain-Specific Evaluation of LLMs
by: Lundin, Jessica M., et al.
Published: (2025)
Toward Subtrait-Level Model Explainability in Automated Writing Evaluation
by: Andrade-Lotero, Alejandro, et al.
Published: (2025)
Evaluating Causal Explanation in Medical Reports with LLM-Based and Human-Aligned Metrics
by: Cho, Yousang, et al.
Published: (2025)
SLMEval: Entropy-Based Calibration for Human-Aligned Evaluation of Large Language Models
by: Daynauth, Roland, et al.
Published: (2025)
BoRP: Bootstrapped Regression Probing for Scalable and Human-Aligned LLM Evaluation
by: Sun, Peng, et al.
Published: (2026)