| Main Authors: | Abdaljalil, Samir, Pallucchini, Filippo, Seveso, Andrea, Kurban, Hasan, Mercorio, Fabio, Serpedin, Erchin |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2503.03032 |
Similar Items
HalluVerse25: Fine-grained Multilingual Benchmark Dataset for LLM Hallucinations
by: Abdaljalil, Samir, et al.
Published: (2025)
Knowing When Not to Answer: Abstention-Aware Scientific Reasoning
by: Abdaljalil, Samir, et al.
Published: (2026)
SINdex: Semantic INconsistency Index for Hallucination Detection in LLMs
by: Abdaljalil, Samir, et al.
Published: (2025)
Halluverse-M^3: A multitask multilingual benchmark for hallucination in LLMs
by: Abdaljalil, Samir, et al.
Published: (2026)
Evaluating Multilingual and Code-Switched Alignment in LLMs via Synthetic Natural Language Inference
by: Abdaljalil, Samir, et al.
Published: (2025)
Theorem-of-Thought: A Multi-Agent Framework for Abductive, Deductive, and Inductive Reasoning in Language Models
by: Abdaljalil, Samir, et al.
Published: (2025)
Audit-of-Understanding: Posterior-Constrained Inference for Mathematical Reasoning in Language Models
by: Abdaljalil, Samir, et al.
Published: (2025)
Stress-Testing Multimodal Foundation Models for Crystallographic Reasoning
by: Polat, Can, et al.
Published: (2025)
4D Synchronized Fields: Motion-Language Gaussian Splatting for Temporal Scene Understanding
by: Barhdadi, Mohamed Rayan, et al.
Published: (2026)
SCALAR: Quantifying Structural Hallucination, Consistency, and Reasoning Gaps in Materials Foundation Models
by: Polat, Can, et al.
Published: (2026)
Multilingual Prompt Localization for Agent-as-a-Judge: Language and Backbone Sensitivity in Requirement-Level Evaluation
by: Mahmood, Alhasan, et al.
Published: (2026)
Designing Role Vectors to Improve LLM Inference Behaviour
by: Potertì, Daniele, et al.
Published: (2025)
QuantumCanvas: A Multimodal Benchmark for Visual Learning of Atomic Interactions
by: Polat, Can, et al.
Published: (2025)
XAI meets LLMs: A Survey of the Relation between Explainable AI and Large Language Models
by: Cambria, Erik, et al.
Published: (2024)
Disce aut Deficere: Evaluating LLMs Proficiency on the INVALSI Italian Benchmark
by: Mercorio, Fabio, et al.
Published: (2024)
Beyond Atomic Geometry Representations in Materials Science: A Human-in-the-Loop Multimodal Framework
by: Polat, Can, et al.
Published: (2025)
xChemAgents: Agentic AI for Explainable Quantum Chemistry
by: Polat, Can, et al.
Published: (2025)
C2NP: A Benchmark for Learning Scale-Dependent Geometric Invariances in 3D Materials Generation
by: Polat, Can, et al.
Published: (2026)
Understanding the Capabilities of Molecular Graph Neural Networks in Materials Science Through Multimodal Learning and Physical Context Encoding
by: Polat, Can, et al.
Published: (2025)
How Far Can You Grow? Characterizing the Extrapolation Frontier of Graph Generative Models for Materials Science
by: Polat, Can, et al.
Published: (2026)
IRIS: A Real-World Benchmark for Inverse Recovery and Identification of Physical Dynamic Systems from Monocular Video
by: Khanbayov, Rasul, et al.
Published: (2026)
EMPATHIA: Multi-Faceted Human-AI Collaboration for Refugee Integration
by: Barhdadi, Mohamed Rayan, et al.
Published: (2025)
Steering LVLMs via Sparse Autoencoder for Hallucination Mitigation
by: Hua, Zhenglin, et al.
Published: (2025)
SASFT: Sparse Autoencoder-guided Supervised Finetuning to Mitigate Unexpected Code-Switching in LLMs
by: Deng, Boyi, et al.
Published: (2025)
SAFE-MEME: Structured Reasoning Framework for Robust Hate Speech Detection in Memes
by: Nandi, Palash, et al.
Published: (2024)
Sparse Autoencoders are Capable LLM Jailbreak Mitigators
by: Assogba, Yannick, et al.
Published: (2026)
Joint Sensor Deployment and Physics-Informed Graph Transformer for Smart Grid Attack Detection
by: Elnour, Mariam, et al.
Published: (2026)
A Concise Review of Hallucinations in LLMs and their Mitigation
by: Pulkundwar, Parth, et al.
Published: (2025)
Med-HEAL: Analyzing and Mitigating Hallucinations in Medical LLMs with Hallucination-Aware In-Context Learning
by: Liao, Yiming, et al.
Published: (2026)
No One Size Fits All: QueryBandits for Hallucination Mitigation
by: Cho, Nicole, et al.
Published: (2026)
Breaking Bad Tokens: Detoxification of LLMs Using Sparse Autoencoders
by: Goyal, Agam, et al.
Published: (2025)
SCAR: Sparse Conditioned Autoencoders for Concept Detection and Steering in LLMs
by: Härle, Ruben, et al.
Published: (2024)
Rowen: Adaptive Retrieval-Augmented Generation for Hallucination Mitigation in LLMs
by: Ding, Hanxing, et al.
Published: (2024)
QueryBandits for Hallucination Mitigation: Exploiting Semantic Features for No-Regret Rewriting
by: Cho, Nicole, et al.
Published: (2025)
Towards Understanding the Robustness of Sparse Autoencoders
by: Saiyed, Ahson, et al.
Published: (2026)
Interpreting and Steering LLMs with Mutual Information-based Explanations on Sparse Autoencoders
by: Wu, Xuansheng, et al.
Published: (2025)
Uncovering Cross-Linguistic Disparities in LLMs using Sparse Autoencoders
by: Xuan, Richmond Sin Jing, et al.
Published: (2025)
When Personalization Misleads: Understanding and Mitigating Hallucinations in Personalized LLMs
by: Sun, Zhongxiang, et al.
Published: (2026)
Natural Language Querying System Through Entity Enrichment
by: Amavi, Joshua, et al.
Published: (2024)
Mitigating Object Hallucination via Robust Local Perception Search
by: Gao, Zixian, et al.
Published: (2025)