Library Catalog

Bibliographic Details
Main Authors:	Abdaljalil, Samir; Pallucchini, Filippo; Seveso, Andrea; Kurban, Hasan; Mercorio, Fabio; Serpedin, Erchin
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2503.03032

Similar Items

HalluVerse25: Fine-grained Multilingual Benchmark Dataset for LLM Hallucinations
by: Abdaljalil, Samir, et al.
Published: (2025)

Knowing When Not to Answer: Abstention-Aware Scientific Reasoning
by: Abdaljalil, Samir, et al.
Published: (2026)

SINdex: Semantic INconsistency Index for Hallucination Detection in LLMs
by: Abdaljalil, Samir, et al.
Published: (2025)

Halluverse-M^3: A multitask multilingual benchmark for hallucination in LLMs
by: Abdaljalil, Samir, et al.
Published: (2026)

Evaluating Multilingual and Code-Switched Alignment in LLMs via Synthetic Natural Language Inference
by: Abdaljalil, Samir, et al.
Published: (2025)

Theorem-of-Thought: A Multi-Agent Framework for Abductive, Deductive, and Inductive Reasoning in Language Models
by: Abdaljalil, Samir, et al.
Published: (2025)

Audit-of-Understanding: Posterior-Constrained Inference for Mathematical Reasoning in Language Models
by: Abdaljalil, Samir, et al.
Published: (2025)

Stress-Testing Multimodal Foundation Models for Crystallographic Reasoning
by: Polat, Can, et al.
Published: (2025)

4D Synchronized Fields: Motion-Language Gaussian Splatting for Temporal Scene Understanding
by: Barhdadi, Mohamed Rayan, et al.
Published: (2026)

SCALAR: Quantifying Structural Hallucination, Consistency, and Reasoning Gaps in Materials Foundation Models
by: Polat, Can, et al.
Published: (2026)

Multilingual Prompt Localization for Agent-as-a-Judge: Language and Backbone Sensitivity in Requirement-Level Evaluation
by: Mahmood, Alhasan, et al.
Published: (2026)

Designing Role Vectors to Improve LLM Inference Behaviour
by: Potertì, Daniele, et al.
Published: (2025)

QuantumCanvas: A Multimodal Benchmark for Visual Learning of Atomic Interactions
by: Polat, Can, et al.
Published: (2025)

XAI meets LLMs: A Survey of the Relation between Explainable AI and Large Language Models
by: Cambria, Erik, et al.
Published: (2024)

Disce aut Deficere: Evaluating LLMs Proficiency on the INVALSI Italian Benchmark
by: Mercorio, Fabio, et al.
Published: (2024)

Beyond Atomic Geometry Representations in Materials Science: A Human-in-the-Loop Multimodal Framework
by: Polat, Can, et al.
Published: (2025)

xChemAgents: Agentic AI for Explainable Quantum Chemistry
by: Polat, Can, et al.
Published: (2025)

C2NP: A Benchmark for Learning Scale-Dependent Geometric Invariances in 3D Materials Generation
by: Polat, Can, et al.
Published: (2026)

Understanding the Capabilities of Molecular Graph Neural Networks in Materials Science Through Multimodal Learning and Physical Context Encoding
by: Polat, Can, et al.
Published: (2025)

How Far Can You Grow? Characterizing the Extrapolation Frontier of Graph Generative Models for Materials Science
by: Polat, Can, et al.
Published: (2026)

IRIS: A Real-World Benchmark for Inverse Recovery and Identification of Physical Dynamic Systems from Monocular Video
by: Khanbayov, Rasul, et al.
Published: (2026)

EMPATHIA: Multi-Faceted Human-AI Collaboration for Refugee Integration
by: Barhdadi, Mohamed Rayan, et al.
Published: (2025)

Steering LVLMs via Sparse Autoencoder for Hallucination Mitigation
by: Hua, Zhenglin, et al.
Published: (2025)

SASFT: Sparse Autoencoder-guided Supervised Finetuning to Mitigate Unexpected Code-Switching in LLMs
by: Deng, Boyi, et al.
Published: (2025)

SAFE-MEME: Structured Reasoning Framework for Robust Hate Speech Detection in Memes
by: Nandi, Palash, et al.
Published: (2024)

Sparse Autoencoders are Capable LLM Jailbreak Mitigators
by: Assogba, Yannick, et al.
Published: (2026)

Joint Sensor Deployment and Physics-Informed Graph Transformer for Smart Grid Attack Detection
by: Elnour, Mariam, et al.
Published: (2026)

A Concise Review of Hallucinations in LLMs and their Mitigation
by: Pulkundwar, Parth, et al.
Published: (2025)

Med-HEAL: Analyzing and Mitigating Hallucinations in Medical LLMs with Hallucination-Aware In-Context Learning
by: Liao, Yiming, et al.
Published: (2026)

No One Size Fits All: QueryBandits for Hallucination Mitigation
by: Cho, Nicole, et al.
Published: (2026)

Breaking Bad Tokens: Detoxification of LLMs Using Sparse Autoencoders
by: Goyal, Agam, et al.
Published: (2025)

SCAR: Sparse Conditioned Autoencoders for Concept Detection and Steering in LLMs
by: Härle, Ruben, et al.
Published: (2024)

Rowen: Adaptive Retrieval-Augmented Generation for Hallucination Mitigation in LLMs
by: Ding, Hanxing, et al.
Published: (2024)

QueryBandits for Hallucination Mitigation: Exploiting Semantic Features for No-Regret Rewriting
by: Cho, Nicole, et al.
Published: (2025)

Towards Understanding the Robustness of Sparse Autoencoders
by: Saiyed, Ahson, et al.
Published: (2026)

Interpreting and Steering LLMs with Mutual Information-based Explanations on Sparse Autoencoders
by: Wu, Xuansheng, et al.
Published: (2025)

Uncovering Cross-Linguistic Disparities in LLMs using Sparse Autoencoders
by: Xuan, Richmond Sin Jing, et al.
Published: (2025)

When Personalization Misleads: Understanding and Mitigating Hallucinations in Personalized LLMs
by: Sun, Zhongxiang, et al.
Published: (2026)

Natural Language Querying System Through Entity Enrichment
by: Amavi, Joshua, et al.
Published: (2024)

Mitigating Object Hallucination via Robust Local Perception Search
by: Gao, Zixian, et al.
Published: (2025)