Staff View: VIRAASAT: Traversing Novel Paths for Indian Cultural Reasoning :: Library Catalog

Bibliographic Details
Main Authors:	Surana, Harshul Raj; Maji, Arijit; Vats, Aryan; Ghosh, Akash; Saha, Sriparna; Sheth, Amit
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Information Retrieval; I.2.7; H.3.3
Online Access:	https://arxiv.org/abs/2602.18429

_version_	1866911459830661120
author	Surana, Harshul Raj; Maji, Arijit; Vats, Aryan; Ghosh, Akash; Saha, Sriparna; Sheth, Amit
author_facet	Surana, Harshul Raj; Maji, Arijit; Vats, Aryan; Ghosh, Akash; Saha, Sriparna; Sheth, Amit
contents	Large Language Models (LLMs) have made significant progress on reasoning tasks in domains such as mathematics and coding. However, their performance deteriorates on tasks that require rich socio-cultural knowledge and diverse local contexts, particularly those involving Indian culture. Existing cultural benchmarks are (i) manually crafted, (ii) limited to single-hop questions that test factual recall, and (iii) prohibitively costly to scale, leaving this deficiency largely unmeasured. To address this, we introduce VIRAASAT, a novel semi-automated approach for generating a culture-specific multi-hop question-answering dataset for Indian culture. VIRAASAT leverages a Knowledge Graph comprising more than 700 expert-curated cultural artifacts, covering 13 key attributes of Indian culture (history, festivals, etc.). VIRAASAT spans all 28 states and 8 Union Territories, yielding more than 3,200 multi-hop questions that necessitate chained cultural reasoning. We evaluate current State-of-the-Art (SOTA) LLMs on VIRAASAT and identify key limitations in reasoning: fine-tuning on Chain-of-Thought (CoT) traces fails to ground and synthesize low-probability facts. To bridge this gap, we propose a novel framework named Symbolic Chain-of-Manipulation (SCoM). Adapting the Chain-of-Manipulation paradigm, we train the model to simulate atomic Knowledge Graph manipulations internally. SCoM teaches the model to reliably traverse the topological structure of the graph. Experiments with Supervised Fine-Tuning (SFT) demonstrate that SCoM outperforms standard CoT baselines by up to 20%. We release the VIRAASAT dataset along with our findings, laying a strong foundation for building Culturally Aware Reasoning Models.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_18429
institution	arXiv
publishDate	2026
record_format	arxiv
title	VIRAASAT: Traversing Novel Paths for Indian Cultural Reasoning
topic	Computation and Language; Information Retrieval; I.2.7; H.3.3
url	https://arxiv.org/abs/2602.18429
