| Main Authors: | Sahoo, Subramanyam, Jain, Vinija, Vats, Saanidhya, Mohapatra, Siddharth, Min, Rui, Chadha, Aman, Chaudhary, Divya |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2512.00552 |
Similar Items
The Reasoning Trap -- Logical Reasoning as a Mechanistic Pathway to Situational Awareness
by: Sahoo, Subramanyam, et al.
Published: (2026)
When Shallow Wins: Silent Failures and the Depth-Accuracy Paradox in Latent Reasoning
by: Sahoo, Subramanyam, et al.
Published: (2026)
I Can't Believe It's Not Robust: Catastrophic Collapse of Safety Classifiers under Embedding Drift
by: Sahoo, Subramanyam, et al.
Published: (2026)
SAHOO: Safeguarded Alignment for High-Order Optimization Objectives in Recursive Self-Improvement
by: Sahoo, Subramanyam, et al.
Published: (2026)
Dial E for Ethical Enforcement: institutional VETO power as a governance primitive
by: Sahoo, Subramanyam, et al.
Published: (2026)
Position: The Complexity of Perfect AI Alignment -- Formalizing the RLHF Trilemma
by: Sahoo, Subramanyam, et al.
Published: (2025)
SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval
by: Mahalingam, Aakash, et al.
Published: (2024)
Multilingual State Space Models for Structured Question Answering in Indic Languages
by: Vats, Arpita, et al.
Published: (2025)
SEPSIS: I Can Catch Your Lies -- A New Paradigm for Deception Detection
by: Rani, Anku, et al.
Published: (2023)
Are Small Language Models Ready to Compete with Large Language Models for Practical Applications?
by: Sinha, Neelabh, et al.
Published: (2024)
Guiding Vision-Language Model Selection for Visual Question-Answering Across Tasks, Domains, and Knowledge Types
by: Sinha, Neelabh, et al.
Published: (2024)
How Culturally Aware are Vision-Language Models?
by: Burda-Lassen, Olena, et al.
Published: (2024)
Exploring the Impact of Large Language Models on Recommender Systems: An Extensive Review
by: Vats, Arpita, et al.
Published: (2024)
Reasoning or Rhetoric? An Empirical Analysis of Moral Reasoning Explanations in Large Language Models
by: Kasat, Aryan, et al.
Published: (2026)
Born With a Silver Spoon? Investigating Socioeconomic Bias in Large Language Models
by: Singh, Smriti, et al.
Published: (2024)
A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications
by: Sahoo, Pranab, et al.
Published: (2024)
A Comprehensive Survey of Hallucination in Large Language, Image, Video and Audio Foundation Models
by: Sahoo, Pranab, et al.
Published: (2024)
Personality Shapes Gender Bias in Persona-Conditioned LLM Narratives Across English and Hindi: An Empirical Investigation
by: Kumar, Tanay, et al.
Published: (2026)
A Comprehensive Survey of Accelerated Generation Techniques in Large Language Models
by: Khoshnoodi, Mahsa, et al.
Published: (2024)
Assessing LLM Reliability on Temporally Recent Open-Domain Questions
by: Krishnappa, Pushwitha, et al.
Published: (2026)
Neural FOXP2 -- Language Specific Neuron Steering for Targeted Language Improvement in LLMs
by: Saha, Anusa, et al.
Published: (2026)
Decoding the Diversity: A Review of the Indic AI Research Landscape
by: KJ, Sankalp, et al.
Published: (2024)
Exploring the Frontier of Vision-Language Models: A Survey of Current Methodologies and Future Directions
by: Ghosh, Akash, et al.
Published: (2024)
The What, Why, and How of Context Length Extension Techniques in Large Language Models -- A Detailed Survey
by: Pawar, Saurav, et al.
Published: (2024)
LLMsAgainstHate @ NLU of Devanagari Script Languages 2025: Hate Speech Detection and Target Identification in Devanagari Languages via Parameter Efficient Fine-Tuning of LLMs
by: Sidibomma, Rushendra, et al.
Published: (2024)
Calibration Collapse Under Sycophancy Fine-Tuning: How Reward Hacking Breaks Uncertainty Quantification in LLMs
by: Sahoo, Subramanyam
Published: (2026)
MAAT: Multi-phase Adapter-Aware Targeted Unlearning
by: Yagnik, Suryash, et al.
Published: (2026)
PHAnToM: Persona-based Prompting Has An Effect on Theory-of-Mind Reasoning in Large Language Models
by: Tan, Fiona Anting, et al.
Published: (2024)
Parameter Efficient Fine Tuning: A Comprehensive Analysis Across Applications
by: Balne, Charith Chandra Sai, et al.
Published: (2024)
TRACEALIGN -- Tracing the Drift: Attributing Alignment Failures to Training-Time Belief Sources in LLMs
by: Das, Amitava, et al.
Published: (2025)
Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models Aligned with Human Cognitive Principles
by: Budagam, Devichand, et al.
Published: (2024)
From Prejudice to Parity: A New Approach to Debiasing Large Language Model Word Embeddings
by: Rakshit, Aishik, et al.
Published: (2024)
SPINAL -- Scaling-law and Preference Integration in Neural Alignment Layers
by: Das, Arion, et al.
Published: (2026)
IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding
by: KJ, Sankalp, et al.
Published: (2025)
PermaFrost-Attack: Stealth Pretraining Seeding(SPS) for planting Logic Landmines During LLM Training
by: Kumar, Harsh, et al.
Published: (2026)
Catch Me If You Can Describe Me: Open-Vocabulary Camouflaged Instance Segmentation with Diffusion
by: Vu, Tuan-Anh, et al.
Published: (2023)
On the Relationship between Sentence Analogy Identification and Sentence Structure Encoding in Large Language Models
by: Wijesiriwardene, Thilini, et al.
Published: (2023)
AMBEDKAR-A Multi-level Bias Elimination through a Decoding Approach with Knowledge Augmentation for Robust Constitutional Alignment of Language Models
by: Mukhopadhyay, Snehasis, et al.
Published: (2025)
Integrating Arithmetic Learning Improves Mathematical Reasoning in Smaller Models
by: Gangwar, Neeraj, et al.
Published: (2025)
Cause and Effect: Can Large Language Models Truly Understand Causality?
by: Ashwani, Swagata, et al.
Published: (2024)