| Main Authors: | Roy, Soham; Halder, Sarthakbrata; Bharaty, Arya; Bhaskar, Vaibhav; Sinha, Yash; Kumar, Dhruv; Panda, Srikant; Mandal, Murari |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2606.00497 |
Similar Items
Beyond Accuracy: Diagnosing Algebraic Reasoning Failures in LLMs Across Nine Complexity Dimensions
by: Patil, Parth, et al.
Published: (2026)
Measuring Representation Robustness in Large Language Models for Geometry
by: Jawandhia, Vedant, et al.
Published: (2026)
Distill to Delete: Unlearning in Graph Networks with Knowledge Distillation
by: Sinha, Yash, et al.
Published: (2023)
Multi-Modal Recommendation Unlearning for Legal, Licensing, and Modality Constraints
by: Sinha, Yash, et al.
Published: (2024)
UnStar: Unlearning with Self-Taught Anti-Sample Reasoning for LLMs
by: Sinha, Yash, et al.
Published: (2024)
How to Trick Your AI TA: A Systematic Study of Academic Jailbreaking in LLM Code Evaluation
by: Sahoo, Devanshu, et al.
Published: (2025)
ScamFerret: Detecting Scam Websites Autonomously with Large Language Models
by: Nakano, Hiroki, et al.
Published: (2025)
CricBench: A Multilingual Benchmark for Evaluating LLMs in Cricket Analytics
by: Agarwal, Parth, et al.
Published: (2025)
Guardians of Generation: Dynamic Inference-Time Copyright Shielding with Adaptive Guidance for AI Image Generation
by: Roy, Soham, et al.
Published: (2025)
The Compliance Paradox: Semantic-Instruction Decoupling in Automated Academic Code Evaluation
by: Sahoo, Devanshu, et al.
Published: (2026)
When Reject Turns into Accept: Quantifying the Vulnerability of LLM-Based Scientific Reviewers to Indirect Prompt Injection
by: Sahoo, Devanshu, et al.
Published: (2025)
Step-by-Step Reasoning Attack: Revealing 'Erased' Knowledge in Large Language Models
by: Sinha, Yash, et al.
Published: (2025)
LLM-as-a-Judge for Time Series Explanations
by: Sivalingam, Preetham, et al.
Published: (2026)
BITS Pilani at SemEval-2026 Task 9: Structured Supervised Fine-Tuning with DPO Refinement for Polarization Detection
by: Gupta, Atharva, et al.
Published: (2026)
WebPII: Benchmarking Visual PII Detection for Computer-Use Agents
by: Zhao, Nathan
Published: (2026)
Measuring Web Accessibility Dimensions: An Evaluation of Understandability and Robustness in IIT Library Websites
by: Panda, Subhajit, et al.
Published: (2026)
Say It Differently: Linguistic Styles as Jailbreak Vectors
by: Panda, Srikant, et al.
Published: (2025)
OrgAccess: A Benchmark for Role Based Access Control in Organization Scale LLMs
by: Sanyal, Debdeep, et al.
Published: (2025)
Discovering Universal Activation Directions for PII Leakage in Language Models
by: Marchyok, Leo, et al.
Published: (2026)
Confidence is Not Competence
by: Sanyal, Debdeep, et al.
Published: (2025)
time2time: Causal Intervention in Hidden States to Simulate Rare Events in Time Series Foundation Models
by: Sanyal, Debdeep, et al.
Published: (2025)
A trailing lognormal approximation of the Lyman-$α$ forest: comparison with full hydrodynamic simulations at $2.2\leq z\leq 2.7$
by: Arya, Bhaskar
Published: (2025)
Agents Are All You Need for LLM Unlearning
by: Sanyal, Debdeep, et al.
Published: (2025)
Policy Optimization Prefers The Path of Least Resistance
by: Sanyal, Debdeep, et al.
Published: (2025)
LOKI: Proactively Discovering Online Scam Websites by Mining Toxic Search Queries
by: Paudel, Pujan, et al.
Published: (2025)
Chain-of-Sanitized-Thoughts: Plugging PII Leakage in CoT of Large Reasoning Models
by: Das, Arghyadeep, et al.
Published: (2026)
PII Jailbreaking in LLMs via Activation Steering Reveals Personal Information Leakage
by: Nakka, Krishna Kanth, et al.
Published: (2025)
ScamSweeper: Detecting Illegal Accounts in Web3 Scams via Transactions Analysis
by: Li, Xiaoqi, et al.
Published: (2025)
FS-DAG: Few Shot Domain Adapting Graph Networks for Visually Rich Document Understanding
by: Agarwal, Amit, et al.
Published: (2025)
Covariance matrices for the Lyman-$α$ forest using the lognormal approximation
by: Arya, Bhaskar, et al.
Published: (2023)
A Theoretical Analysis of Soft-Label vs Hard-Label Training in Neural Networks
by: Mandal, Saptarshi, et al.
Published: (2024)
Convergence of Distributionally Robust Q-Learning with Linear Function Approximation
by: Mandal, Saptarshi, et al.
Published: (2025)
ReviewEval: An Evaluation Framework for AI-Generated Reviews
by: Garg, Madhav Krishan, et al.
Published: (2025)
Auditing Disability Representation in Vision-Language Models
by: Panda, Srikant, et al.
Published: (2026)
AccessEval: Benchmarking Disability Bias in Large Language Models
by: Panda, Srikant, et al.
Published: (2025)
PATCH: Mitigating PII Leakage in Language Models with Privacy-Aware Targeted Circuit PatcHing
by: Hughes, Anthony, et al.
Published: (2025)
Purrturbed but Stable: Human-Cat Invariant Representations Across CNNs, ViTs and Self-Supervised ViTs
by: Shah, Arya, et al.
Published: (2025)
In Search of Goodness: Large Scale Benchmarking of Goodness Functions for the Forward-Forward Algorithm
by: Shah, Arya, et al.
Published: (2025)
AntiDote: Bi-level Adversarial Training for Tamper-Resistant LLMs
by: Sanyal, Debdeep, et al.
Published: (2025)
NagaNLP: Bootstrapping NLP for Low-Resource Nagamese Creole with Human-in-the-Loop Synthetic Data
by: Maiti, Agniva, et al.
Published: (2025)