| Main Authors: | He, Jiahang, Ramachandran, Rishi, Ramachandran, Neel, Katakam, Aryan, Zhu, Kevin, Dev, Sunishchal, Panda, Ashwinee, Shrivastava, Aryan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.10688 |
Similar Items
Linearly Decoding Refused Knowledge in Aligned Language Models
by: Shrivastava, Aryan, et al.
Published: (2025)
DICE: A Framework for Dimensional and Contextual Evaluation of Language Models
by: Shrivastava, Aryan, et al.
Published: (2025)
Refusal Tokens: A Simple Way to Calibrate Refusals in Large Language Models
by: Jain, Neel, et al.
Published: (2024)
Measuring Free-Form Decision-Making Inconsistency of Language Models in Military Crisis Simulations
by: Shrivastava, Aryan, et al.
Published: (2024)
SALT: Steering Activations towards Leakage-free Thinking in Chain of Thought
by: Batra, Shourya, et al.
Published: (2025)
Advancing Reasoning in Large Language Models: Promising Methods and Approaches
by: Patil, Avinash, et al.
Published: (2025)
Resource-Aware Arabic LLM Creation: Model Adaptation, Integration, and Multi-Domain Testing
by: Aryan, Prakash
Published: (2024)
Emergent Misalignment via In-Context Learning: Narrow in-context examples can produce broadly misaligned LLMs
by: Afonin, Nikita, et al.
Published: (2025)
Structure First, Reason Next: Enhancing a Large Language Model using Knowledge Graph for Numerical Reasoning in Financial Documents
by: Mishra, Aryan, et al.
Published: (2026)
A Comparative Study of Translation Bias and Accuracy in Multilingual Large Language Models for Cross-Language Claim Verification
by: Singhal, Aryan, et al.
Published: (2024)
Causal Reflection with Language Models
by: Aryan, Abi, et al.
Published: (2025)
AbsenceBench: Language Models Can't Tell What's Missing
by: Fu, Harvey Yiyun, et al.
Published: (2025)
Adapting Biomedical Abstracts into Plain language using Large Language Models
by: Gangavarapu, Haritha, et al.
Published: (2025)
Private Fine-tuning of Large Language Models with Zeroth-order Optimization
by: Tang, Xinyu, et al.
Published: (2024)
English Please: Evaluating Machine Translation with Large Language Models for Multilingual Bug Reports
by: Patil, Avinash, et al.
Published: (2025)
Large Language Models aren't all that you need
by: Holla, Kiran Voderhobli, et al.
Published: (2024)
Know Thyself? On the Incapability and Implications of AI Self-Recognition
by: Bai, Xiaoyan, et al.
Published: (2025)
Signal or Noise? Evaluating Large Language Models in Resume Screening Across Contextual Variations and Human Expert Benchmarks
by: Varshney, Aryan, et al.
Published: (2025)
Amortized Latent Steering: Low-Cost Alternative to Test-Time Optimization
by: Egbuna, Nathan, et al.
Published: (2025)
Alignment-Constrained Dynamic Pruning for LLMs: Identifying and Preserving Alignment-Critical Circuits
by: Patel, Dev, et al.
Published: (2025)
AI Based Font Pair Suggestion Modelling For Graphic Design
by: Singh, Aryan, et al.
Published: (2025)
Multi-Token Prediction via Self-Distillation
by: Kirchenbauer, John, et al.
Published: (2026)
Visualizing and Benchmarking LLM Factual Hallucination Tendencies via Internal State Analysis and Clustering
by: Mao, Nathan, et al.
Published: (2026)
Grounded Concreteness: Human-Like Concreteness Sensitivity in Vision-Language Models
by: Roy, Aryan, et al.
Published: (2026)
The Text Uncanny Valley: Non-Monotonic Performance Degradation in LLM Information Retrieval
by: Tong, Zekai, et al.
Published: (2026)
COMPASS: Context-Modulated PID Attention Steering System for Hallucination Mitigation
by: Sahay, Kenji, et al.
Published: (2025)
Investigating Spatial Attention Bias in Vision-Language Models
by: Chaudhary, Aryan, et al.
Published: (2025)
Is Training Data Quality or Quantity More Impactful to Small Language Model Performance?
by: Sajith, Aryan, et al.
Published: (2024)
LLMs as Debate Partners: Utilizing Genetic Algorithms and Adversarial Search for Adaptive Arguments
by: Aryan, Prakash
Published: (2024)
Illuminate: A novel approach for depression detection with explainable analysis and proactive therapy using prompt engineering
by: Agrawal, Aryan
Published: (2024)
LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation
by: Zhang, Juzheng, et al.
Published: (2025)
Same Evidence, Different Answers: Canonical-Context On-Policy Distillation for Multi-Turn Language Models
by: Lin, Zizhuo, et al.
Published: (2026)
DebateBench: A Challenging Long Context Reasoning Benchmark For Large Language Models
by: Tiwari, Utkarsh, et al.
Published: (2025)
Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory
by: Chhikara, Prateek, et al.
Published: (2025)
Teach LLMs to Phish: Stealing Private Information from Language Models
by: Panda, Ashwinee, et al.
Published: (2024)
TurnBench-MS: A Benchmark for Evaluating Multi-Turn, Multi-Step Reasoning in Large Language Models
by: Zhang, Yiran, et al.
Published: (2025)
ProMoral-Bench: Evaluating Prompting Strategies for Moral Reasoning and Safety in LLMs
by: Thomas, Rohan Subramanian, et al.
Published: (2026)
TRUTH DECAY: Quantifying Multi-Turn Sycophancy in Language Models
by: Liu, Joshua, et al.
Published: (2025)
Align to Structure: Aligning Large Language Models with Structural Information
by: Kim, Zae Myung, et al.
Published: (2025)
CARE-RFT: Confidence-Anchored Reinforcement Finetuning for Reliable Reasoning in Large Language Models
by: Li, Shuozhe, et al.
Published: (2026)