Library Catalog

Bibliographic Details
Main Authors:	Das, Amit; Rahgouy, Mostafa; Feng, Dongji; Zhang, Zheng; Bhattacharya, Tathagata; Raychawdhary, Nilanjana; Jamshidi, Fatemeh; Jain, Vinija; Chadha, Aman; Sandage, Mary; Pope, Lauramarie; Dozier, Gerry; Seals, Cheryl
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2403.02472

Similar Items

Investigating Annotator Bias in Large Language Models for Hate Speech Detection
by: Das, Amit, et al.
Published: (2024)

Investigating Hallucination in Conversations for Low Resource Languages
by: Das, Amit, et al.
Published: (2025)

Towards Effective Authorship Attribution: Integrating Class-Incremental Learning
by: Rahgouy, Mostafa, et al.
Published: (2024)

Assessing LLM Reliability on Temporally Recent Open-Domain Questions
by: Krishnappa, Pushwitha, et al.
Published: (2026)

TRACEALIGN -- Tracing the Drift: Attributing Alignment Failures to Training-Time Belief Sources in LLMs
by: Das, Amitava, et al.
Published: (2025)

Guiding Vision-Language Model Selection for Visual Question-Answering Across Tasks, Domains, and Knowledge Types
by: Sinha, Neelabh, et al.
Published: (2024)

Are Small Language Models Ready to Compete with Large Language Models for Practical Applications?
by: Sinha, Neelabh, et al.
Published: (2024)

When Shallow Wins: Silent Failures and the Depth-Accuracy Paradox in Latent Reasoning
by: Sahoo, Subramanyam, et al.
Published: (2026)

Reasoning or Rhetoric? An Empirical Analysis of Moral Reasoning Explanations in Large Language Models
by: Kasat, Aryan, et al.
Published: (2026)

Dial E for Ethical Enforcement: institutional VETO power as a governance primitive
by: Sahoo, Subramanyam, et al.
Published: (2026)

The Reasoning Trap -- Logical Reasoning as a Mechanistic Pathway to Situational Awareness
by: Sahoo, Subramanyam, et al.
Published: (2026)

Born With a Silver Spoon? Investigating Socioeconomic Bias in Large Language Models
by: Singh, Smriti, et al.
Published: (2024)

AlignGuard-LoRA: Alignment-Preserving Fine-Tuning via Fisher-Guided Decomposition and Riemannian-Geodesic Collision Regularization
by: Das, Amitava, et al.
Published: (2025)

SAHOO: Safeguarded Alignment for High-Order Optimization Objectives in Recursive Self-Improvement
by: Sahoo, Subramanyam, et al.
Published: (2026)

I Can't Believe It's Not Robust: Catastrophic Collapse of Safety Classifiers under Embedding Drift
by: Sahoo, Subramanyam, et al.
Published: (2026)

Position: The Complexity of Perfect AI Alignment -- Formalizing the RLHF Trilemma
by: Sahoo, Subramanyam, et al.
Published: (2025)

Exploring the Impact of Large Language Models on Recommender Systems: An Extensive Review
by: Vats, Arpita, et al.
Published: (2024)

The Civilising Offensive
Published: (2022)

LLM for Complex Reasoning Task: An Exploratory Study in Fermi Problems
by: Liu, Zishuo, et al.
Published: (2025)

How Culturally Aware are Vision-Language Models?
by: Burda-Lassen, Olena, et al.
Published: (2024)

Offensive Robot Cybersecurity
by: Mayoral-Vilches, Víctor
Published: (2025)

SOMALIA: Puntland Offensive
Published: (2025)

SPINAL -- Scaling-law and Preference Integration in Neural Alignment Layers
by: Das, Arion, et al.
Published: (2026)

Exploring the Frontier of Vision-Language Models: A Survey of Current Methodologies and Future Directions
by: Ghosh, Akash, et al.
Published: (2024)

D-STEER - Preference Alignment Techniques Learn to Behave, not to Believe -- Beneath the Surface, DPO as Steering Vector Perturbation in Activation Space
by: Raina, Samarth, et al.
Published: (2025)

AlignMerge - Alignment-Preserving Large Language Model Merging via Fisher-Guided Geometric Constraints
by: Roy, Aniruddha, et al.
Published: (2025)

A Comprehensive Survey of Accelerated Generation Techniques in Large Language Models
by: Khoshnoodi, Mahsa, et al.
Published: (2024)

Neural FOXP2 -- Language Specific Neuron Steering for Targeted Language Improvement in LLMs
by: Saha, Anusa, et al.
Published: (2026)

ECLIPTICA -- A Framework for Switchable LLM Alignment via CITA - Contrastive Instruction-Tuned Alignment
by: Wanaskar, Kapil, et al.
Published: (2026)

Personality Shapes Gender Bias in Persona-Conditioned LLM Narratives Across English and Hindi: An Empirical Investigation
by: Kumar, Tanay, et al.
Published: (2026)

Multilingual State Space Models for Structured Question Answering in Indic Languages
by: Vats, Arpita, et al.
Published: (2025)

MOD-X: A Modular Open Decentralized eXchange Framework proposal for Heterogeneous Interoperable Artificial Intelligence Agents
by: Ioannides, Georgios, et al.
Published: (2025)

Decoding the Diversity: A Review of the Indic AI Research Landscape
by: KJ, Sankalp, et al.
Published: (2024)

Refining Text-to-Image Generation: Towards Accurate Training-Free Glyph-Enhanced Image Generation
by: Lakhanpal, Sanyam, et al.
Published: (2024)

Offensive Lineup Analysis in Basketball with Clustering Players Based on Shooting Style and Offensive Role
by: Yamada, Kazuhiro, et al.
Published: (2024)

What is the AGI in Offensive Security ?
by: Cho, Youngwoong
Published: (2026)

Responsible Development of Offensive AI
by: Marinelli, Ryan
Published: (2025)

SOMALIA: Twin Offensives Continue
Published: (2026)

SOMALIA: New Puntland Offensive
Published: (2024)

The Public Library: Offensive by Design.
by: Sumerford, Steve
Published: (1987)