:: Library Catalog

Bibliographic Details
Main Authors:	Java, Abhinav, Shahid, Simra, Agarwal, Chirag
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Artificial Intelligence; Computation and Language
Online Access:	https://arxiv.org/abs/2411.08506

Similar Items

Towards Understanding the Robustness of Sparse Autoencoders
by: Saiyed, Ahson, et al.
Published: (2026)

Toward Understanding Unlearning Difficulty: A Mechanistic Perspective and Circuit-Guided Difficulty Metric
by: Cheng, Jiali, et al.
Published: (2026)

In-Context Explainers: Harnessing LLMs for Explaining Black Box Models
by: Kroeger, Nicholas, et al.
Published: (2023)

Do Students Debias Like Teachers? On the Distillability of Bias Mitigation Methods
by: Cheng, Jiali, et al.
Published: (2025)

Towards Quantifying Commonsense Reasoning with Mechanistic Insights
by: Joshi, Abhinav, et al.
Published: (2025)

Towards Robust Evaluation of Unlearning in LLMs via Data Transformations
by: Joshi, Abhinav, et al.
Published: (2024)

Agnostic Language Identification and Generation
by: Høgsgaard, Mikael Møller, et al.
Published: (2026)

Meursault as a Data Point
by: Pratap, Abhinav
Published: (2025)

Certifying LLM Safety against Adversarial Prompting
by: Kumar, Aounon, et al.
Published: (2023)

Neural Networks for Learnable and Scalable Influence Estimation of Instruction Fine-Tuning Data
by: Agarwal, Ishika, et al.
Published: (2025)

DiscoveryBench: Towards Data-Driven Discovery with Large Language Models
by: Majumder, Bodhisattwa Prasad, et al.
Published: (2024)

Thinking Fair and Slow: On the Efficacy of Structured Prompts for Debiasing Language Models
by: Furniturewala, Shaz, et al.
Published: (2024)

Feedback-Aware Monte Carlo Tree Search for Efficient Information Seeking in Goal-Oriented Conversations
by: Chopra, Harshita, et al.
Published: (2025)

How Reliable are Causal Probing Interventions?
by: Canby, Marc, et al.
Published: (2024)

LEAST: "Local" text-conditioned image style transfer
by: Singh, Silky, et al.
Published: (2024)

Rethinking Explainability in the Era of Multimodal AI
by: Agarwal, Chirag
Published: (2025)

The Right Time Matters: Data Arrangement Affects Zero-Shot Generalization in Instruction Tuning
by: He, Bingxiang, et al.
Published: (2024)

COLD: Causal reasOning in cLosed Daily activities
by: Joshi, Abhinav, et al.
Published: (2024)

Beyond Components: Singular Vector-Based Interpretability of Transformer Circuits
by: Ahmad, Areeb, et al.
Published: (2025)

Geometry of Decision Making in Language Models
by: Joshi, Abhinav, et al.
Published: (2025)

Calibration Across Layers: Understanding Calibration Evolution in LLMs
by: Joshi, Abhinav, et al.
Published: (2025)

Exploring Facets of Language Generation in the Limit
by: Charikar, Moses, et al.
Published: (2024)

Pareto-optimal Non-uniform Language Generation
by: Charikar, Moses, et al.
Published: (2025)

Data-driven Discovery with Large Generative Models
by: Majumder, Bodhisattwa Prasad, et al.
Published: (2024)

Calibrating LLMs for Text-to-SQL Parsing by Leveraging Sub-clause Frequencies
by: Liu, Terrance, et al.
Published: (2025)

Operationalizing the Blueprint for an AI Bill of Rights: Recommendations for Practitioners, Researchers, and Policy Makers
by: Oesterling, Alex, et al.
Published: (2024)

AutoEval Done Right: Using Synthetic Data for Model Evaluation
by: Boyeau, Pierre, et al.
Published: (2024)

Languages are Modalities: Cross-Lingual Alignment via Encoder Injection
by: Agarwal, Rajan, et al.
Published: (2025)

TrICy: Trigger-guided Data-to-text Generation with Intent aware Attention-Copy
by: Agarwal, Vibhav, et al.
Published: (2024)

Operationalizing AI: Empirical Evidence on MLOps Practices, User Satisfaction, and Organizational Context
by: Pasch, Stefan
Published: (2025)

Towards Compute-Optimal Many-Shot In-Context Learning
by: Golchin, Shahriar, et al.
Published: (2025)

Representation Learning of Structured Data for Medical Foundation Models
by: Dwivedi, Vijay Prakash, et al.
Published: (2024)

AcquisitionSynthesis: Targeted Data Generation using Acquisition Functions
by: Agarwal, Ishika, et al.
Published: (2026)

RAG-Modulo: Solving Sequential Tasks using Experience, Critics, and Language Models
by: Jain, Abhinav, et al.
Published: (2024)

On the Emergence of Thinking in LLMs I: Searching for the Right Intuition
by: Ye, Guanghao, et al.
Published: (2025)

SAEs Are Good for Steering -- If You Select the Right Features
by: Arad, Dana, et al.
Published: (2025)

Perplexity Cannot Always Tell Right from Wrong
by: Veličković, Petar, et al.
Published: (2026)

Think Inside the JSON: Reinforcement Strategy for Strict LLM Schema Adherence
by: Agarwal, Bhavik, et al.
Published: (2025)

G-Loss: Graph-Guided Fine-Tuning of Language Models
by: Sharma, Aditya, et al.
Published: (2026)

A Characterization of List Language Identification in the Limit
by: Charikar, Moses, et al.
Published: (2025)