Library Catalog

Bibliographic Details
Main Authors:	Bai, Xiaoyan; Pres, Itamar; Deng, Yuntian; Tan, Chenhao; Shieber, Stuart; Viégas, Fernanda; Wattenberg, Martin; Lee, Andrew
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2510.00184

Similar Items

From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step
by: Deng, Yuntian, et al.
Published: (2024)

A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity
by: Lee, Andrew, et al.
Published: (2024)

Tensor Product Representation Probes Reveal Shared Structure Across Linear Directions
by: Lee, Andrew, et al.
Published: (2026)

Relational Composition in Neural Networks: A Survey and Call to Action
by: Wattenberg, Martin, et al.
Published: (2024)

What Does it Mean for a Neural Network to Learn a "World Model"?
by: Li, Kenneth, et al.
Published: (2025)

Dialogue Action Tokens: Steering Language Models in Goal-Directed Dialogue with a Multi-Turn Planner
by: Li, Kenneth, et al.
Published: (2024)

Decomposing Query-Key Feature Interactions Using Contrastive Covariances
by: Lee, Andrew, et al.
Published: (2026)

When Bad Data Leads to Good Models
by: Li, Kenneth, et al.
Published: (2025)

Shared Global and Local Geometry of Language Model Embeddings
by: Lee, Andrew, et al.
Published: (2025)

The Geometry of Self-Verification in a Task-Specific Reasoning Model
by: Lee, Andrew, et al.
Published: (2025)

Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
by: Li, Kenneth, et al.
Published: (2023)

Chronotome: Real-Time Topic Modeling for Streaming Embedding Spaces
by: Lim, Matte, et al.
Published: (2025)

Towards Reliable Evaluation of Behavior Steering Interventions in LLMs
by: Pres, Itamar, et al.
Published: (2024)

AbsenceBench: Language Models Can't Tell What's Missing
by: Fu, Harvey Yiyun, et al.
Published: (2025)

Competition Dynamics Shape Algorithmic Phases of In-Context Learning
by: Park, Core Francisco, et al.
Published: (2024)

Why AI Can't Simulate Extreme Decision-Making
by: Rosehill, Daniel, et al.
Published: (2026)

Why Can't I Ever Find Anything in the Library?
by: Radford, Neil, et al.
Published: (1983)

Why I Can't Create a Learning Center
by: Miller, Rosalind
Published: (1975)

A National Digital Library for Science, Mathematics, Engineering, and Technology Education.
by: Wattenberg, Frank
Published: (1998)

Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task
by: Li, Kenneth, et al.
Published: (2022)

"We are currently clean on OPSEC": Why JD Can't Encrypt
by: Chiodo, Maurice, et al.
Published: (2026)

Why the Center Can't Hold: A Diagnosis of Puritanized America
by: O’Neill, Tom
Published: (2019)

Why I Can't Read Wallace Stegner, and Other Essays
by: Cook-Lynn, Elizabeth
Published: (2025)

Son of Why Johnny Can't Read and What You Do About It, by Hugo Flesch, Son of Rudolf Flesch, Author of Son of Why Johnny Can't Read and...
by: Flesch, Hugo
Published: (1970)

Measuring and Controlling Instruction (In)Stability in Language Model Dialogs
by: Li, Kenneth, et al.
Published: (2024)

Concept Incongruence: An Exploration of Time and Death in Role Playing
by: Bai, Xiaoyan, et al.
Published: (2025)

Know Thyself? On the Incapability and Implications of AI Self-Recognition
by: Bai, Xiaoyan, et al.
Published: (2025)

Time Blindness: Why Video-Language Models Can't See What Humans Can?
by: Upadhyay, Ujjwal, et al.
Published: (2025)

Can’t Touch This
Published: (2024)

Story Ribbons: Reimagining Storyline Visualizations with Large Language Models
by: Yeh, Catherine, et al.
Published: (2025)

Why AI Harms Can't Be Fixed One Identity at a Time: What 5300 Incident Reports Reveal About Intersectionality
by: Bogucka, Edyta, et al.
Published: (2026)

They Can't Hear Us Does Not Mean We Can't Serve Them.
by: McDaniel, Julie Ann
Published: (1992)

Ep. 178: The Skywave Secret: Why Aviation Can't Quit HF Radio
by: Rosehill, Daniel, et al.
Published: (2026)

Why Neural Structural Obfuscation Can't Kill White-Box Watermarks for Good!
by: Jiang, Yanna, et al.
Published: (2026)

Why We Can't Afford to Turn Our Backs on Equity, Diversity and Inclusion
by: Phyllis Richards, et al.
Published: (2024)

On the Mirage of Long-Range Dependency, with an Application to Integer Multiplication
by: Wei, Zichao
Published: (2026)

Analysis of the Stellar Occultations During the Unprecedented Long-Duration Flare
by: Bicz, Kamil, et al.
Published: (2024)

Reverse Question Answering: Can an LLM Write a Question so Hard (or Bad) that it Can't Answer?
by: Balepur, Nishant, et al.
Published: (2024)

The Story is Not the Science: Execution-Grounded Evaluation of Mechanistic Interpretability Research
by: Bai, Xiaoyan, et al.
Published: (2026)

Ep. 1086: Why AI Can't Stop Talking About Second Order Effects
by: Rosehill, Daniel, et al.
Published: (2026)