:: Library Catalog

Bibliographic Details
Main Authors:	Thompson, T. Ben; Straznickas, Zygimantas; Sklar, Michael
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2402.01702

Similar Items

FLRT: Fluent Student-Teacher Redteaming
by: Thompson, T. Ben, et al.
Published: (2024)

Fluent but Unfeeling: The Emotional Blind Spots of Language Models
by: Shu, Bangzhao, et al.
Published: (2025)

Fluent Alignment with Disfluent Judges: Post-training for Lower-resource Languages
by: Samuel, David, et al.
Published: (2025)

Algorithmic progress in language models
by: Ho, Anson, et al.
Published: (2024)

LIME-LLM: Probing Models with Fluent Counterfactuals, Not Broken Text
by: Mihaila, George, et al.
Published: (2026)

Fluent but Foreign: Even Regional LLMs Lack Cultural Alignment
by: Agarwal, Dhruv, et al.
Published: (2025)

HyperSteer: Activation Steering at Scale with Hypernetworks
by: Sun, Jiuding, et al.
Published: (2025)

Dissociating language and thought in large language models
by: Mahowald, Kyle, et al.
Published: (2023)

Auxiliary task demands mask the capabilities of smaller language models
by: Hu, Jennifer, et al.
Published: (2024)

HyperDAS: Towards Automating Mechanistic Interpretability with Hypernetworks
by: Sun, Jiuding, et al.
Published: (2025)

On the attribution of confidence to large language models
by: Keeling, Geoff, et al.
Published: (2024)

Large language models and linguistic intentionality
by: Grindrod, Jumbly
Published: (2024)

Do language models practice what they preach? Examining language ideologies about gendered language reform encoded in LLMs
by: Watson, Julia, et al.
Published: (2024)

A survey of textual cyber abuse detection using cutting-edge language models and large language models
by: Diaz-Garcia, Jose A., et al.
Published: (2025)

Bringing legal knowledge to the public by constructing a legal question bank using large-scale pre-trained language model
by: Yuan, Mingruo, et al.
Published: (2025)

Humans overrely on overconfident language models, across languages
by: Rathi, Neil, et al.
Published: (2025)

Linguistic traces of stochastic empathy in language models
by: Kleinberg, Bennett, et al.
Published: (2024)

Infusing clinical knowledge into tokenisers for language models
by: Hasan, Abul, et al.
Published: (2024)

CONFLARE: CONFormal LArge language model REtrieval
by: Rouzrokh, Pouria, et al.
Published: (2024)

What do language models model? Transformers, automata, and the format of thought
by: Klein, Colin
Published: (2025)

The language of time: a language model perspective on time-series foundation models
by: Xie, Yi, et al.
Published: (2025)

Uncovering inequalities in new knowledge learning by large language models across different languages
by: Wang, Chenglong, et al.
Published: (2025)

Multi-round jailbreak attack on large language models
by: Zhou, Yihua, et al.
Published: (2024)

The 20 questions game to distinguish large language models
by: Richardeau, Gurvan, et al.
Published: (2024)

Large language model empowered participatory urban planning
by: Zhou, Zhilun, et al.
Published: (2024)

Evidence of a log scaling law for political persuasion with large language models
by: Hackenburg, Kobi, et al.
Published: (2024)

SLM-MUX: Orchestrating small language models for reasoning
by: Wang, Chenyu, et al.
Published: (2025)

Quantifying non deterministic drift in large language models
by: Nicholson, Claire
Published: (2026)

Can large language models build causal graphs?
by: Long, Stephanie, et al.
Published: (2023)

Response: Emergent analogical reasoning in large language models
by: Hodel, Damian, et al.
Published: (2023)

Large language models struggle with ethnographic text annotation
by: Goodall, Leonardo S., et al.
Published: (2026)

Scaling laws for language encoding models in fMRI
by: Antonello, Richard, et al.
Published: (2023)

Just-in-time and distributed task representations in language models
by: Li, Yuxuan, et al.
Published: (2025)

Alignment faking in large language models
by: Greenblatt, Ryan, et al.
Published: (2024)

Cognitive models can reveal interpretable value trade-offs in language models
by: Murthy, Sonia K., et al.
Published: (2025)

Code-enabled language models can outperform reasoning models on diverse tasks
by: Zhang, Cedegao E., et al.
Published: (2025)

Retrieval-augmented reasoning with lean language models
by: Chan, Ryan Sze-Yin, et al.
Published: (2025)

Do Chinese models speak Chinese languages?
by: Wen-Yi, Andrea W, et al.
Published: (2025)

Failure of contextual invariance in large language models
by: Kumar, Sagar, et al.
Published: (2026)

Large language models in medicine: the potentials and pitfalls
by: Omiye, Jesutofunmi A., et al.
Published: (2023)