:: Library Catalog

Bibliographic Details
Main Authors:	Hergert, Lea; Berend, Gábor; Szegedy, Mario; Turan, Gyorgy; Jelasity, Márk
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2511.12728

Similar Items

Weird Generalization is Weirdly Brittle
by: Wanner, Miriam, et al.
Published: (2026)

LLMs Show Surface-Form Brittleness Under Paraphrase Stress Tests
by: Carranza, Juan Miguel Navarro
Published: (2025)

Membership Inference on LLMs in the Wild
by: Yi, Jiatong, et al.
Published: (2026)

LLM Knowledge is Brittle: Truthfulness Representations Rely on Superficial Resemblance
by: Haller, Patrick, et al.
Published: (2025)

Are Humans as Brittle as Large Language Models?
by: Li, Jiahui, et al.
Published: (2025)

Your Agent is More Brittle Than You Think: Uncovering Indirect Injection Vulnerabilities in Agentic LLMs
by: Zhu, Wenhui, et al.
Published: (2026)

Systematic Diagnosis of Brittle Reasoning in Large Language Models
by: Parupudi, V. S. Raghu
Published: (2025)

A Game for Counting Logic Formula Size and an Application to Linear Orders
by: Fournier, Gregoire, et al.
Published: (2025)

Racka: Efficient Hungarian LLM Adaptation on Academic Infrastructure
by: Csibi, Zsolt, et al.
Published: (2026)

Fast-MIA: Efficient and Scalable Membership Inference for LLMs
by: Takahashi, Hiromu, et al.
Published: (2025)

Probability of Differentiation Reveals Brittleness of Homogeneity Bias in GPT-4
by: Lee, Messi H. J., et al.
Published: (2024)

Generalization or Memorization? Brittleness Testing for Chess-Trained Language Models
by: Tang, Ethan
Published: (2026)

Measuring Bias or Measuring the Task: Understanding the Brittle Nature of LLM Gender Biases
by: Gao, Bufan, et al.
Published: (2025)

Evaluating the Adversarial Robustness of Semantic Segmentation: Trying Harder Pays Off
by: Halmosi, Levente, et al.
Published: (2024)

Tab-MIA: A Benchmark Dataset for Membership Inference Attacks on Tabular Data in LLMs
by: German, Eyal, et al.
Published: (2025)

Detecting Semantic Backdoors in a Mystery Shopping Scenario
by: Berta, Arpad, et al.
Published: (2026)

Brittleness and Promise: Knowledge Graph Based Reward Modeling for Diagnostic Reasoning
by: Khatwani, Saksham, et al.
Published: (2025)

On the Brittle Foundations of ReAct Prompting for Agentic Large Language Models
by: Verma, Mudit, et al.
Published: (2024)

Brittle Minds, Fixable Activations: Understanding Belief Representations in Language Models
by: Bortoletto, Matteo, et al.
Published: (2024)

From Patient Consultations to Graphs: Leveraging LLMs for Patient Journey Knowledge Graph Construction
by: Khatib, Hassan S. Al, et al.
Published: (2025)

Connecting Quantum Computing with Classical Stochastic Simulation
by: Blanchet, Jose, et al.
Published: (2025)

SoK: Membership Inference Attacks on LLMs are Rushing Nowhere (and How to Fix It)
by: Meeus, Matthieu, et al.
Published: (2024)

Is My Text in Your AI Model? Gradient-based Membership Inference Test applied to LLMs
by: Mancera, Gonzalo, et al.
Published: (2025)

Frontier LLMs Still Struggle with Simple Reasoning Tasks
by: Malek, Alan, et al.
Published: (2025)

Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
by: Wei, Boyi, et al.
Published: (2024)

Whose Journey Matters? Investigating Identity Biases in Large Language Models (LLMs) for Travel Planning Assistance
by: Ren, Ruiping, et al.
Published: (2024)

A GLR-like Parsing Algorithm for Three-Valued Interpretations of Boolean Grammars with Strong Negation
by: Adrián, Patrik, et al.
Published: (2024)

How Brittle is Agent Safety? Rethinking Agent Risk under Intent Concealment and Task Complexity
by: Ma, Zihan, et al.
Published: (2025)

Unknown Unknowns: Why Hidden Intentions in LLMs Evade Detection
by: Srivastav, Devansh, et al.
Published: (2026)

Machine Text Detectors are Membership Inference Attacks
by: Koike, Ryuto, et al.
Published: (2025)

DocMIA: Document-Level Membership Inference Attacks against DocVQA Models
by: Nguyen, Khanh, et al.
Published: (2025)

SmoothRot: Combining Channel-Wise Scaling and Rotation for Quantization-Friendly LLMs
by: Czakó, Patrik, et al.
Published: (2025)

LLMs for Argument Mining: Detection, Extraction, and Relationship Classification of pre-defined Arguments in Online Comments
by: Guida, Matteo, et al.
Published: (2025)

Comparing Moral Values in Western English-speaking societies and LLMs with Word Associations
by: Xiang, Chaoyi, et al.
Published: (2025)

Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization
by: Wang, Boshi, et al.
Published: (2024)

Sampling-based Pseudo-Likelihood for Membership Inference Attacks
by: Kaneko, Masahiro, et al.
Published: (2024)

Verification of the Implicit World Model in a Generative Model via Adversarial Sequences
by: Balogh, András, et al.
Published: (2026)

How not to Stitch Representations to Measure Similarity: Task Loss Matching versus Direct Matching
by: Balogh, András, et al.
Published: (2024)

Leveraging Cross-Lingual Transfer Learning in Spoken Named Entity Recognition Systems
by: Benaicha, Moncef, et al.
Published: (2023)

Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with Autoformalization
by: Zhou, Jin Peng, et al.
Published: (2024)