Library Catalog

Bibliographic Details
Main Authors:	Oh, Nathaniel; Attie, Paul
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2603.26829

Similar Items

First Hallucination Tokens Are Different from Conditional Ones
by: Snel, Jakob, et al.
Published: (2025)

Predicting LLM Correctness in Prosthodontics Using Metadata and Hallucination Signals
by: Susanto, Lucky, et al.
Published: (2025)

Reasoning Introduces New Poisoning Attacks Yet Makes Them More Complicated
by: Foerster, Hanna, et al.
Published: (2025)

Sea-cret Agents: Maritime Abduction for Region Generation to Expose Dark Vessel Trajectories
by: Bavikadi, Divyagna, et al.
Published: (2025)

Making AI-Assisted Grant Evaluation Auditable without Exposing the Model
by: Bicakci, Kemal
Published: (2026)

One Filter to Deploy Them All: Robust Safety for Quadrupedal Navigation in Unknown Environments
by: Lin, Albert, et al.
Published: (2024)

Fantastic Pretraining Optimizers and Where to Find Them
by: Wen, Kaiyue, et al.
Published: (2025)

Low Rank Gradients and Where to Find Them
by: Sonthalia, Rishi, et al.
Published: (2025)

Computational Safety for Generative AI: A Signal Processing Perspective
by: Chen, Pin-Yu
Published: (2025)

GNN Explanations that do not Explain and How to find Them
by: Azzolin, Steve, et al.
Published: (2026)

What Cohort INRs Encode and Where to Freeze Them
by: Sideri-Lampretsa, Vasiliki, et al.
Published: (2026)

Hidden Error Awareness in Chain-of-Thought Reasoning: The Signal Is Diagnostic, Not Causal
by: Yuan, Aojie, et al.
Published: (2026)

Explaining Predictive Uncertainty by Exposing Second-Order Effects
by: Bley, Florian, et al.
Published: (2024)

Exposing LLM Safety Gaps Through Mathematical Encoding: New Attacks and Systematic Analysis
by: Zhang, Haoyu, et al.
Published: (2026)

How to Square Tensor Networks and Circuits Without Squaring Them
by: Loconte, Lorenzo, et al.
Published: (2025)

Transcendence: Generative Models Can Outperform The Experts That Train Them
by: Zhang, Edwin, et al.
Published: (2024)

Calibrated Language Models and How to Find Them with Label Smoothing
by: Huang, Jerry, et al.
Published: (2025)

REBEL: Hidden Knowledge Recovery via Evolutionary-Based Evaluation Loop
by: Rybak, Patryk, et al.
Published: (2026)

When Privacy Isn't Synthetic: Hidden Data Leakage in Generative AI Models
by: Mustaqim, S. M., et al.
Published: (2025)

Sparsest Models Elude Pruning: An Exposé of Pruning's Current Capabilities
by: Zhang, Stephen, et al.
Published: (2024)

Safety Tax: Safety Alignment Makes Your Large Reasoning Models Less Reasonable
by: Huang, Tiansheng, et al.
Published: (2025)

Trained Models Tell Us How to Make Them Robust to Spurious Correlation without Group Annotation
by: Ghaznavi, Mahdi, et al.
Published: (2024)

Safety Alignment Can Be Not Superficial With Explicit Safety Signals
by: Li, Jianwei, et al.
Published: (2025)

Translating Subgraphs to Nodes Makes Simple GNNs Strong and Efficient for Subgraph Representation Learning
by: Kim, Dongkwan, et al.
Published: (2022)

When Stability Fails: Hidden Failure Modes Of LLMS in Data-Constrained Scientific Decision-Making
by: Riasat, Nazia
Published: (2026)

One Wave To Explain Them All: A Unifying Perspective On Feature Attribution
by: Kasmi, Gabriel, et al.
Published: (2024)

Conformal Validity Guarantees Exist for Any Data Distribution (and How to Find Them)
by: Prinster, Drew, et al.
Published: (2024)

Beneficial Reasoning Behaviors in Agentic Search and Effective Post-training to Obtain Them
by: Jin, Jiahe, et al.
Published: (2025)

Is Best-of-N the Best of Them? Coverage, Scaling, and Optimality in Inference-Time Alignment
by: Huang, Audrey, et al.
Published: (2025)

CHILL at SemEval-2025 Task 2: You Can't Just Throw Entities and Hope -- Make Your LLM to Get Them Right
by: Lee, Jaebok, et al.
Published: (2025)

ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models
by: Oh, Jio, et al.
Published: (2024)

Exposing propaganda: an analysis of stylistic cues comparing human annotations and machine classification
by: Faye, Géraud, et al.
Published: (2024)

The Hidden Life of Tokens: Reducing Hallucination of Large Vision-Language Models via Visual Information Steering
by: Li, Zhuowei, et al.
Published: (2025)

What LLMs Think When You Don't Tell Them What to Think About?
by: Kwon, Yongchan, et al.
Published: (2026)

On the Overlooked Pitfalls of Weight Decay and How to Mitigate Them: A Gradient-Norm Perspective
by: Xie, Zeke, et al.
Published: (2020)

Toward a Metrology for Artificial Intelligence: Hidden-Rule Environments and Reinforcement Learning
by: Mathew, Christo, et al.
Published: (2025)

Learning with Hidden Factorial Structure
by: Arnal, Charles, et al.
Published: (2024)

The Illusion of Reasoning: Exposing Evasive Data Contamination in LLMs via Zero-CoT Truncation
by: Lan, Yifan, et al.
Published: (2026)

StyleShield: Exposing the Fragility of AIGC Detectors through Continuous Controllable Style Transfer
by: Zheng, Guantian
Published: (2026)

The Hidden Signal of Verifier Strictness: Controlling and Improving Step-Wise Verification via Selective Latent Steering
by: Zhou, Yefan, et al.
Published: (2026)