| Main Authors: | Drechsel, Jonathan, Herbold, Steffen |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.01406 |
Similar Items
The GRADIEND Python Package: An End-to-End System for Gradient-Based Feature Learning
by: Drechsel, Jonathan, et al.
Published: (2026)
Understanding or Memorizing? A Case Study of German Definite Articles in Language Models
by: Drechsel, Jonathan, et al.
Published: (2026)
MAMUT: A Novel Framework for Modifying Mathematical Formulas for the Generation of Specialized Datasets for Language Model Training
by: Drechsel, Jonathan, et al.
Published: (2025)
Large Language Models can impersonate politicians and other public figures
by: Herbold, Steffen, et al.
Published: (2024)
SortBench: Benchmarking LLMs based on their ability to sort lists
by: Herbold, Steffen
Published: (2025)
Semantic similarity prediction is better than other semantic similarity measures
by: Herbold, Steffen
Published: (2023)
FairFlow: Mitigating Dataset Biases through Undecided Learning
by: Cheng, Jiali, et al.
Published: (2025)
A Formal Framework for Uncertainty Analysis of Text Generation with Large Language Models
by: Herbold, Steffen, et al.
Published: (2026)
On the Hidden Objective Biases of Group-based Reinforcement Learning
by: Fontana, Aleksandar, et al.
Published: (2026)
BiasGym: A Simple and Generalizable Framework for Analyzing and Removing Biases through Elicitation
by: Islam, Sekh Mainul, et al.
Published: (2025)
Do Large Language Models Show Biases in Causal Learning?
by: Carro, Maria Victoria, et al.
Published: (2024)
Inductive Biases for Zero-shot Systematic Generalization in Language-informed Reinforcement Learning
by: Dijujin, Negin Hashemi, et al.
Published: (2025)
Demo: Statistically Significant Results On Biases and Errors of LLMs Do Not Guarantee Generalizable Results
by: Liu, Jonathan, et al.
Published: (2025)
Superscopes: Amplifying Internal Feature Representations for Language Model Interpretation
by: Jacobi, Jonathan, et al.
Published: (2025)
Relative Value Biases in Large Language Models
by: Hayes, William M., et al.
Published: (2024)
Large Language Models are Biased Reinforcement Learners
by: Hayes, William M., et al.
Published: (2024)
Empowering Many, Biasing a Few: Generalist Credit Scoring through Large Language Models
by: Feng, Duanyu, et al.
Published: (2023)
Alignment Tampering: How Reinforcement Learning from Human Feedback Is Exploited to Optimize Misaligned Biases
by: Hahm, Dongyoon, et al.
Published: (2026)
Agentifying Patient Dynamics within LLMs through Interacting with Clinical World Model
by: Wu, Minghao, et al.
Published: (2026)
Exploiting Synergistic Cognitive Biases to Bypass Safety in LLMs
by: Yang, Xikang, et al.
Published: (2025)
Self-Speculative Biased Decoding for Faster Re-Translation
by: Zeng, Linxiao, et al.
Published: (2025)
Text Injection for Neural Contextual Biasing
by: Meng, Zhong, et al.
Published: (2024)
Large Language Models are Geographically Biased
by: Manvi, Rohin, et al.
Published: (2024)
Heuristics and Biases in AI Decision-Making: Implications for Responsible AGI
by: Saeedi, Payam, et al.
Published: (2024)
Catalytic Role Of Noise And Necessity Of Inductive Biases In The Emergence Of Compositional Communication
by: Kuciński, Łukasz, et al.
Published: (2021)
Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases
by: Hu, Michael Y., et al.
Published: (2025)
Mitigating Biases for Instruction-following Language Models via Bias Neurons Elimination
by: Yang, Nakyeong, et al.
Published: (2023)
BiasJailbreak:Analyzing Ethical Biases and Jailbreak Vulnerabilities in Large Language Models
by: Lee, Isack, et al.
Published: (2024)
Long-Short Distance Graph Neural Networks and Improved Curriculum Learning for Emotion Recognition in Conversation
by: Li, Xinran, et al.
Published: (2025)
Reward Models Inherit Value Biases from Pretraining
by: Christian, Brian, et al.
Published: (2026)
Planted in Pretraining, Swayed by Finetuning: A Case Study on the Origins of Cognitive Biases in LLMs
by: Itzhak, Itay, et al.
Published: (2025)
FairPy: A Toolkit for Evaluation of Prediction Biases and their Mitigation in Large Language Models
by: Viswanath, Hrishikesh, et al.
Published: (2023)
Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners?
by: Opedal, Andreas, et al.
Published: (2024)
Switchable Decision: Dynamic Neural Generation Networks
by: Zhang, Shujian, et al.
Published: (2024)
Can Small-Scale Data Poisoning Exacerbate Dialect-Linked Biases in Large Language Models?
by: Abbas, Chaymaa, et al.
Published: (2025)
Laissez-Faire Harms: Algorithmic Biases in Generative Language Models
by: Shieh, Evan, et al.
Published: (2024)
Enhancing Rare Codes via Probability-Biased Directed Graph Attention for Long-Tail ICD Coding
by: Chen, Tianlei, et al.
Published: (2025)
Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination
by: Huang, Jerry, et al.
Published: (2024)
Interactive Training: Feedback-Driven Neural Network Optimization
by: Zhang, Wentao, et al.
Published: (2025)
Empirical Capacity Model for Self-Attention Neural Networks
by: Härmä, Aki, et al.
Published: (2024)