:: Library Catalog

Bibliographic Details
Main Authors:	Crothers, Evan; Viktor, Herna; Japkowicz, Nathalie
Format:	Preprint
Published:	2023
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2308.06795

Similar Items

Don't be Tricked by Iterative Masking
by: Crothers, Evan, et al.
Published: (2026)

Faithfulness Measurable Masked Language Models
by: Madsen, Andreas, et al.
Published: (2023)

Monitoring the evolution of antisemitic discourse on extremist social media using BERT
by: Mustafa, Raza Ul, et al.
Published: (2024)

E-CaTCH: Event-Centric Cross-Modal Attention with Temporal Consistency and Class-Imbalance Handling for Misinformation Detection
by: Mousavi, Ahmad, et al.
Published: (2025)

When Self-Belief Misleads: Active Label Acquisition for Reinforcement Learning with Verifiable Rewards
by: Wang, Li, et al.
Published: (2026)

Learning Shortcuts: On the Misleading Promise of NLU in Language Models
by: Bihani, Geetanjali, et al.
Published: (2024)

Walk the Talk? Measuring the Faithfulness of Large Language Model Explanations
by: Matton, Katie, et al.
Published: (2025)

Exploring the Potential of the Large Language Models (LLMs) in Identifying Misleading News Headlines
by: Rony, Md Main Uddin, et al.
Published: (2024)

Mapping Faithful Reasoning in Language Models
by: Li, Jiazheng, et al.
Published: (2025)

Have Faith in Faithfulness: Going Beyond Circuit Overlap When Finding Model Mechanisms
by: Hanna, Michael, et al.
Published: (2024)

FaithLM: Towards Faithful Explanations for Large Language Models
by: Chuang, Yu-Neng, et al.
Published: (2024)

MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs
by: Liu, Gabrielle Kaili-May, et al.
Published: (2025)

Faithful and Robust Local Interpretability for Textual Predictions
by: Lopardo, Gianluigi, et al.
Published: (2023)

Using LLMs to discover emerging coded antisemitic hate-speech in extremist social media
by: Kikkisetti, Dhanush, et al.
Published: (2024)

Quagmires in SFT-RL Post-Training: When High SFT Scores Mislead and What to Use Instead
by: Kang, Feiyang, et al.
Published: (2025)

Transformer Circuit Faithfulness Metrics are not Robust
by: Miller, Joseph, et al.
Published: (2024)

SoftHateBench: Evaluating Moderation Models Against Reasoning-Driven, Policy-Compliant Hostility
by: Su, Xuanyu, et al.
Published: (2026)

RePro: Training Language Models to Faithfully Recycle the Web for Pretraining
by: Yu, Zichun, et al.
Published: (2025)

Retrieval-Augmented and Knowledge-Grounded Language Models for Faithful Clinical Medicine
by: Liu, Fenglin, et al.
Published: (2022)

Representation Deficiency in Masked Language Modeling
by: Meng, Yu, et al.
Published: (2023)

FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows"
by: Ming, Yifei, et al.
Published: (2024)

Towards Faithful and Robust LLM Specialists for Evidence-Based Question-Answering
by: Schimanski, Tobias, et al.
Published: (2024)

Contextual Text Denoising with Masked Language Models
by: Sun, Yifu, et al.
Published: (2019)

Scaling Beyond Masked Diffusion Language Models
by: Sahoo, Subham Sekhar, et al.
Published: (2026)

Measuring Chain-of-Thought Monitorability Through Faithfulness and Verbosity
by: Meek, Austin, et al.
Published: (2025)

Cycles of Thought: Measuring LLM Confidence through Stable Explanations
by: Becker, Evan, et al.
Published: (2024)

Humanizing the Machine: Proxy Attacks to Mislead LLM Detectors
by: Wang, Tianchun, et al.
Published: (2024)

New Faithfulness-Centric Interpretability Paradigms for Natural Language Processing
by: Madsen, Andreas
Published: (2024)

Measuring What LLMs Think They Do: SHAP Faithfulness and Deployability on Financial Tabular Classification
by: AlMarri, Saeed, et al.
Published: (2025)

Faithfulness as Information Flow: Evaluating and Training Faithful Chain-of-Thought Reasoning
by: Jia, Jinghan, et al.
Published: (2026)

Dynamic Attention-Guided Context Decoding for Mitigating Context Faithfulness Hallucinations in Large Language Models
by: Huang, Yanwen, et al.
Published: (2025)

ContextFocus: Activation Steering for Contextual Faithfulness in Large Language Models
by: Anand, Nikhil, et al.
Published: (2026)

Towards Probabilistically-Sound Beam Search with Masked Language Models
by: Brooks, Creston, et al.
Published: (2024)

Diffusion-State Policy Optimization for Masked Diffusion Language Models
by: Oba, Daisuke, et al.
Published: (2026)

DOS: Dependency-Oriented Sampler for Masked Diffusion Language Models
by: Zhou, Xueyu, et al.
Published: (2026)

Reconsidering Positional Supervision in Masked Diffusion Language Model Training
by: Ye, Mengyu, et al.
Published: (2026)

Self-Taught Self-Correction for Small Language Models
by: Moskvoretskii, Viktor, et al.
Published: (2025)

Soft-Masked Diffusion Language Models
by: Hersche, Michael, et al.
Published: (2025)

Measuring Faithfulness Depends on How You Measure: Classifier Sensitivity in LLM Chain-of-Thought Evaluation
by: Young, Richard J.
Published: (2026)

Masked Mixers for Language Generation and Retrieval
by: Badger, Benjamin L.
Published: (2024)