| Main Authors: | Crothers, Evan; Viktor, Herna; Japkowicz, Nathalie |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2308.06795 |
Similar Items
Don't be Tricked by Iterative Masking
by: Crothers, Evan, et al.
Published: (2026)
Faithfulness Measurable Masked Language Models
by: Madsen, Andreas, et al.
Published: (2023)
Monitoring the evolution of antisemitic discourse on extremist social media using BERT
by: Mustafa, Raza Ul, et al.
Published: (2024)
E-CaTCH: Event-Centric Cross-Modal Attention with Temporal Consistency and Class-Imbalance Handling for Misinformation Detection
by: Mousavi, Ahmad, et al.
Published: (2025)
When Self-Belief Misleads: Active Label Acquisition for Reinforcement Learning with Verifiable Rewards
by: Wang, Li, et al.
Published: (2026)
Learning Shortcuts: On the Misleading Promise of NLU in Language Models
by: Bihani, Geetanjali, et al.
Published: (2024)
Walk the Talk? Measuring the Faithfulness of Large Language Model Explanations
by: Matton, Katie, et al.
Published: (2025)
Exploring the Potential of the Large Language Models (LLMs) in Identifying Misleading News Headlines
by: Rony, Md Main Uddin, et al.
Published: (2024)
Mapping Faithful Reasoning in Language Models
by: Li, Jiazheng, et al.
Published: (2025)
Have Faith in Faithfulness: Going Beyond Circuit Overlap When Finding Model Mechanisms
by: Hanna, Michael, et al.
Published: (2024)
FaithLM: Towards Faithful Explanations for Large Language Models
by: Chuang, Yu-Neng, et al.
Published: (2024)
MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs
by: Liu, Gabrielle Kaili-May, et al.
Published: (2025)
Faithful and Robust Local Interpretability for Textual Predictions
by: Lopardo, Gianluigi, et al.
Published: (2023)
Using LLMs to discover emerging coded antisemitic hate-speech in extremist social media
by: Kikkisetti, Dhanush, et al.
Published: (2024)
Quagmires in SFT-RL Post-Training: When High SFT Scores Mislead and What to Use Instead
by: Kang, Feiyang, et al.
Published: (2025)
Transformer Circuit Faithfulness Metrics are not Robust
by: Miller, Joseph, et al.
Published: (2024)
SoftHateBench: Evaluating Moderation Models Against Reasoning-Driven, Policy-Compliant Hostility
by: Su, Xuanyu, et al.
Published: (2026)
RePro: Training Language Models to Faithfully Recycle the Web for Pretraining
by: Yu, Zichun, et al.
Published: (2025)
Retrieval-Augmented and Knowledge-Grounded Language Models for Faithful Clinical Medicine
by: Liu, Fenglin, et al.
Published: (2022)
Representation Deficiency in Masked Language Modeling
by: Meng, Yu, et al.
Published: (2023)
FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows"
by: Ming, Yifei, et al.
Published: (2024)
Towards Faithful and Robust LLM Specialists for Evidence-Based Question-Answering
by: Schimanski, Tobias, et al.
Published: (2024)
Contextual Text Denoising with Masked Language Models
by: Sun, Yifu, et al.
Published: (2019)
Scaling Beyond Masked Diffusion Language Models
by: Sahoo, Subham Sekhar, et al.
Published: (2026)
Measuring Chain-of-Thought Monitorability Through Faithfulness and Verbosity
by: Meek, Austin, et al.
Published: (2025)
Cycles of Thought: Measuring LLM Confidence through Stable Explanations
by: Becker, Evan, et al.
Published: (2024)
Humanizing the Machine: Proxy Attacks to Mislead LLM Detectors
by: Wang, Tianchun, et al.
Published: (2024)
New Faithfulness-Centric Interpretability Paradigms for Natural Language Processing
by: Madsen, Andreas
Published: (2024)
Measuring What LLMs Think They Do: SHAP Faithfulness and Deployability on Financial Tabular Classification
by: AlMarri, Saeed, et al.
Published: (2025)
Faithfulness as Information Flow: Evaluating and Training Faithful Chain-of-Thought Reasoning
by: Jia, Jinghan, et al.
Published: (2026)
Dynamic Attention-Guided Context Decoding for Mitigating Context Faithfulness Hallucinations in Large Language Models
by: Huang, Yanwen, et al.
Published: (2025)
ContextFocus: Activation Steering for Contextual Faithfulness in Large Language Models
by: Anand, Nikhil, et al.
Published: (2026)
Towards Probabilistically-Sound Beam Search with Masked Language Models
by: Brooks, Creston, et al.
Published: (2024)
Diffusion-State Policy Optimization for Masked Diffusion Language Models
by: Oba, Daisuke, et al.
Published: (2026)
DOS: Dependency-Oriented Sampler for Masked Diffusion Language Models
by: Zhou, Xueyu, et al.
Published: (2026)
Reconsidering Positional Supervision in Masked Diffusion Language Model Training
by: Ye, Mengyu, et al.
Published: (2026)
Self-Taught Self-Correction for Small Language Models
by: Moskvoretskii, Viktor, et al.
Published: (2025)
Soft-Masked Diffusion Language Models
by: Hersche, Michael, et al.
Published: (2025)
Measuring Faithfulness Depends on How You Measure: Classifier Sensitivity in LLM Chain-of-Thought Evaluation
by: Young, Richard J.
Published: (2026)
Masked Mixers for Language Generation and Retrieval
by: Badger, Benjamin L.
Published: (2024)