| Main authors: | Dige, Omkar; Singh, Diljot; Yau, Tsz Fung; Zhang, Qixuan; Bolandraftar, Borna; Zhu, Xiaodan; Khattak, Faiza Khan |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online access: | https://arxiv.org/abs/2406.13551 |
Similar items
On The Role of Reasoning in the Identification of Subtle Stereotypes in Natural Language
by: Tian, Jacob-Junqi, et al.
Published: (2023)
Soft-prompt Tuning for Large Language Models to Evaluate Bias
by: Tian, Jacob-Junqi, et al.
Published: (2023)
Intent-Aware Self-Correction for Mitigating Social Biases in Large Language Models
by: Anantaprayoon, Panatchakorn, et al.
Published: (2025)
Bias Vector: Mitigating Biases in Language Models with Task Arithmetic Approach
by: Shirafuji, Daiki, et al.
Published: (2024)
Cognitive Biases in Large Language Models: A Survey and Mitigation Experiments
by: Sumita, Yasuaki, et al.
Published: (2024)
RePAIR: Interactive Machine Unlearning through Prompt-Aware Model Repair
by: Rachapudi, Jagadeesh, et al.
Published: (2026)
FairFlow: Mitigating Dataset Biases through Undecided Learning
by: Cheng, Jiali, et al.
Published: (2025)
Balancing Rigor and Utility: Mitigating Cognitive Biases in Large Language Models for Multiple-Choice Questions
by: Zhong, Hanyang, et al.
Published: (2024)
PersonaMatrix: A Recipe for Persona-Aware Evaluation of Legal Summarization
by: Pang, Tsz Fung, et al.
Published: (2025)
White Men Lead, Black Women Help? Benchmarking and Mitigating Language Agency Social Biases in LLMs
by: Wan, Yixin, et al.
Published: (2024)
Mitigating Biases for Instruction-following Language Models via Bias Neurons Elimination
by: Yang, Nakyeong, et al.
Published: (2023)
Hierarchical Federated Unlearning for Large Language Models
by: Zhong, Yisheng, et al.
Published: (2025)
Red-Teaming for Inducing Societal Bias in Large Language Models
by: Luo, Chu Fei, et al.
Published: (2024)
Deciphering the Impact of Pretraining Data on Large Language Models through Machine Unlearning
by: Zhao, Yang, et al.
Published: (2024)
A Neuro-inspired Interpretation of Unlearning in Large Language Models through Sample-level Unlearning Difficulty
by: Feng, Xiaohua, et al.
Published: (2025)
Self-Blinding and Counterfactual Self-Simulation Mitigate Biases and Sycophancy in Large Language Models
by: Christian, Brian, et al.
Published: (2026)
FairPy: A Toolkit for Evaluation of Prediction Biases and their Mitigation in Large Language Models
by: Viswanath, Hrishikesh, et al.
Published: (2023)
Identifying and Mitigating Social Bias Knowledge in Language Models
by: Chen, Ruizhe, et al.
Published: (2024)
Deep Contrastive Unlearning for Language Models
by: He, Estrid, et al.
Published: (2025)
Machine Unlearning in Large Language Models
by: Gundavarapu, Saaketh Koundinya, et al.
Published: (2024)
OFFSIDE: Benchmarking Unlearning Misinformation in Multimodal Large Language Models
by: Zheng, Hao, et al.
Published: (2025)
Fine-Tuning Language Models with Differential Privacy through Adaptive Noise Allocation
by: Li, Xianzhi, et al.
Published: (2024)
CURE: Controlled Unlearning for Robust Embeddings -- Mitigating Conceptual Shortcuts in Pre-Trained Language Models
by: Kocak, Aysenur, et al.
Published: (2025)
Exploring the Role of Reasoning Structures for Constructing Proofs in Multi-Step Natural Language Reasoning with Large Language Models
by: Zheng, Zi'ou, et al.
Published: (2024)
Promptception: How Sensitive Are Large Multimodal Models to Prompts?
by: Ismithdeen, Mohamed Insaf, et al.
Published: (2025)
Chat-TS: Enhancing Multi-Modal Reasoning Over Time-Series and Natural Language Data
by: Quinlan, Paul, et al.
Published: (2025)
Large Language Models are Biased Because They Are Large Language Models
by: Resnik, Philip
Published: (2024)
Towards Reasoning-Preserving Unlearning in Multimodal Large Language Models
by: Li, Hongji, et al.
Published: (2025)
Understanding the Dilemma of Unlearning for Large Language Models
by: Zhang, Qingjie, et al.
Published: (2025)
Learning and Unlearning of Fabricated Knowledge in Language Models
by: Sun, Chen, et al.
Published: (2024)
Invisible Influences: Investigating Implicit Intersectional Biases through Persona Engineering in Large Language Models
by: Arimanda, Nandini, et al.
Published: (2026)
Large Language Models Develop Novel Social Biases Through Adaptive Exploration
by: Wu, Addison J., et al.
Published: (2025)
Large Language Model Unlearning
by: Yao, Yuanshun, et al.
Published: (2023)
ASRU: Activation Steering Meets Reinforcement Unlearning for Multimodal Large Language Models
by: Guang, Jiahui, et al.
Published: (2026)
CodeUnlearn: Amortized Zero-Shot Machine Unlearning in Language Models Using Discrete Concept
by: Wu, YuXuan, et al.
Published: (2024)
Mitigating Memorization In Language Models
by: Sakarvadia, Mansi, et al.
Published: (2024)
ZeroUnlearn: Few-Shot Knowledge Unlearning in Large Language Models
by: Lin, Yujie, et al.
Published: (2026)
Learn while Unlearn: An Iterative Unlearning Framework for Generative Language Models
by: Tang, Haoyu, et al.
Published: (2024)
Chronicle: A Multimodal Foundation Model for Joint Language and Time Series Understanding
by: Quinlan, Paul, et al.
Published: (2026)
Unlearning Traces the Influential Training Data of Language Models
by: Isonuma, Masaru, et al.
Published: (2024)