Bibliographic Details
Main Authors: Sreelatha, Silpa Vadakkeeveetil; Wang, Dan; Belongie, Serge; Awais, Muhammad; Dutta, Anjan
Format: Preprint
Published: 2026
Subjects: Computer Vision and Pattern Recognition; Machine Learning
Online Access: https://arxiv.org/abs/2602.06806
author Sreelatha, Silpa Vadakkeeveetil
Wang, Dan
Belongie, Serge
Awais, Muhammad
Dutta, Anjan
contents Text-to-image diffusion models achieve impressive generation quality but inherit and amplify training-data biases, skewing coverage of semantic attributes. Prior work addresses this in two ways. Closed-set approaches mitigate biases in predefined fairness categories (e.g., gender, race), assuming socially salient minority attributes are known a priori. Open-set approaches frame the task as bias identification, highlighting majority attributes that dominate outputs. Both overlook a complementary task: uncovering rare or minority features underrepresented in the data distribution (social, cultural, or stylistic) yet still encoded in model representations. We introduce RAIGen, the first framework, to our knowledge, for label-free rare-attribute discovery in diffusion models, requiring no predefined minority categories. RAIGen leverages Matryoshka Sparse Autoencoders and a novel minority metric combining neuron activation frequency with semantic distinctiveness to identify interpretable neurons whose top-activating images reveal underrepresented attributes. Experiments show RAIGen discovers attributes beyond fixed fairness categories in Stable Diffusion, scales to larger models such as SDXL, supports systematic auditing across architectures, and enables targeted amplification of rare attributes during generation. The project page is available at https://vssilpa.github.io/RAIGen_webpage/ .
format Preprint
id arxiv_https___arxiv_org_abs_2602_06806
institution arXiv
publishDate 2026
record_format arxiv
title RAIGen: Rare Attribute Identification in Text-to-Image Generative Models
topic Computer Vision and Pattern Recognition
Machine Learning
url https://arxiv.org/abs/2602.06806