Bibliographic Details
Main Authors: Cai, Yuzhu; Yin, Sheng; Wei, Yuxi; Xu, Chenxin; Mao, Weibo; Juefei-Xu, Felix; Chen, Siheng; Wang, Yanfeng
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition; Computation and Language; Machine Learning
Online Access: https://arxiv.org/abs/2404.12104
author Cai, Yuzhu
Yin, Sheng
Wei, Yuxi
Xu, Chenxin
Mao, Weibo
Juefei-Xu, Felix
Chen, Siheng
Wang, Yanfeng
contents The burgeoning landscape of text-to-image models, exemplified by innovations such as Midjourney and DALLE 3, has revolutionized content creation across diverse sectors. However, these advancements bring forth critical ethical concerns, particularly with the misuse of open-source models to generate content that violates societal norms. Addressing this, we introduce Ethical-Lens, a framework designed to facilitate the value-aligned usage of text-to-image tools without necessitating internal model revision. Ethical-Lens ensures value alignment in text-to-image models across toxicity and bias dimensions by refining user commands and rectifying model outputs. Systematic evaluation metrics, combining GPT4-V, HEIM, and FairFace scores, assess alignment capability. Our experiments reveal that Ethical-Lens enhances alignment capabilities to levels comparable with or superior to commercial models like DALLE 3, ensuring user-generated content adheres to ethical standards while maintaining image quality. This study indicates the potential of Ethical-Lens to ensure the sustainable development of open-source text-to-image tools and their beneficial integration into society. Our code is available at https://github.com/yuzhu-cai/Ethical-Lens.
format Preprint
id arxiv_https___arxiv_org_abs_2404_12104
institution arXiv
publishDate 2024
record_format arxiv
title Ethical-Lens: Curbing Malicious Usages of Open-Source Text-to-Image Models
topic Computer Vision and Pattern Recognition
Computation and Language
Machine Learning
url https://arxiv.org/abs/2404.12104