Staff View: Super Co-alignment of Human and AI for Sustainable Symbiotic Society :: Library Catalog

Bibliographic Details
Main Authors:	Zeng, Yi; Zhao, Feifei; Wang, Yuwei; Lu, Enmeng; Yang, Yaodong; Wang, Lei; Liu, Chao; Liang, Yitao; Zhao, Dongcheng; Han, Bing; Tong, Haibo; Liang, Yao; Liang, Dongqi; Sun, Kang; Chen, Boyuan; Fan, Jinyu
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2504.17404

_version_	1866911026976391168
author	Zeng, Yi; Zhao, Feifei; Wang, Yuwei; Lu, Enmeng; Yang, Yaodong; Wang, Lei; Liu, Chao; Liang, Yitao; Zhao, Dongcheng; Han, Bing; Tong, Haibo; Liang, Yao; Liang, Dongqi; Sun, Kang; Chen, Boyuan; Fan, Jinyu
author_facet	Zeng, Yi; Zhao, Feifei; Wang, Yuwei; Lu, Enmeng; Yang, Yaodong; Wang, Lei; Liu, Chao; Liang, Yitao; Zhao, Dongcheng; Han, Bing; Tong, Haibo; Liang, Yao; Liang, Dongqi; Sun, Kang; Chen, Boyuan; Fan, Jinyu
contents	As Artificial Intelligence (AI) advances toward Artificial General Intelligence (AGI) and eventually Artificial Superintelligence (ASI), it may surpass human control, deviate from human values, and even lead to irreversible catastrophic consequences in extreme cases. This looming risk underscores the critical importance of the "superalignment" problem - ensuring that AI systems that are much smarter than humans remain aligned with human (compatible) intentions and values. While current scalable oversight and weak-to-strong generalization methods demonstrate certain applicability, they exhibit fundamental flaws in addressing the superalignment paradigm - notably, the unidirectional imposition of human values cannot accommodate superintelligence's autonomy or ensure AGI/ASI's stable learning. We contend that the values for a sustainable symbiotic society should be co-shaped by humans and living AI together, achieving "Super Co-alignment." Guided by this vision, we propose a concrete framework that integrates external oversight and intrinsic proactive alignment. External oversight superalignment should be grounded in human-centered ultimate decision, supplemented by interpretable automated evaluation and correction, to achieve continuous alignment with humanity's evolving values. Intrinsic proactive superalignment is rooted in a profound understanding of the Self, others, and society, integrating self-awareness, self-reflection, and empathy to spontaneously infer human intentions, distinguish good from evil, and proactively prioritize human well-being. The integration of externally-driven oversight with intrinsically-driven proactive alignment will co-shape symbiotic values and rules through iterative human-ASI co-alignment, paving the way for achieving safe and beneficial AGI and ASI for good, for humans, and for a symbiotic ecology.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_17404
institution	arXiv
publishDate	2025
record_format	arxiv
title	Super Co-alignment of Human and AI for Sustainable Symbiotic Society
topic	Artificial Intelligence
url	https://arxiv.org/abs/2504.17404

Similar Items