Bibliographic Details
Main Authors: Zeng, Yi, Zhao, Feifei, Wang, Yuwei, Lu, Enmeng, Yang, Yaodong, Wang, Lei, Liu, Chao, Liang, Yitao, Zhao, Dongcheng, Han, Bing, Tong, Haibo, Liang, Yao, Liang, Dongqi, Sun, Kang, Chen, Boyuan, Fan, Jinyu
Format: Preprint
Published: 2025
Subjects: Artificial Intelligence
Online Access: https://arxiv.org/abs/2504.17404
_version_ 1866911026976391168
contents As Artificial Intelligence (AI) advances toward Artificial General Intelligence (AGI) and eventually Artificial Superintelligence (ASI), it may surpass human control, deviate from human values, and, in extreme cases, lead to irreversible catastrophic consequences. This looming risk underscores the critical importance of the "superalignment" problem: ensuring that AI systems much smarter than humans remain aligned with human (compatible) intentions and values. While current scalable oversight and weak-to-strong generalization methods are applicable to some extent, they exhibit fundamental flaws in addressing the superalignment paradigm; notably, the unidirectional imposition of human values cannot accommodate superintelligence's autonomy or ensure stable learning by AGI/ASI. We contend that the values of a sustainable symbiotic society should be co-shaped by humans and living AI together, achieving "Super Co-alignment." Guided by this vision, we propose a concrete framework that integrates external oversight and intrinsic proactive alignment. External oversight superalignment should be grounded in human-centered ultimate decision-making, supplemented by interpretable automated evaluation and correction, to achieve continuous alignment with humanity's evolving values. Intrinsic proactive superalignment is rooted in a profound understanding of the Self, others, and society, integrating self-awareness, self-reflection, and empathy to spontaneously infer human intentions, distinguish good from evil, and proactively prioritize human well-being. The integration of externally driven oversight with intrinsically driven proactive alignment will co-shape symbiotic values and rules through iterative human-ASI co-alignment, paving the way for safe and beneficial AGI and ASI for good, for humanity, and for a symbiotic ecology.
format Preprint
id arxiv_https___arxiv_org_abs_2504_17404
institution arXiv
publishDate 2025
record_format arxiv
title Super Co-alignment of Human and AI for Sustainable Symbiotic Society
topic Artificial Intelligence
url https://arxiv.org/abs/2504.17404