OpenAI, the artificial intelligence research company, is launching a new team focused on developing methods to steer and control “superintelligent” AI systems. The team will be co-led by Ilya Sutskever, OpenAI’s chief scientist and co-founder, and Jan Leike, who heads the company’s alignment team. The move follows the pair’s prediction that AI systems with intelligence surpassing that of humans could emerge within the next decade.
In a blog post announcing the effort, Sutskever and Leike argue that researchers need ways to control and restrict potentially superintelligent AI systems, since such systems will not necessarily be benevolent. Current alignment techniques, such as reinforcement learning from human feedback (RLHF), depend on human supervision, which becomes unreliable once AI systems are significantly smarter than the humans overseeing them.
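To make the supervision bottleneck concrete, here is a toy sketch of the reward-modeling step at the heart of RLHF: a model is fitted to pairwise human preferences so it can score outputs on humans' behalf. This is purely illustrative, not OpenAI's implementation; real systems train large neural networks over text, whereas this sketch uses made-up two-dimensional feature vectors and a linear reward under the Bradley-Terry preference model.

```python
import math

# Hypothetical "responses", each reduced to a small feature vector
# (e.g. imagined helpfulness and verbosity scores).
responses = {
    "a": [1.0, 0.2],
    "b": [0.3, 0.9],
    "c": [0.8, 0.5],
}

# Human preference labels as (preferred, rejected) pairs.
preferences = [("a", "b"), ("a", "c"), ("c", "b")]

def reward(w, x):
    """Linear reward: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))

def train(pairs, feats, lr=0.5, steps=200):
    """Fit the reward by gradient ascent on the Bradley-Terry
    log-likelihood: P(x beats y) = sigmoid(r(x) - r(y))."""
    w = [0.0, 0.0]
    for _ in range(steps):
        for win, lose in pairs:
            x, y = feats[win], feats[lose]
            p = 1.0 / (1.0 + math.exp(-(reward(w, x) - reward(w, y))))
            g = 1.0 - p  # gradient of log P wrt the reward gap
            for i in range(len(w)):
                w[i] += lr * g * (x[i] - y[i])
    return w

w = train(preferences, responses)
```

After training, the learned reward reproduces the human ranking (a over c over b), which is exactly what makes the method fragile at superhuman scale: the reward is only as good as the human labels it was fitted to.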
To address the challenges of controlling superintelligent AI, OpenAI is forming the Superalignment team. This interdisciplinary team will have access to 20 percent of the compute OpenAI has secured to date and will focus on solving the core technical hurdles of controlling superintelligent AI over the next four years.
One of the team’s goals is to develop a “human-level automated alignment researcher” by training AI systems using human feedback and leveraging AI to assist in evaluating other AI systems. The aim is to eventually have AI that can conduct alignment research itself, working in collaboration with human researchers.
While the approach has known limitations and potential risks, OpenAI believes that machine learning experts, including those not currently working on alignment, will be critical to solving superintelligence alignment. The organization plans to share the outcomes of its research widely and considers contributing to the alignment and safety of non-OpenAI models an important part of its mission.
The formation of the Superalignment team demonstrates OpenAI’s commitment to addressing the complex challenges associated with the development of superintelligent AI and ensuring its safe and responsible deployment in the future.